Author Topic: Integrating LARGE Numbers of Files  (Read 4099 times)

Offline james7

  • Member
  • *
  • Posts: 55
Integrating LARGE Numbers of Files
« on: 2013 December 08 14:18:12 »
I'm using a DSLR (actually a mirrorless, APS-C camera from the Sony NEX series) and, because of the light pollution I typically have to deal with, I very often end up with hundreds of short exposures that need to be stacked into my final result. My problem is that I can't stack more than about 240 images before the PixInsight integration process ends with an error that appears to be related to file I/O (though I suspect RAM usage may also play a part). I'm running under Mac OS X and have tried to increase the limit on open files, and while that initially seemed to help, I'm now gathering data sets that exceed what I can get PixInsight to process (specifically, during integration). I'm going to continue to look into this problem (or a solution to same), but for the time being I'm wondering if there might be a recommended method for processing these files in smaller groups and then combining those groups into a final result.
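For reference, here is a minimal way to check what open-file limit is actually in effect, using Python's standard resource module (this reads the same per-process limit that ulimit -n reports; whether PixInsight honors a raised limit is another matter):

Code:
# Minimal sketch: inspect and try to raise the per-process open-file limit.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft =", soft, "hard =", hard)

# Request a higher soft limit, capped at the hard limit. Note that OS X
# may still clamp the effective number behind the scenes.
target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
print("soft limit now:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])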

What I'm asking is this: if I start with 500 calibrated and debayered images, can I integrate them in smaller groups (say, 100 images each) and then combine those intermediate masters into a single, final master? It's easy enough to just re-integrate the groups, but I'm wondering whether that comes anywhere near to producing the same result as integrating ALL of the images at once. I'm thinking that the pixel rejection that occurs during integration may be the key (or the problem).

In any case, can anyone recommend an integration method (via groups, as explained above) that would work best, or come near, theoretically or mathematically, to doing the integration in one big set? Should I change the integration method or pixel rejection type when combining the groups? Would it be better (or make any difference) to use many small groups rather than a few large ones (e.g. ten groups of 50 versus two groups of 250, both producing a final master from the same 500 subframes)?

I've already done some work combining images in groups of masters, but I'm concerned that the results could be far from optimal and nowhere near the equivalent of integrating all of the images at once.

Offline pfile

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 4729
Re: Integrating LARGE Numbers of Files
« Reply #1 on: 2013 December 08 14:54:16 »
i hit this a while back and i guess you must have seen the thread. i have not checked this lately but it seems like the per-process and global open file limits are kind of toy knobs in OSX. despite being able to set the limits to big numbers it seems like behind the scenes the number is clamped at 256 or maybe 255.

anyway, to your question: i think groups of 100 are more than enough to do statistically robust pixel rejection. you should be able to make stacks of 100 with all outliers rejected properly and then combine the stacks with no pixel rejection.
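for what it's worth, in the no-rejection, equal-weights case the two approaches are exactly equivalent: with 500 subs in five equal groups, the average of the group masters is the average of all the subs,

\frac{1}{5}\sum_{g=1}^{5}\left(\frac{1}{100}\sum_{j=1}^{100} x_{g,j}\right)=\frac{1}{500}\sum_{i=1}^{500} x_i

so whatever differs between the two schemes comes from rejection and weighting, not from the averaging itself.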

i wonder though if the subs should be sorted by noise weighting… in other words each stack should get a representative mix of image noises - so that you don't have one stack composed of all low SNR images and another composed of all high SNR images.

not sure how to automate this. mschuster probably knows :)
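one possible way to automate it, as a rough sketch in plain python (the file names and the source of the noise estimates are hypothetical; per-sub noise could come e.g. from the NOISExx FITS keywords written during calibration): sort the subs by noise, then deal them out round-robin so every group gets a representative mix.

Code:
# rough sketch: stratify subs across k groups by a per-sub noise estimate.
# 'noise' maps file name -> noise estimate (hypothetical values below).
def stratify(noise, k):
    ordered = sorted(noise, key=noise.get)   # lowest-noise subs first
    groups = [[] for _ in range(k)]
    for i, name in enumerate(ordered):
        groups[i % k].append(name)           # round-robin deal
    return groups

# toy example: 500 subs with steadily increasing noise
noise = {"sub%03d.fit" % i: 0.001 + 0.0001 * i for i in range(500)}
for n, g in enumerate(stratify(noise, 5), 1):
    print("group %d: %d subs" % (n, len(g)))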

rob

Offline Alejandro Tombolini

  • PTeam Member
  • PixInsight Jedi
  • *****
  • Posts: 1267
    • Próxima Sur
Re: Integrating LARGE Numbers of Files
« Reply #2 on: 2013 December 08 15:23:05 »
Hi James,
You can also try decreasing the buffer size in the ImageIntegration tool. That could help.

Offline mschuster

  • PTeam Member
  • PixInsight Jedi
  • *****
  • Posts: 1087
Re: Integrating LARGE Numbers of Files
« Reply #3 on: 2013 December 08 15:31:23 »
If buffer size doesn't work, I would try what Rob suggests.

Maybe create the groups randomly, i.e. don't use sequential date/time subs? This will tend to unify the rejection across groups.

Maybe use the same reference sub for all of the groups? This sounds strange, but it will normalize the weighting, at the (small?) price of an over-represented reference in the integration.
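A minimal sketch of the random grouping in plain Python (the directory and group size are illustrative; the fixed seed just makes the grouping reproducible):

Code:
# Minimal sketch: shuffle calibrated subs, then split into equal groups.
import random
from pathlib import Path

subs = sorted(Path("calibrated").glob("*.fit"))  # hypothetical location
random.seed(1)                                   # reproducible grouping
random.shuffle(subs)

group_size = 100
groups = [subs[i:i + group_size] for i in range(0, len(subs), group_size)]
for n, g in enumerate(groups, 1):
    print("group", n, ":", len(g), "subs")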

Mike

Offline james7

  • Member
  • *
  • Posts: 55
Re: Integrating LARGE Numbers of Files
« Reply #4 on: 2013 December 08 16:15:45 »
I agree that it would be better to randomize the members of the groups (not take them in sequential order), but I'm not certain that combining the groups with no pixel rejection would be the "best" option. I'm thinking that a simple percentile rejection to eliminate obvious outliers would be somewhat better (and if nothing gets rejected, then it's no different than selecting no rejection at all). Weighting is another factor: if each group contains a relatively large number of images and I've randomized the selection, then I'm thinking the weights should be set to one (i.e. no weighting) for the final integration of the groups.

Of course, I still don't know whether this will produce any significant benefit in my end results, or whether it will even approach what could be had by integrating all of the images at once. In any case, I suspect that once an integration reaches a few hundred images you'd have to massively increase the number of subframes to see much improvement. So, to improve on 240 images you'd have to go to 500 or more to even see much difference, and beyond that perhaps thousands of images, which from a practical standpoint would be almost impossible (or nearly a complete waste of time). The largest single integration I've ever done was 512 images, but that was with DeepSkyStacker, not PixInsight.
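For scale, the usual square-root law puts a number on this:

\mathrm{SNR} \propto \sqrt{N}, \qquad \sqrt{500/240} \approx 1.44

so roughly doubling the data from 240 to 500 subs improves SNR by only about 44%, and each further doubling of SNR costs a factor of four in subframes.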


Offline mschuster

  • PTeam Member
  • PixInsight Jedi
  • *****
  • Posts: 1087
Re: Integrating LARGE Numbers of Files
« Reply #5 on: 2013 December 08 16:46:28 »
I agree on group weights of one, but if you do see a group rejection then the scheme is not working as intended. Rejecting a group is like rejecting 100 subs at a go; that's not good, and it means something is wrong.

It is possible to integrate a very large number of subs all at once while limiting the number of open files to one. A multiple-pass scheme is easy to implement: a small portion of each sub is read, integrated, and written, repeated until done. The problem is that it is slow. With more work it can be made faster using a data scatter/reduce scheme.
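A rough sketch of the banded, one-file-at-a-time idea in plain Python (mean combine only; rejection, weighting, and normalization are omitted, and registered, same-sized, mono 2-D FITS subs plus astropy and numpy are assumed):

Code:
# Rough sketch: integrate in horizontal bands, one open file at a time.
import numpy as np
from astropy.io import fits

def integrate_banded(paths, out_path, band=128):
    with fits.open(paths[0]) as f:            # probe geometry from first sub
        height = f[0].header["NAXIS2"]
        width = f[0].header["NAXIS1"]
    result = np.zeros((height, width), dtype=np.float64)
    for y0 in range(0, height, band):         # one pass per band of rows
        y1 = min(y0 + band, height)
        acc = np.zeros((y1 - y0, width), dtype=np.float64)
        for p in paths:                        # only one sub open at a time
            with fits.open(p) as f:
                acc += f[0].section[y0:y1, :].astype(np.float64)
        result[y0:y1, :] = acc / len(paths)
    fits.writeto(out_path, result.astype(np.float32), overwrite=True)

Each band makes one pass over all the files, which is why the scheme is slow: every sub is opened and closed once per band.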

Mike
« Last Edit: 2013 December 08 19:29:07 by mschuster »

Offline pfile

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 4729
Re: Integrating LARGE Numbers of Files
« Reply #6 on: 2013 December 08 17:33:14 »
unless limiting the buffer size somehow limits the number of open files (i don't think it does), lowering the buffer size won't help. it's not so much about absolute memory usage as it is about the number of open files that OSX will permit a process to have… if i understand it right, PI opens every single file in the list at once so it can see all the pixels in a given stack at one time.

i think i agree with mike that after each stack of 100 there should be no more 'true' outlier pixels to reject.

rob

Offline Phil Leigh

  • PixInsight Addict
  • ***
  • Posts: 220
Re: Integrating LARGE Numbers of Files
« Reply #7 on: 2013 December 09 03:19:24 »
A somewhat off-topic word of caution... the expected shutter life of a non-professional modern DSLR is typically in the range of 75,000-150,000 actuations... that's only 75-150 finished images at 1,000 shots apiece!

Given that 100 shots reduce the shot noise to 1/10th that of a single frame... and 400 reduce it to 1/20th... you may want to balance these noise figures against shutter life...
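Those figures are just the square-root law for averaging N independent frames:

\sigma_N = \sigma_1/\sqrt{N} \quad\Rightarrow\quad \sigma_{100} = \sigma_1/10, \qquad \sigma_{400} = \sigma_1/20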

Modern software noise reduction techniques such as TGVDenoise are very good. I know prevention is better than cure - but there are costs involved.

Offline james7

  • Member
  • *
  • Posts: 55
Re: Integrating LARGE Numbers of Files
« Reply #8 on: 2013 December 09 04:40:36 »
Quote from: Phil Leigh on 2013 December 09 03:19:24
That COULD be a concern. However, since I generally get only two or three nights of imaging per month (at the very best), that shutter life works out to several years of use, probably beyond the useful life of the technology (i.e. by that time it would be better to buy a new camera anyway). There have been a few times when I've done well over 1,000 shutter actuations in a single night (counting light frames, bias frames, and darks), but that only happens a few times each year (on average). Most other nights I only do several hundred, and at that rate I might be able to go on for three or four years.

I also use three different cameras, so it's likely that I will replace one or more of those before I have any problems with shutter failure (since over the last three years I have alternately used each camera and two of those are completely interchangeable).
« Last Edit: 2013 December 09 14:43:52 by james7 »

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: Integrating LARGE Numbers of Files
« Reply #9 on: 2013 December 09 09:20:29 »
Did you try to convert your RAWs to FITS first? ImageIntegration appears to be much more efficient (both in speed and memory) with FITS than with RAW.
Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline james7

  • Member
  • *
  • Posts: 55
Re: Integrating LARGE Numbers of Files
« Reply #10 on: 2013 December 09 14:16:42 »
Quote from: georg.viehoever on 2013 December 09 09:20:29
I generally use the Batch Preprocessing script, and I assume the needed conversions are done during the calibration phase (the calibrated files are output as FITS, so I've concluded that's all that needs to be done, or is there an additional internal FITS format that needs to be explicitly requested with a format hint or other option?). In any case, by the time I get to the integration phase my files are always FITS.

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: Integrating LARGE Numbers of Files
« Reply #11 on: 2013 December 09 22:54:37 »
You are right: calibration converts to FITS.
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)