PixInsight 1.6.1 - New RANSAC fitting rejection algorithm in ImageIntegration

Juan Conejero (PixInsight Staff)
NOTE: This thread is obsolete. For up-to-date information on the new linear fit clipping pixel rejection algorithm, please go to this thread.
Sorry for the inconvenience.


Hi there,

The new version of the ImageIntegration tool that comes with PixInsight 1.6.1 implements a new pixel rejection algorithm: RANSAC linear fitting. This is the first of a series of improvements in this essential tool, which will be introduced during the 1.6.x cycle.

The random sample consensus (RANSAC) algorithm is already being used in PixInsight with great success. The StarAlignment tool implements a sophisticated RANSAC procedure as the final stage of its star matching engine. The main advantage of RANSAC is its robustness. In the case of the new rejection method implemented in ImageIntegration, RANSAC is used to fit the best possible straight line (in the twofold sense of minimizing average deviation and maximizing inliers) to the set of pixel values in each pixel stack. At the same time, all pixels that don't agree with the fitted line (to within user-defined tolerances) are considered outliers and thus rejected from the final integrated image.
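For readers curious about the mechanics, here is a minimal Python sketch of the general RANSAC line-fitting idea just described. It is illustrative only, not PixInsight's actual implementation: the function name, the fixed iteration count, and the single symmetric tolerance are all assumptions made for brevity (the actual tool exposes separate low and high rejection thresholds, as described below).

```python
import random

def ransac_line_fit(values, tolerance, iterations=500):
    """Illustrative RANSAC fit of a line y = a + b*x to a pixel stack,
    where x is the ordered sample index and y the sorted pixel value."""
    points = list(enumerate(sorted(values)))
    best_model, best_count = None, 0
    for _ in range(iterations):
        # Draw a minimal random sample: two distinct points define a line.
        (x1, y1), (x2, y2) = random.sample(points, 2)
        b = (y2 - y1) / (x2 - x1)   # slope
        a = y1 - b * x1             # intercept
        # Consensus step: count points within 'tolerance' of the candidate line.
        count = sum(1 for x, y in points if abs(y - (a + b * x)) <= tolerance)
        if count > best_count:
            best_model, best_count = (a, b), count
    a, b = best_model
    inliers  = [y for x, y in points if abs(y - (a + b * x)) <= tolerance]
    rejected = [y for x, y in points if abs(y - (a + b * x)) >  tolerance]
    return inliers, rejected
```

The consensus step is what makes RANSAC robust: outliers can never vote a bad line into the lead, because the winning model is the one that agrees with the largest subset of the data.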

The new pixel rejection algorithm is controlled, as usual, by two dedicated threshold parameters in sigma units: RANSAC low and RANSAC high, for low and high pixels respectively. The screenshot below shows the new version of ImageIntegration and some results of the new rejection method.


[Screenshot: the new ImageIntegration interface and sample results of the new rejection method]

The next screenshot is a comparison between an integrated image (raw data courtesy of Oriol Lehmkuhl and Ivette Rodríguez) with the new RANSAC fitting rejection (left) and without rejection (right). Both images integrate 10 frames, with signal-to-noise improvements of 2.86 and 2.92, respectively. SNR degradation is very low, with excellent outlier rejection properties.


[Screenshot: integration with RANSAC fitting rejection (left) vs. no rejection (right)]

The new RANSAC linear fitting rejection requires at least five images. It is best suited for large sets of images (>= 8).
 
Very nice Juan - keep up the good work!!

When all the sub-panels of ImageIntegration are 'expanded', does the GUI still fit on a 1920x1200 screen? If not, you might want to re-think the layout; otherwise screenshots, video tutorials, etc., are going to be difficult in the future - and remember, not everyone has the luxury of 1920x1200 either.

Cheers,
 
When are we going to get a script or something to help link the work flow process together?
i.e. calibration, registration, and integration.

No one has stepped forward to write this yet.

Max
 
Hi Harry,

The following is how I understand the various forms of 'clipping':

Percentile Clipping (the simplest) - any data that lies too far from the central peak of the distribution curve is simply excluded from the integration process.

Sigma Clipping - an iterative approach, whereby data outside the clipping zone is discarded and plays no further part in subsequent iterations. The clipping points are set in k-sigma units, where 'sigma' is the standard deviation of the data set. The aim of iteration is to fine-tune the mean value of the data set, from the value calculated when NO data is clipped to the value re-calculated after the data outside the k-sigma points has been eliminated. Each iteration loop is based on fewer and fewer data elements as more and more data is clipped.

Winsorised Sigma Clipping - very similar to Sigma Clipping, with the important exception that the number of data points in the set remains the same throughout the clipping iterations. This is because data outside the clipping zone is not ELIMINATED - instead, all data points outside the clip zone are substituted with the data value AT the clipping point.
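As a rough illustration of the distinction described here, below is a simplified Python sketch of Winsorized sigma clipping. It is a reader's sketch under the assumptions above, not PixInsight's code; in particular, textbook Winsorization also applies a correction factor to the standard deviation, which is omitted here for brevity.

```python
import statistics

def winsorized_sigma_clip(stack, k_low=4.0, k_high=3.0, max_iter=10):
    """Illustrative Winsorized sigma clipping of one pixel stack."""
    data = sorted(stack)
    for _ in range(max_iter):
        m = statistics.median(data)
        sigma = statistics.stdev(data)
        lo, hi = m - k_low * sigma, m + k_high * sigma
        # Winsorize: values outside the clip zone are replaced with the
        # boundary value itself, so the sample size never shrinks.
        clipped = [min(max(v, lo), hi) for v in data]
        if clipped == data:   # converged: nothing left to clip
            break
        data = clipped
    return data   # the mean of this stack is the robust pixel estimate
```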

RANSAC Clipping - if my assumptions are correct, it proceeds in a similar fashion to Winsorised Sigma Clipping until a certain 'quality' point is reached in the iteration process. At that point, the assumption is made that all the data points are so similar in value that they should more or less fall on a 'straight line' if plotted on a graph. The algorithm will then set out to establish the 'best fit' straight line through these data points. Importantly, this 'straight-line' will also have to be HORIZONTAL, and it is the 'height' of this line that gives the 'average result' that we are looking for.

So, the main difference is that Winsorisation does NOT make the final hypothesis that the data-set could, plausibly, all have had the SAME original value in the first place.

Now, this is only MY INTERPRETATION - and it is simplistic at that. Although I do 'understand' the principles behind Winsorised Clipping, thanks to hours of Google-pedia research, and gentle kindness from Juan, what I have not (yet) done is repeat that research on the principles behind the RANSAC algorithm.

Juan (and others??) will (as usual) help to clarify things - and then we will get to 'experiment' when it makes its appearance in Vx.y.z  ::)

Cheers,

 
Harry page said:
Hi

I am still with you, Max  >:D

Harry

I wonder, if we nonchalantly threw this in as a request to Nikolay, whether he'd do it with default parameters as part of his animation script.....
:D   ;)    (every other suggestion has come back in a day or two!)

The RANSAC integration option sounds like a great new addition - thanks Juan. The stacking seems to give exceptional quality even in my beginner's hands versus any other program I've tried, so any improvement in this critical area is more than welcome. These are things people have to take notice of in the marketplace too.....
 
RobF2 said:
I wonder, if we nonchalantly threw this in as a request to Nikolay, whether he'd do it with default parameters as part of his animation script.....
Friends, if you would like to see simple calibration (using master files, without dark optimization) in the animation script, I will do it. It's easy to implement. Do you really want to see the result without dark optimization?

mmirot said:
When are we going to get a script or something to help link the work flow process together?
i.e. calibration, registration, and integration.
For me it's: integration (masters) > calibration > registration > integration.
How would you like to manage all the settings for such a long process? Do you want to see 7 tabs (bias/dark/flat/darkflat/calibrate/register/integration) and many, many buttons/checkboxes/sliders/lists on every tab?
Do you really want the monster?
 
Perhaps some confusion has arisen due to my poor choice of a title for this thread. Actually, the most important part of the new rejection algorithm is the linear fit, not RANSAC. The use of the RANSAC algorithm is an implementation detail, but the core of the method is that we are fitting all pixels in a stack to a straight line. In fact, I am considering the possibility of not using RANSAC at all, in favor of other robust algorithms that are more efficient.

The difference between linear fit clipping and sigma clipping is best explained with a graphical example. Consider the following figure.

[Figure: sigma-clipping.png]

Suppose we are integrating N images. In the figure, we have represented a stack of N pixels at the same coordinates, whose values have been sorted in ascending order and plotted as circles on the graphic. The horizontal axis is the ordered sample number, and the vertical axis represents the available range of pixel values.

What we pursue with pixel rejection is to exclude those pixels that have spurious values, such as pixels pertaining to cosmic rays, plane and satellite trails, and other abnormally high or low pixels that are not valid data for some reason. We want to reject those pixels to make sure that they won't enter the integrated image.

The symbol m represents the median of the distribution. The median is the value of the central element in the ordered sequence, which in statistical terms corresponds to the most probable value of the distribution. Here, the median is being used as a robust estimate of the true value of the pixel, or more realistically, the most representative value of the pixel that we have available. Robustness here means insensitivity to outliers. An outlier is an abnormally low or high value, as both extreme values in the graphic are, for example. Outliers are precisely the values that we want to exclude, or reject. By contrast, valid pixel values, or inliers, are those that we want to keep and integrate into a single image to achieve a better signal-to-noise ratio. Our goal is to reject only truly spurious pixels. We need a very accurate method, able to distinguish between valid and invalid data in a smart way, adapted to the variety of problems that we have in our images. Ideally, no valid data should be rejected at all.

With two horizontal lines we have represented the clipping points for high and low pixels, which are the pixels above and below the median of the stack, respectively. Pixels that fall outside the range defined by both clipping points are considered as outliers, and hence rejected. In the sigma clipping algorithm, as well as in its Winsorized variant, clipping points are defined by two multipliers in sigma units, namely KH and KL in the figure. Sigma is an estimate of the dispersion in the distribution of pixel values. In the standard sigma clipping algorithms, sigma is taken as the standard deviation of the pixel stack. These algorithms are iterative: the clipping process is repeated until no more pixels can be rejected.
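To make the procedure concrete, here is a small Python sketch of iterative sigma clipping for a single pixel stack, following the figure's notation (median m, multipliers KL and KH). This is a simplified illustration, not the actual ImageIntegration code; the function name and default multipliers are invented for the example.

```python
import statistics

def sigma_clip(stack, k_low=4.0, k_high=3.0):
    """Illustrative iterative sigma clipping of one pixel stack."""
    data = sorted(stack)
    while len(data) > 2:
        m = statistics.median(data)      # robust central value
        sigma = statistics.stdev(data)   # dispersion estimate
        # Keep only pixels inside [m - KL*sigma, m + KH*sigma].
        kept = [v for v in data
                if m - k_low * sigma <= v <= m + k_high * sigma]
        if len(kept) == len(data):       # no pixel rejected: converged
            break
        data = kept
    return data                          # surviving (inlier) pixels
```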

In the figure, rejected pixels have been represented as void circles. Six pixels have been rejected, three at each end of the distribution. The question is, are these pixels really outliers? The answer is most likely yes, under normal conditions. Normal conditions include the N images having flat illumination profiles. Suppose that some of the images have slight additive sky gradients of different intensities and orientations; for example, one of the images is slightly more illuminated toward its upper half, while another image is somewhat more illuminated over its lower half. Due to varying gradients, the sorted distribution of pixels can show tails of apparent outliers at both ends. This may lead to the incorrect rejection of valid pixels with a sigma clipping algorithm.

Enter linear fit clipping:

[Figure: linear-fit-clipping.png]

In this figure we have the same set of pixels as before. However, the linear fit clipping algorithm is being used instead of sigma clipping. The new algorithm fits a straight line to the set of pixels. It tries to find the best possible fit in the sense of minimizing average deviation. Average deviation is computed as the mean of the distances of all pixel values to the line. When we find a line that minimizes average deviation, what we have found is the actual tendency in the set of pixel values. Note that a linear fit is more robust than a sigma clipping scheme, in the sense that the fitted line does not depend as strongly on all the images having flat illumination profiles. If all the images are flat (and have been normalized, as part of the rejection procedure), then the fitted line will tend to be horizontal. If there are illumination variations, then the line's slope will tend to grow. In practice, a linear fit is more tolerant of slight additive gradients than sigma clipping.

The net result is that fewer false outliers are rejected, and hence more pixels enter the final integrated image, improving SNR. In the figure, r is the fitted line and d represents the average deviation. KL and KH, as before, are two multipliers for low and high pixels, in average deviation units. However, instead of classifying pixels with respect to a fixed value (the median in the case of sigma clipping), in the linear fit algorithm low and high pixels are those that lie below and above the fitted line, respectively. Note that in the linear fit case only two pixels have been rejected as outliers, while the set of inlier pixels fits the straight line really well.
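Here is a compact Python sketch of the rejection scheme just described. Two caveats: it uses an ordinary least squares fit as a stand-in for the minimum average deviation fit discussed above, and the function name, default multipliers, and loop structure are illustrative assumptions, not the tool's actual implementation.

```python
def linear_fit_clip(stack, k_low=5.0, k_high=2.5):
    """Illustrative linear fit clipping of one pixel stack.
    x is the ordered sample index, y the sorted pixel value."""
    points = list(enumerate(sorted(stack)))
    while len(points) > 3:
        n = len(points)
        # Least squares line y = a + b*x (stand-in for the minimum
        # average deviation fit described in the text).
        sx  = sum(x for x, _ in points)
        sy  = sum(y for _, y in points)
        sxx = sum(x * x for x, _ in points)
        sxy = sum(x * y for x, y in points)
        b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        a = (sy - b * sx) / n
        # d: average absolute deviation of all points from the line.
        d = sum(abs(y - (a + b * x)) for x, y in points) / n
        # Reject pixels more than KL*d below or KH*d above the line.
        kept = [(x, y) for x, y in points
                if -k_low * d <= y - (a + b * x) <= k_high * d]
        if len(kept) == len(points):   # nothing rejected: converged
            break
        points = kept
    return [y for _, y in points]      # inlier pixel values
```

Note how rejection is measured from the fitted line rather than from a fixed median, which is exactly what lets the method tolerate a sloped pixel distribution caused by varying gradients.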

The remaining question is: is linear fit clipping the best rejection algorithm? I wouldn't say that. Linear fit rejection requires a relatively large set of images to work optimally. With fewer than 15 images or so, its performance may be similar to or worse than Winsorized sigma clipping. To reject pixels around a fitted line with low uncertainty, many points are required to optimize the fit in the least average deviation sense. The more images, the better the linear fit, and hence the better the new algorithm will perform. It's a matter of testing it and seeing whether it can yield better results than the other rejection algorithms available, for each particular case.
 
Nikolay,

Do you really want the monster?

My advice is to avoid asking that kind of question to these guys. They tend to say yes when they see the words "want" and "monster" together  >:D

Beer? someone said beer?  :-*
 
Hi Nikolay,

You know, I do honestly think that we need to start with the belief that we MUST end up with the 'monster'.

We certainly need a tab for each group of captured subs (so, that means Lights, Darks, Flats, FlatDarks and Biases). And each tab group can then be populated as required by the user.

It could even be argued that these five tabs could be replicated as sub-tabs in a parent-level tab structure, covering up to four channels (allowing for a 'one-touch' L+R+G+B integration).

By this stage, all that will have been implemented is the ability to advise PI of which 'source images' belong to which 'tab group'. A sophisticated GUI would allow quick re-use of one group of data in the same group-type of another tab - if appropriate, and under user control.

Now, working up-and-down the two primary tab-levels (top level==channels; bottom level==sub-frame types) and left-to-right (either channels, or sub-types) the user can then fine-tune the images that will ACTUALLY be used in the 'integration' stages. This is where I see your 'Animation' script sitting.

In other words, for a given channel, and a given sub-type, the user then inspects the loaded images and enables/disables those images which are visually unacceptable for further inclusion. Obviously, as the 'image inspection' powers of PI increase, this can even become an 'automatic' process - where images are 'scored' in a similar fashion to DSS, and excluded from further processing based on some overall score and user-definable threshold (thresholds being set either 'globally' or 'locally', on a tab-by-tab basis). The auto-scoring and global thresholding allow the user to get closer and closer to a one-click "Go For It" button - that will result in PI spitting out one calibrated image per assigned (top-level) channel.

Within the lower-level subframe-type 'tabs', the user then has to decide what is going to happen to the 'selected' images (remember, these may have been 'auto-selected' as described above).

At the simplest level, for example, Biases would just be ImageIntegrated - they would not need to be ImageCalibrated. However, even this 'simplest level' may need to consider using the ImageCalibration process, if the Bias frames are to be calibrated using OverScan data. No matter though: only when ALL appropriate parameters have been entered or selected - and have been validated as being 'acceptable' (rather than 'nonsense') - would the 'Biases' tab be 'green-lighted'. I foresee some sort of 'summary screen' that would show the 'green light' status of ALL the tabs in use. The user could either then process JUST 'that tab', or could wait until ALL the tabs had been correctly configured, such that a 'master' green light was indicated, and could then just hit the global "Go For It" button.

Obviously, as different sub-types require extra levels of calibration, these have to be set up correctly as well. For example, a 'Darks' tab could not be 'green-lighted' if it referred to calibration using a MasterBias, if the associated 'Biases' tab was itself not 'green-lighted'. The same approach would work up, all the way to the 'Lights' tabs - which (if the user had set it up to need 'full calibration') would require 'green lights' for ALL associated sub-stages. And, obviously, it would take ALL the 'Lights' tabs for ALL the 'Channels' tabs to be 'green-for-go' in order to enable the global "Go For It" button.

Now, this approach is certainly one way to fulfil the 'one-click' facility that some users are looking for. It may even fulfil simpler needs as well. It is probably overkill for someone like me, who needs to 'tweak and tune' the various clipping sliders (etc.) on the ImageIntegration process. But, it might NOT be overkill in that situation, if I also want to see how the ImageIntegration process behaves as I quickly enable/disable selected images through your 'Animation' interface.

And, it might also not be overkill, as I can then use your Animation interface on sub-types higher and higher 'up the food chain' - where any calibration data that IS applicable to the sub-level that I might be Animating can be (optionally, as always) applied 'on the fly'.

Finally, it could also be the case that the 'Channel' tab is a 'special case' - and has extra parameters to define that the frame data contained in the sub-groups below it is all RAW data from a One-Shot-Colour camera. These parameters would define which type of deBayering algorithm would need to be invoked, and whether deBayering happened on a sub-type by sub-type basis (resulting in 'full-colour' Master frames), or whether deBayering happened just before ImageAlignment, or whether deBayering happened AFTER final integration of the calibrated RAW frames.

An extra tab, or pop-up GUI DialogueBox could be invoked to handle these scenarios.

So, Harry, you had better start shipping beer to Nikolay every day from now on ::)

Cheers,
 
How would you like to manage all the settings for such a long process? Do you want to see 7 tabs (bias/dark/flat/darkflat/calibrate/register/integration) and many, many buttons/checkboxes/sliders/lists on every tab?
Do you really want the monster?

ImagePlus has a well-thought-out tabbed interface for the whole process. AA4 is not bad either, but not as "clean" as IP.

Tasos
 
Juan Conejero said:
Nikolay,

Do you really want the monster?

My advice is to avoid asking that kind of question to these guys. They tend to say yes when they see the words "want" and "monster" together  >:D

Beer? someone said beer?  :-*


Yes! Yes! Yes!   (to monster AND beer : :p)

I actually had in mind the bare essentials to start with - a script that called saved process icons preconfigured with your library bias/darks/flats and usual integration settings. You would run it for each filter, then manually do RGB or LRGB combines. It could even be a button on your fantastic animation script that called process icons for image calibration, alignment, then stacking - that alone would be a huge benefit, saving having to specify related images over and over. I know some people don't like the idea of library calibration images, but a lot of us do work this way.

The "monster" with tabs and endless configuration for each filter/channel would be the ultimate - I get frightened of asking the IT guys at work for the ultimate though 'cause they get frightened off and I get nothing..... :yell:

Perhaps this should go to another thread though if you do run with it Nikolay - to get us out of Juan's excellent integration explanation?

 
Howdy. Guess I am a little late to the party. Only one small point: I note you have included a tab for DarkFlat in your proposed Frankenstein of processing tools. I have yet to hear a good argument as to why Flats need to be dark calibrated. Bias, yes! Dark, not so much?
 
NKV said:
RobF2 said:
I wonder, if we nonchalantly threw this in as a request to Nikolay, whether he'd do it with default parameters as part of his animation script.....
Friends, if you would like to see simple calibration (using master files, without dark optimization) in the animation script, I will do it. It's easy to implement. Do you really want to see the result without dark optimization?

mmirot said:
When are we going to get a script or something to help link the work flow process together?
i.e. calibration, registration, and integration.
For me it's: integration (masters) > calibration > registration > integration.
How would you like to manage all the settings for such a long process? Do you want to see 7 tabs (bias/dark/flat/darkflat/calibrate/register/integration) and many, many buttons/checkboxes/sliders/lists on every tab?
Do you really want the monster?

Sounds like a good plan. You might want to look at MaxIm's Calibrate and Stack commands for ideas.
They did a nice job on the workflow, but PI has better processes now.
You did a great job on blink. I think you can handle it!

Max

 
Juan Conejero said:
The more images, the better the linear fit, and hence the better the new algorithm will perform. It's a matter of testing it and seeing whether it can yield better results than the other rejection algorithms available, for each particular case.

Great explanation.

We will have to see how it works out in practice.
I commonly get 15-25 images, so it is worth a try.

The hardest problem to deal with is slight changes in transparency, which can occur even on very good nights. These end up producing complex gradients. It is hard to identify the inliers.

(Of course, this does not fix the gradients; we still do DBE after our integration.)

Max
 
Hi Jack,

Howdy. Guess I am a little late to the party. Only one small point: I note you have included a tab for DarkFlat in your proposed Frankenstein of processing tools. I have yet to hear a good argument as to why Flats need to be dark calibrated. Bias, yes! Dark, not so much?

The reason I have included this is that this is how I have ALWAYS calibrated my imaging data. My Flats tend to be acquired as 2s exposures (or more, for NB data). And, as far as I am concerned, there is enough 'dark' signal at 2s to allow it to be statistically evaluated, and therefore correctly eliminated, by using a MasterFlatDark.

I am 'not comfortable' with Biases being used to calibrate Flats, unless the Flats can be demonstrated to have no 'significant' signal attributable to 'thermal noise'. Certainly, because of the ultra-short Bias exposures, no statistical 'thermal' signal can be obtained. So, if the same cannot be said for the Flats, then using only a MasterBias to calibrate the Flats will still leave the MasterFlat with a thermal-noise content.
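This point can be illustrated with toy numbers. The values below are invented purely for illustration; only the arithmetic matters.

```python
# Invented signal levels in ADU, for illustration only.
bias        = 100      # readout offset (~0 s exposure)
thermal_2s  = 12       # thermal (dark) signal accumulated in a 2 s flat
flat_signal = 20000    # true flat-field illumination signal

raw_flat = flat_signal + thermal_2s + bias

master_bias      = bias                # from ~0 s frames: no thermal term
master_flat_dark = bias + thermal_2s   # from 2 s darks: bias + thermal

print(raw_flat - master_bias)        # 20012 -> thermal residue remains
print(raw_flat - master_flat_dark)   # 20000 -> fully calibrated
```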

Then, is there not also the issue with the likes of full-frame CCDs? There is a finite minimum shutter speed that these will operate with, and that shutter speed is SIGNIFICANTLY longer than typical exposure times for Bias frames. So, I would want to be using multi-second FlatDarks once again, and just not bothering with Bias frames at all (because their exposure times would be too long for them to be 'true' Bias frames). Have I just failed to understand something here?

In any case - I think it would be wrong to 'force' users to work with Biases as FlatDarks. I think that the option should remain a 'user option'.

Cheers,
 