Author Topic: LinearFitClipping simulation software  (Read 5332 times)

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
LinearFitClipping simulation software
« on: 2012 January 08 08:05:11 »
Hi,

when working with ImageIntegration, I was interested in understanding how Linear Fit Clipping http://pixinsight.com/doc/tools/ImageIntegration/ImageIntegration.html#linear_fit_clipping works. As a result I wrote a simulator for this in R http://www.r-project.org/, which is an open source tool for statistical analysis (R is available for all relevant platforms). I found it useful for understanding how Linear Fit Clipping works, and for experimenting with other rejection algorithms.

In the following, I would like to present the R script, how it works, and which results I found. I am publishing it in case there is someone who would like to experiment with this as well. The script is open source as specified in the header:
Code: [Select]
# (C) 2012 Georg Viehoever, except code for wle.poisson() .
# Permission is granted to use the code any way to like, but I would welcome if you make
# any derived work available as source as well.
#
# Code for  wle.poisson() is licensed under GPL2, see header of this function

The script is attached below.

The Script

ImageIntegration works by creating a PixelStack of pixels at the same location in different images. It analyses the stack to remove outliers (e.g. from cosmic rays or airplanes), and integrates the remaining pixels into a true pixel value (e,g. by computing the average). The script provides  a number of pixel rejection and integration algorithms (PI.*Fit()), amongst them a reimplementation of the Linear Fit Clipping algorithm with averaging (PI.linFit(), I hope my implementation derived http://pixinsight.com/doc/tools/ImageIntegration/ImageIntegration.html#linear_fit_clipping is correct...).

Pixels (as modeled by the script, there are other sources like amplifier noise as well) receive noise (unwanted variation) from two sources: 1. Poisson noise due to the random nature of arriving photons, 2. impulse noise from airplains, cosmic rays, ... . The method PI.genPixelStack(n=40,meanVal=maxVal/16,spikeProb=0.05,minVal=0,maxVal=16383) allows to generate such pixel stacks of size n, mean true value meanVal, a probility of uniform impulse noise of spikeProb with values in the [minVal,maxVal] interval. I think the defaults shown here make sense for my Canon EOS40D....

I can now evaluate the quality of a pixel rejection algorithm using the method PI.fitExperiment(first=100,last=16300,runs=100,rep=100,sigma=2,spikeProb=0.05,fitFunc=PI.linFit). For runs different mean values in [first,last], it will do rep runs for each mean.  It returns a table of the true mean, the estimated mean using the given fitFunc, and other values. A run with the default paramaters consumes 5-10 minutes on my laptop.

I can evaluate this table using commands from the R toolset. For example, r.lin is the table regenerated for the PI LinearFitClipping algorithm. boxplot(mean-estimate~mean,data=r.lin,ylim=range(-100,100)) creates a box plot (see screenshot 1) showing the difference between true mean and estimated mean. The closer the fat black bars are to 0, the better is the average estimate. The other boxes visualize the variance within the estimates. The smaller the variance, the less remaining noise has the image.

The script as delivered here will automatically compute some of those tables (see the last lines). It needs roughly 30 minutes for those.

Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Results 1
« Reply #1 on: 2012 January 08 08:46:34 »
Here some results.

Slope Map

One of the first observations was that the slope map generated by LinearFitClipping depends on the mean pixel value. In post http://pixinsight.com/forum/index.php?topic=3745.0 I present this result and propose a workaround.

Quality of LinFitClipping

I compared LinFitClipping as implemented in PI.linFit() to other algorithms, amongst them using the median as a robust estimator PI.medianFit() and using a robust implementation of logistic regression PI.glmFit().  The LinFitClipping performance is simply excellent (see previous post)! For comparison, screenshot 1 shows the result for PI.glmFit(): The estimates show larger variation, and they show an upward trend that is undesirable.

Q-Q Plots

I had the suspicion (see http://pixinsight.com/forum/index.php?topic=2013.msg13022#msg13022) the way Linear Fit Clipping evaluates the data in essence is a Q-Q plot http://en.wikipedia.org/wiki/Q-Q_plot. If the random data conforms to a certain random distribution P, the points in a Q-Q plot should generate a straight line. In the case of Line Fit Clipping, P is assumed to have a uniform distribution, see screenshot 2 generated with  PI.linFit(runif(1000),bPlot=TRUE). Since pixel data is not uniformly distributed, we do not get a straight line, see screenshot 3, generated with PI.linFit(PI.genPixelStack(1000,spikeProb=0),bPlot=TRUE).

If, however, we change the plot to assume that P has a Poisson noise distribution, we get the desired straight line, plus a slope=1 that does not depend on the mean value. See screenshot 4, generated with PI.poisFit1(PI.genPixelStack(1000,spikeProb=0),bPlot=TRUE).

Why is this important? The downward and upward ends of the first plot will lead to rejection even for data that is perfectly fine (i.e. can contribute useful information to the estimate of the mean). My hope is to develop a better rejection method using this insight.

Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)

Offline Juan Conejero

  • PTeam Member
  • PixInsight Jedi Grand Master
  • ********
  • Posts: 7111
    • http://pixinsight.com/
Re: LinearFitClipping simulation software
« Reply #2 on: 2012 January 26 00:11:58 »
Georg,

I have been thinking on your analysis, which is excellent, as always. I ask for just a bit of patience, I'll answer here as soon as I can get a time window of the required length. We are working hard right now on two new tools (with documentation!) that will surprise everybody *for sure* ;)
Juan Conejero
PixInsight Development Team
http://pixinsight.com/

Offline georg.viehoever

  • PTeam Member
  • PixInsight Jedi Master
  • ******
  • Posts: 2132
Re: LinearFitClipping simulation software
« Reply #3 on: 2012 February 05 06:26:39 »
Some more results:

- I wanted to know if the signal that my DSLR provides really follows the assumed Poisson Distribution. For this purpuse I took 400 flats, 30 seconds each at ISO 800, in front of a luminescent panel. I retrieved a 400 pixel stack from these images. The values can be seen in attached screenshot 1 (R command: plot(x)). Clearly, the panel got dimmer beyond shot 300 due to the weakening battery, so I used only the first 300 shots (R command: x1<-x[1:300])
- The histogram for these shots is a the expected bell shape (screenshot 2, R command: hist(x1,20)). Why does it look like a normal distribution? The poisson distribution  gets similar to the normal distribution with mean and variance=lambda for large lambdas, and 2000 is large in this sense.
- Is it compatible to a Poisson distribution? No  :(. mean(x) is 2141, and var(x) is 3600, while it should be var(x)==mean(x) for a true poisson distribution. I think this is the case because the raw values provided by the camera a multiplied in the amplifier depending on the chosen ISO value, plus there is some bias correction. The relationship is mean(rpois(...,lambda))==var(rpois(...,lambda))==mean(rpois(...,lambda)*n)/n==var(rpois(...,lambda)*n)/n^2, that is with multiplier n the values for mean and variance are no longer equal.
- Is it possible to compute the true lambda and bias from this? Unfortunately not, because there are many combinations of lamba, multiplier and bias that produce almost identical results (for "large" lambda, it may be possible for small values below 20)
-  For any practical purposes, we can assume that we cannot decide if we have a normal distribution or a poisson distribution for reasonable large lambdas. And since normal distribitions are so much easier to handle, I used them in the next steps. I did some qqplots for uniform distribution (that is what linear fit rejection uses, svreenshot 3, R-Command:  qqmath(x1,distribution=function(p)qunif(p,min(x1),max(x1)))), and for normal distribution (screenshot 4, R-command: qqmath(x1,distribution=function(p)qnorm(p,mean(x1),sd(x1)))). We clearly see that the normal distribution is a much better model for the pixel distribution, and the linear fit would cut quite a number of points at the upper and lower ends.

Having said that: I dont expect spectacular improvements when switching Linear Fit Clipping to normal distribution. With a LinearFit Low/High value of 2, it will clip 5% of pixels that are not outliers. There should be a minimal gain in SNR if we avoid that, but not much.

Enjoy,
Georg
Georg (6 inch Newton, unmodified Canon EOS40D+80D, unguided EQ5 mount)