This topic has always puzzled me. Although I think I understand the concept and theory behind it, its algorithmic implementations and results have often left me wondering what exactly is going on.
I am fairly aware of the difficulties in finding a robust metric for SNR, particularly on the signal side of that equation (i.e., scale). It is often the case, when I use SubframeSelector to look at a data set, or when watching the weights being calculated during ImageIntegration, that the higher-weighted subs do not coincide with what I would consider the better subs, either upon visual inspection (sharper, darker background, etc.) or by other metrics (FWHM, star support, estimated noise, etc.).
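For reference, my mental model of an SNR-based weight is roughly the square of a scale estimate over a noise estimate, relative to a reference frame. The sketch below is only that, a sketch: both estimators are generic stand-ins I chose for illustration, not the actual estimators ImageIntegration uses.

    import numpy as np

    def robust_scale(img):
        # Stand-in signal/scale estimate: normalized median absolute
        # deviation from the median (consistent with sigma for a Gaussian).
        med = np.median(img)
        return 1.4826 * np.median(np.abs(img - med))

    def noise_sigma(img, k=3.0, iters=5):
        # Stand-in noise estimate: iterative k-sigma clipped standard deviation.
        data = img.ravel()
        for _ in range(iters):
            m, s = data.mean(), data.std()
            data = data[np.abs(data - m) < k * s]
        return data.std()

    def snr_weight(img, ref):
        # Weight of img relative to ref, assuming weight ~ (scale/noise)^2.
        snr2 = (robust_scale(img) / noise_sigma(img)) ** 2
        snr2_ref = (robust_scale(ref) / noise_sigma(ref)) ** 2
        return snr2 / snr2_ref

Under a model like this, a sub with bloated stars or a brighter background can get a larger scale estimate, and hence a larger weight, even though it looks worse to the eye.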
We have discussed here before the effects of significant and changing gradients on estimating signal strength, which often skew the scale estimates, and how to handle this during processing. But many times, even with no apparent gradients, I still see unexpected results. Could it be subtle changes in transparency? Or maybe stars bloated by poor guiding/focus being read as stronger signal? Or a bug? I don't know.
Lately, I have been trying to combine two data sets of M33: one from a couple of years back and of lesser quality (shorter integration, worse FWHM due to seeing/focus/guiding, worse camera), and another from last new moon. I calibrated and stacked both sessions independently, removed the background gradients, and color-calibrated both stacks. Then I registered one to the other with no problems.
To my eyes, the newer stack looks better than the old one (sharper, more detail, and better low-scale contrast), although from a distance they both look similar (after an STF stretch). But when I try to combine them (using ImageIntegration with SNR weighting), I see higher weights on the older stack (say, 1.35 vs. 1), and I have tried many if not all of the scale estimators available in II.
Then I thought that maybe the fact that the old camera was 12-bit (Canon 1000D) and the new one is 14-bit (Canon 6D) was getting in the way. So I proceeded to apply a LinearFit of one stack to the other, as a sort of rescaling. Then ImageIntegration went bananas, attributing weights between 100 and 170 to the RGB channels of the worse stack!
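This is what confuses me most. As I understand it (a conceptual sketch only; the function below is mine, not PixInsight's LinearFit), a linear fit is just a per-channel least-squares match of one image to the other, and a multiplicative rescale stretches signal and noise by the same factor:

    import numpy as np

    def linear_fit_rescale(source, target):
        # Least-squares fit target ~ a + b*source over all pixels, then
        # return the rescaled source. A conceptual stand-in for LinearFit.
        x, y = source.ravel(), target.ravel()
        b, a = np.polyfit(x, y, 1)  # slope, intercept
        return a + b * source

Both the scale and the noise sigma of the fitted image get multiplied by the same slope b, so the ratio scale/noise, and with it any (scale/noise)^2 weight, should stay put. The fact that the weights exploded after the fit makes me suspect the noise estimation, which brings me to the next point.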
On close inspection of the console output, I see noise estimates an order of magnitude smaller than what I get with the NoiseEvaluation script, but only for the older (i.e., 1000D) stack.
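In case it helps, this is the kind of cross-check I have in mind (again just a sketch: the classic MAD-of-the-finest-wavelet-layer estimator, not the MRS method that I believe the script actually uses):

    import numpy as np
    import pywt

    def wavelet_noise_sigma(img):
        # Donoho's estimator: Gaussian noise sigma from the median absolute
        # value of the finest-scale diagonal wavelet detail coefficients.
        _, (_, _, cD) = pywt.dwt2(img, 'db1')
        return np.median(np.abs(cD)) / 0.6745

If the pixel distribution of the 1000D stack is unusual after the fit (e.g., 12-bit quantization steps stretched by the LinearFit slope), I could imagine estimators of this family disagreeing badly with each other.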
One last note: since ImageIntegration cannot work with only two images, I had to duplicate both entries in order for it to run. I hope this is not causing an unexpected problem.
I am uploading both stacks to the Endor server (under the folder "M33 SNR") so that you can try to replicate the problem and maybe help me understand what is going on. Be aware that each file is 119 MB.
Thanks,
Ignacio