Here it is the module and solver (code only, I'll compile them for windows and linux x64 tomorrow).
I ended putting an extra parameter to the module, to use Neumann padding (extending the boundary values) or Dirichlet padding (zero values). The second one yields better results for low amounts, and the first is better for large amounts.
Take into consideration that the result may also be dependent of the orientation of the image. Don't ask me exactly why
So, if you want to try different results, rotating is a good way to try new things.
I tested it with one astronomical image, and the results were awfull. I think that we'll need another way to build the mapping function (I have some ideas that I'll try tomorrow). For daylight images works like a charm.
Also, please note that in general best results are obtained with linear data.