Hi Peter,
The Richardson-Lucy and Van Cittert deconvolution algorithms are iterative, so they cannot be parallelized at a high level: each deconvolution iteration depends on the results of all previous iterations. The wavelet transforms used for deconvolution regularization and deringing cannot be parallelized at that level for the same reason: each wavelet layer depends on all previous layers.
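To make the dependency concrete, here is a minimal 1-D sketch of a plain Richardson-Lucy loop in Python. It is only an illustration under simplifying assumptions (direct convolution with zero padding, a flat initial guess, no regularization or deringing), not PixInsight's actual implementation; the names convolve_same and richardson_lucy are hypothetical.

```python
def convolve_same(signal, kernel):
    """Direct 1-D convolution, same length as the input, zero-padded edges.

    Assumes an odd-length kernel. Every output sample is computed
    independently of the others, so this is the part that can be spread
    across CPU threads or a GPU.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + half - j
            if 0 <= idx < n:
                acc += signal[idx] * k
        out[i] = acc
    return out


def richardson_lucy(observed, psf, iterations):
    """Unregularized Richardson-Lucy deconvolution (illustrative sketch)."""
    psf_mirror = psf[::-1]
    estimate = [1.0] * len(observed)  # flat initial guess (an assumption)
    eps = 1e-12                       # guards against division by zero
    # This outer loop is inherently sequential: iteration k needs the
    # estimate produced by iteration k-1 before it can start.
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)                    # parallelizable
        ratio = [o / (b + eps) for o, b in zip(observed, blurred)]  # parallelizable
        correction = convolve_same(ratio, psf_mirror)             # parallelizable
        estimate = [e * c for e, c in zip(estimate, correction)]  # parallelizable
    return estimate


# Usage: blur a single spike, then try to recover it.
psf = [0.25, 0.5, 0.25]
truth = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
observed = convolve_same(truth, psf)
restored = richardson_lucy(observed, psf, iterations=20)
```

Each line inside the loop can use many cores, but the loop itself can only advance one iteration at a time.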
The only operations in this process that can be parallelized efficiently are convolutions and arithmetic operations. However, since parallel execution cannot be sustained through most of the process, it cannot provide large performance benefits. A GPU-based implementation of convolutions will provide significant performance improvements in a new version of PixInsight to be released after the upcoming 1.8.8-6 version (probably 1.8.9, hopefully before the end of this year), but nothing really spectacular in this particular case.
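One way to see why the gain is limited is Amdahl's law, which bounds the overall speedup when only part of the run time can be parallelized. The sketch below is a generic illustration; the function name amdahl_speedup is hypothetical and the fractions in the usage lines are made-up values, not measurements from PixInsight.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Overall speedup predicted by Amdahl's law.

    parallel_fraction: share of the original run time that can run in
                       parallel (between 0 and 1).
    workers:           number of parallel execution units.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical fractions, for illustration only.
for fraction in (0.3, 0.5, 0.8):
    print(fraction, round(amdahl_speedup(fraction, workers=1000), 2))
```

However many workers are available, the speedup can never exceed 1 / (1 - parallel_fraction), so the sequential part of the process sets the ceiling.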