Hi Larry
There are two "topics" that deal with background modeling (PixInsight's ABE and DBE processes). One of them is flat calibration and the other is gradient removal. I'll explain that a little further, to be clear.
First of all, a flat field is a calibration frame that serves as a "precise" model of the following effects: uneven illumination of the sensor due to optical flaws, obstructions, etc.; the different quantum efficiency of each pixel; and so on. All of these are multiplicative effects. To calibrate the image properly you need to divide by the flat frame, right after bias and dark subtraction, and this is one of the very first steps of the processing workflow.
Please note that a synthetic flat is very unlikely to reproduce all of these effects, especially obstructions due to dust particles, sharp edges, or the differential response of each pixel. So use this alternative only as a last resort.
Sky gradients, on the other hand, are usually a signal lying on top of the data, or more precisely, an additive term. To remove them, we must subtract the model. In most cases, sky gradients (due to light pollution, different altitude, etc.) are very smooth, so our implementations do a very good job with them.
Now, let's go a bit deeper into how to apply those two kinds of corrections. We said that the image is divided by the flat. But in most cases this operation is not applied directly. Flat frames are first normalized (their data is linear, so any multiplying factor is "harmless") so that their median value is 1. When you apply this normalized flat, some parts of the image will be darkened and others brightened, but as a whole the image will keep roughly the same values.
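To make the normalization step concrete, here is a minimal NumPy sketch. It is only an illustration, not PixInsight's actual code: the function name `apply_flat` and the synthetic vignetting pattern are made up for the example, and the light frame is assumed to be already bias and dark subtracted.

```python
import numpy as np

def apply_flat(light, flat):
    """Divide a light frame by a flat normalized so that its median is 1.

    Hypothetical helper for illustration; assumes `light` is already
    bias/dark subtracted and that `flat` has no zero pixels.
    """
    flat = np.asarray(flat, dtype=np.float64)
    norm_flat = flat / np.median(flat)  # median of norm_flat is now 1
    return np.asarray(light, dtype=np.float64) / norm_flat

# Synthetic test data: a uniform sky of 100 seen through a vignetting pattern.
y, x = np.mgrid[-1:1:5j, -1:1:5j]
vignette = 1.0 - 0.3 * (x**2 + y**2)   # brighter in the center, darker in the corners
light = 100.0 * vignette
flat = 20000.0 * vignette              # same pattern, arbitrary exposure level

corrected = apply_flat(light, flat)
rescaled = apply_flat(light, 3.0 * flat)  # a flat three times brighter
```

In this toy case `corrected` comes out uniform, and `rescaled` is identical to it, which is the sense in which a multiplying factor on the flat is "harmless" once the flat is normalized.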
On the other hand, gradient removal should ideally use no normalization, so it is a straight subtraction. But in some cases there is a risk of clipping data in the shadows, so a small bias (offset) value is added, or we simply add the model's median value back after the subtraction.
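The subtraction case can be sketched the same way. Again, this is just an assumption-laden illustration and not what ABE or DBE do internally: `subtract_gradient` and its `add_median` flag are hypothetical names, and the background model is deliberately made a bit too bright so that the clipping risk shows up.

```python
import numpy as np

def subtract_gradient(image, model, add_median=True):
    """Subtract an additive background model from a linear image.

    Hypothetical helper for illustration. With add_median=True the
    model's median is added back as a pedestal, so that the shadows
    do not end up below zero.
    """
    image = np.asarray(image, dtype=np.float64)
    model = np.asarray(model, dtype=np.float64)
    pedestal = np.median(model) if add_median else 0.0
    return image - model + pedestal

# Synthetic test data: a sky level of 0.05 plus a left-to-right gradient.
ramp = np.tile(np.linspace(0.0, 0.1, 5), (5, 1))
image = 0.05 + ramp
model = 0.06 + ramp   # a model that slightly overestimates the background

plain = subtract_gradient(image, model, add_median=False)  # straight subtraction
safe = subtract_gradient(image, model)                     # median added back
```

Here `plain` is negative everywhere, so it would be clipped when stored in a [0,1] range, while `safe` is flat and positive. The gradient is removed in both cases; only the overall level changes.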
As you know, both treatments should be applied to linear data, in the first steps. Subtracting the gradients after RGB combination is OK, but try to do so before any nonlinear stretch. I don't think there is much difference from doing it before the channel combination, other than being a little more "elegant". Having said that, if you perform an LRGB combination or a similar technique, I think it would be better to perform the gradient subtraction before it, to achieve better data (from a strict point of view... in my opinion, in most cases the gain is not noticeable, since these operations are linear too, so there should be no big difference, "but" gradients may become wilder and harder to model).
Hope this helps... in the end, it is just a matter of choice, so see what works better for you.