IMHO the technical reasons to bin are 1) matching your image scale to your seeing/sky conditions and 2) boosting SNR, provided that hardware binning actually delivers an effective reduction in read noise.
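to make #2 concrete, here's a rough back-of-the-envelope sketch (the numbers are made up, not from any particular camera) of why on-chip binning can help when read noise dominates:

import math

read_noise = 5.0          # e- per read, hypothetical value
signal_per_pixel = 10.0   # e- of faint target signal per pixel per sub

# hardware 2x2 bin: charge from 4 pixels is summed on-chip and read out once
hw_snr = (4 * signal_per_pixel) / math.sqrt(4 * signal_per_pixel + read_noise**2)

# software 2x2 bin: 4 pixels are read out individually, then summed in post
sw_snr = (4 * signal_per_pixel) / math.sqrt(4 * signal_per_pixel + 4 * read_noise**2)

print(hw_snr, sw_snr)     # ~4.96 vs ~3.38 -- hardware bin wins when read noise dominates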
along the way someone realized that since the human eye perceives sharpness mostly from an image's luminance (L) component, why not bin the color frames? this assumes #2 is true, or else there's not much point. the idea is that if you are collecting LRGB frames, you capture the L at the seeing limit and then merge it with the lower-resolution color image, which in theory can get away with less total integration time since binning has already boosted its SNR.
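for what it's worth, the merge step mostly amounts to resampling the binned color back up to the L scale before the LRGB combine - a minimal numpy sketch (array names and sizes are hypothetical, and real software would do proper registration and interpolation rather than nearest-neighbour):

import numpy as np

# binned 2x2 color channel (512x512) and seeing-limited luminance (1024x1024)
rgb_bin2 = np.random.rand(512, 512)
lum = np.random.rand(1024, 1024)

# nearest-neighbour upsample of the binned channel to the L scale
rgb_up = np.repeat(np.repeat(rgb_bin2, 2, axis=0), 2, axis=1)

assert rgb_up.shape == lum.shape  # now the color can be combined with the sharp L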
i guess the counter-argument is that unless your sky is very dark, or you are doing narrowband imaging, you are very unlikely to be read-noise limited. the background sky signal is far above the read noise anyway, so the SNR boost from hardware binning is effectively the same as what software binning gives you. apparently some sensors also do not show any SNR advantage from hardware binning at all - typically because the binning happens digitally after readout (common on CMOS), so the read noise is not amortized over all the pixels in the bin.
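same toy numbers as above, but with a bright sky added to the noise term (again, hypothetical values) - the two binning modes come out nearly identical once the sky swamps the read noise:

import math

read_noise = 5.0
signal_per_pixel = 10.0
sky_per_pixel = 500.0    # e- of sky background per pixel per sub (bright suburban sky)

hw_snr = (4 * signal_per_pixel) / math.sqrt(4 * (signal_per_pixel + sky_per_pixel) + read_noise**2)
sw_snr = (4 * signal_per_pixel) / math.sqrt(4 * (signal_per_pixel + sky_per_pixel) + 4 * read_noise**2)

print(hw_snr, sw_snr)    # ~0.88 vs ~0.86 -- barely any difference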
the advantage of using bin 1x1 everywhere (again, assuming bin 1x1 does not wildly oversample your sky) is that you can use your RGB frames to construct the L frame. or, if you are also taking separate L frames, use the RGB frames together with your L frames to boost the SNR of the final L. with bin 2x2 RGB this would not be a good idea, since folding the lower-resolution color data into the L would soften it.
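a minimal sketch of that idea, assuming calibrated, registered 1x1 stacks of roughly equal depth (a real combine would weight each frame by its measured noise rather than using fixed weights):

import numpy as np

def synthetic_lum(r, g, b):
    # simple average of the color channels; weights could follow filter bandwidth / QE
    return (r + g + b) / 3.0

def boosted_lum(l, r, g, b, w_l=0.5):
    # blend the real L with the synthetic L from RGB to improve the final L's SNR
    return w_l * l + (1.0 - w_l) * synthetic_lum(r, g, b)

# toy usage with random stand-in frames
r, g, b, l = (np.random.rand(1024, 1024) for _ in range(4))
final_l = boosted_lum(l, r, g, b)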
omitting the L filter and making a pseudo-luminance frame from RGB may be advantageous in high-LP environments - in some (most?) filter sets there are gaps between the R, G and B passbands which in theory reject some of the wavelengths emitted by artificial light sources. of course nowadays with these white LEDs we are probably doomed, as the LP signature is going to become more and more broad-spectrum.
rob