What I have read, and what makes sense to me, is to normalize the long-duration frame to the short one. So if your short exposure is 60 seconds and your long exposure is 600 seconds, you divide every pixel of the long-duration frame by 10. Then you can combine both frames (after each has been corrected for dark, etc.). This way you do not emphasize noise.
Doing the reverse, multiplying the short-duration frame by a factor of 10, would boost the random (and other) noise in that frame by the same factor.
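To make the scaling concrete, here is a minimal Python/NumPy sketch of the normalization described above. The function names (`normalize_to_short`, `combine`) are made up for illustration, the frames are assumed to be already dark-corrected arrays of the same shape, and the plain average is only one assumed way to "combine" them; a weighted mean or a stacking tool's own method may suit real data better.

```python
import numpy as np

def normalize_to_short(long_frame, t_long, t_short):
    """Scale a calibrated long exposure down to the level of the short one."""
    # For 600 s vs 60 s this multiplies by 0.1, i.e. divides by 10.
    return long_frame.astype(np.float64) * (t_short / t_long)

def combine(short_frame, long_frame, t_short, t_long):
    """Average the short frame with the rescaled long frame (assumed method)."""
    scaled_long = normalize_to_short(long_frame, t_long, t_short)
    return 0.5 * (short_frame.astype(np.float64) + scaled_long)

# Toy frames: the long exposure collects 10x the signal of the short one.
short = np.full((2, 2), 100.0)
long_ = np.full((2, 2), 1000.0)
combined = combine(short, long_, t_short=60.0, t_long=600.0)
```

With these toy values, the rescaled long frame sits at the same level as the short frame, so the combined frame keeps the short frame's scale.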
Of course, as an additional step you may also want to mask any areas of the long-exposure frame that are overexposed, but I have never done that myself.
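I have not tried it, but the masking step could look something like the sketch below. It is only an illustration: `combine_with_mask` is a hypothetical helper, `saturation_level` is an assumed threshold you would have to pick for your own camera, and falling back to the short frame alone in saturated areas is one possible choice, not an established recipe.

```python
import numpy as np

def combine_with_mask(short_frame, long_frame, t_short, t_long, saturation_level):
    """Average the frames, but use only the short frame where the long one is saturated."""
    short = short_frame.astype(np.float64)
    long_ = long_frame.astype(np.float64)
    # Flag overexposed pixels in the long frame before rescaling it.
    saturated = long_ >= saturation_level
    scaled_long = long_ * (t_short / t_long)
    averaged = 0.5 * (short + scaled_long)
    # Where the long frame is saturated, keep the short-frame value instead.
    return np.where(saturated, short, averaged)

# Toy example: the second pixel is clipped in the long exposure.
short = np.array([[100.0, 6000.0]])
long_ = np.array([[1000.0, 65535.0]])
result = combine_with_mask(short, long_, 60.0, 600.0, saturation_level=60000.0)
```

Here the first pixel is the average of both frames, while the second comes from the short exposure only, since the long exposure carries no usable information there.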