yeah ADU means "analog-digital unit" and you sometimes see DN (for "data number") instead. of course the signal received by the sensor is analog in nature (a voltage is created as photons knock electrons off of atoms in the sensor). that analog voltage is measured with an analog-digital converter. this is where the number of bits comes in, since the output of the A/D converter is a binary number. some sensors have 12-bit A/D converters, some 14, and some 16. incidentally the gain of the sensor is the relationship between the number of electrons collected and the number of ADU the A/D converter reports. so if you see that a sensor has a gain of 0.33 ADU per electron then you need about 3 electrons to register 1 ADU (gain is often quoted the other way around, in electrons per ADU, so check the units). on a DSLR the ISO setting is essentially the analog gain we're talking about here.
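to make the gain arithmetic concrete, here is a minimal sketch. it assumes the gain is quoted in ADU per electron, and the function names are made up just for illustration:

```python
def electrons_per_adu(gain_adu_per_e):
    """How many electrons it takes to move the A/D converter by 1 ADU.

    Assumes gain is quoted in ADU per electron; if your spec sheet
    quotes electrons per ADU, the number is already what you want.
    """
    return 1.0 / gain_adu_per_e


def electrons_to_adu(electrons, gain_adu_per_e):
    """Ideal (noise-free, unrounded) ADU value for a given electron count."""
    return electrons * gain_adu_per_e


# a gain of 0.33 ADU/e- means roughly 3 electrons per ADU
print(round(electrons_per_adu(0.33), 2))
```

a real converter only reports whole numbers and adds an offset and noise, so treat this as the idealized relationship only.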
since 16 bits is a very convenient size for computers (2 bytes), almost all capture programs output frames as 16-bit integer samples. so the topmost bits of the 16-bit integers are 0 when the A/D converter width is less than 16 bits. that's the origin of the math above. a 14-bit sensor can only output numbers between 0 and 16383 (that's 2^14 - 1) because the two topmost bits of the 16-bit words are always 0.
internally, PI represents all numerical data as floating point numbers in the 0.0-1.0 range. so if you saturated a flat from a 14-bit camera, the ADU value everywhere in the flat would be 16383, and since the file format is 16-bit integer (the biggest 16-bit number is 65535), the corresponding value in PI would be 16383/65535, or about 0.25. if you forget about trying to display 16-bit statistics in PI, a good "brightest" value for a DSLR flat would be 0.125, which is 1/2 the "full-well" (saturated) value of 0.25 (or 16383 ADU). the 0.125 value equates to about 8192 ADU.
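the conversion between ADU and the 0-1 numbers is just a division by 65535. a small sketch of that math follows; the helper names are hypothetical, and it assumes the capture program stores the 14-bit values unscaled in a 16-bit file, as described above:

```python
def max_adu(bits):
    """Largest value an A/D converter of the given width can output."""
    return 2 ** bits - 1


def adu_to_normalized(adu, container_bits=16):
    """Map an ADU value stored in a 16-bit file onto the 0.0-1.0 range."""
    return adu / max_adu(container_bits)


def normalized_to_adu(value, container_bits=16):
    """Map a 0.0-1.0 value back to the nearest ADU in the 16-bit file."""
    return round(value * max_adu(container_bits))


saturation = max_adu(14)                     # 16383 for a 14-bit sensor
print(saturation, adu_to_normalized(saturation))
print(normalized_to_adu(0.125))              # the flat target in ADU
```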
as for mean vs. median, those are exactly the same as the mean and median in math. the mean is the average value of all the pixels in the image. the median is the middle value: half the pixels are brighter than the median and half are dimmer. note that if you go for a median value of 8192 ADU then the brightest parts of the flat are going to be greater than 8192 ADU. this is probably OK but it is worth checking that the brightest part (usually the center of the flat) is not too bright.
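here is a toy illustration of why the maximum matters even when the median is on target. the pixel values are invented (dim corners, bright center), not measurements from a real flat:

```python
from statistics import mean, median

SATURATION_ADU = 16383  # 14-bit sensor, as discussed above


def flat_stats(pixels):
    """Mean, median and maximum of a list of pixel values in ADU."""
    return {"mean": mean(pixels), "median": median(pixels), "max": max(pixels)}


# made-up values: vignetted corners are dim, the center is bright
pixels = [6000, 7000, 8000, 8192, 8500, 9000, 9500]
stats = flat_stats(pixels)

print(stats)
print("headroom below saturation:", SATURATION_ADU - stats["max"])
```

the median lands on 8192 while the brightest pixel sits well above it, which is exactly the thing to check on a real flat.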
BYE does a virtual "back of camera" stretch on the data, so the histogram it shows looks as though the data has been stretched. BYE doesn't actually stretch the data; it's really just the same as PI's STF ("screen transfer function", aka a "screen stretch"). additionally BYE is normalizing the histogram to 8 bits per channel, maybe because that's what a JPEG image is. you actually have 14 bits per channel in the canon sensors. bottom line is that for flats i wouldn't try to measure them with BYE's histogram. a good flat is going to look overexposed there for sure. just take some test flat exposures of varying lengths, go into PI, find the one where the brightest part is near 0.125, and use that exposure length. when i used a DSLR i'd just set it to Av with +2 EV and that seemed to be about right.
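picking the test exposure is just "closest to 0.125 wins". a minimal sketch, assuming you've written down the brightest value PI reports for each test frame; the exposure times and readings below are invented:

```python
TARGET = 0.125  # half of the ~0.25 saturation level for a 14-bit sensor


def pick_exposure(measured, target=TARGET):
    """Return the exposure whose brightest value is closest to the target.

    `measured` maps exposure length in seconds to the brightest value
    read off in PI on the 0.0-1.0 scale.
    """
    return min(measured, key=lambda exposure: abs(measured[exposure] - target))


# hypothetical readings from four test flats
measured = {0.5: 0.04, 1.0: 0.08, 1.6: 0.13, 2.5: 0.20}
best = pick_exposure(measured)
print(best)
```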
note that with any OSC (one-shot color) camera you will sometimes get color casts in the flats. this is not a problem, unless your dim channel is underexposed compared to the brighter ones. in this case you need to take more flats than otherwise necessary to make sure the SNR of the dim channel is sufficient. or, you can try to fix the cast with a colored T-shirt, assuming you are doing T-shirt flats. these casts can be pretty significant when you use LP filters like the CLS or IDAS LP filters...
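for a rough idea of how many extra flats the dim channel needs, here is a back-of-envelope sketch. it assumes the flats are shot-noise limited, so the SNR of the stacked flat goes as the square root of (signal x number of frames); read noise and everything else is ignored, and the function name is made up:

```python
import math


def flats_needed(base_count, bright_level, dim_level):
    """Frames needed for the dim channel to reach the SNR the bright
    channel would have with `base_count` frames.

    Assumes shot-noise-limited flats: SNR ~ sqrt(signal * frames), so
    the frame count scales with the ratio of the two signal levels.
    """
    return math.ceil(base_count * bright_level / dim_level)


# hypothetical channel levels in ADU: the dim channel is half as bright
print(flats_needed(20, 8000, 4000))
```

treat the result as a starting point, not a rule.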
rob