I'm looking for guidance on tuning PI and macOS Catalina so that my system stops crashing. If I happen to be working on the machine while it's processing then I can prevent a crash, but if I leave it overnight with a big job then I come back the next morning to the machine having restarted, which isn't good ;-). I'm not sure if this is a product defect or just something that needs tuning out, so I thought I'd put it under Bug Reports but feel free to dump it wherever is needed.
My machine is a 2019 16" MacBook Pro with 32 GB of memory, an octa-core i9 running at 2.4 GHz, and a 512 GB SSD. This is my primary work machine, and I just use it for PI on the side (as I don't have another machine with this kind of processing power). In general I am very happy running PI on it. I limit PI to 15 of the 16 available CPU threads so that I can keep working in email clients, Web browsers, terminal emulators etc., and PI honours that very well; even when it's running at 1500% CPU (cool!) the rest of the system stays responsive, so I can keep working. Since this is a work machine, I am unable to plug in external disk drives. I can use network-based storage, which I do to free up space on the local SSD as needed, but I cannot just plug in a 6 TB USB disk and be done with it; the USB ports on the machine are disabled by my work's hardening process and I cannot get around this.
I live in light-polluted London and am forced to take short exposures, so to get any meaningful integration I need to take a lot of images. I therefore routinely process large numbers of files (hundreds or even thousands). My primary imaging camera is currently an Altair 269C (the IMX269 sensor is pretty awesome, by the way), which produces ~40 MB files; my capture software, NINA, stores these natively as LZ4-HC compressed XISF files which take up ~30 MB each.
When going through an integration run (assume I've already prepared master calibration frames), I need to take my lights and calibrate them (one copy, which doubles the file size to 80 MB, or around 60 MB compressed), then debayer them (which triples the file size; we're now up to 240 MB, or 180 MB compressed), then go through Subframe Selector (another copy of the 240 MB files), then Star Alignment (another copy of the 240 MB files), then perform the actual integration. And that's skipping cosmetic correction and local normalisation (I don't consider myself proficient enough to bring those into my stacking workflow yet). So each sub actually needs around 840 MB of disk space, or 640 MB compressed. I run out of disk space very easily!
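To sanity-check those numbers, here's the back-of-envelope arithmetic (uncompressed figures; the compressed sizes scale the same way):

```python
# Rough per-sub disk budget, using the sizes quoted above (uncompressed).
raw_mb = 40                       # straight off the IMX269
calibrated_mb = raw_mb * 2        # calibration doubles the size -> 80 MB
debayered_mb = calibrated_mb * 3  # one CFA channel becomes three -> 240 MB

# On disk at once: the raw sub, its calibrated copy, plus three full-size
# copies (debayered, Subframe Selector output, Star Alignment output).
total_mb = raw_mb + calibrated_mb + 3 * debayered_mb
print(total_mb)  # 840
```

So a mere 100 subs wants ~84 GB of scratch space, before the OS and PI's temporary files take their cut of a 512 GB disk.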
When trying to integrate large numbers of files, the PI process grows to a huge memory size; I've seen 80 GB. Obviously that doesn't all fit in my 32 GB of RAM, so it uses virtual memory, which macOS is pretty good at. However, this means macOS virtual memory is fighting for disk space with my image files. In addition, PI itself stores a bunch of temporary files, and these are also fighting for disk space.
So -- we know so far that I am constrained on disk space. However, what happens if I start running out of disk space? From my observations, one of three things:
1. If PI "notices" that I am out of disk space first -- say it happens to be trying to write a file -- then it can handle it, and the way it does so can be configured. I usually have it set to ask the user, meaning it pauses what it was doing and throws up an Abort or Ignore prompt; I can then clear up disk space, abort, and restart. (A "Retry" option would be great, but I can live without it.)
2. If the operating system's virtual memory runtime "notices" first, then it throws up a window saying "Your system has run out of memory" and offers me the opportunity to kill running processes. The first few times this happened I was very confused, because when I looked at the list of running processes at that point, yes, PI was taking a huge amount of memory, but the OS showed all of that as paged memory and usually showed a good 20 GB of memory free! It took me a while to realise that what it actually means is that the system has run out of disk space, so virtual memory can no longer accommodate further growth of the huge PI process. Note that if I don't clear up disk space at this point, the machine crashes very soon afterwards.
3. If neither of these "notice" that I am out of disk space, the machine just crashes. This has happened to me twice while I was using the machine (editing documents, browsing the Web, etc) while PI was working away in the background. Boom, just turns itself off and restarts. I've also had multiple crashes where I leave PI running overnight, but I don't know if those were case 2 or case 3 (since I'm not in front of the computer I don't know if it tried to tell me that it's running out of memory or not before rebooting the whole machine!).
Note that I do try to be careful with my disk space. I use a network drive to store images as I go through the steps (I'm not trying to do all of this with the automated scripts; I do each stage in turn and copy off or delete the files from previous stages). However, while processing I still need, at an absolute minimum, around half a GB of disk space per image (e.g. in Subframe Selector or Star Alignment I need a 240 MB input image and a 240 MB output image). While integrating, at a minimum I need 240 MB per image, or 180 MB if compressed. On top of that, I try to leave "enough" space for the operating system's virtual memory, for PI's temporary files, and for anything else that might be needed when I perform one of these large operations.
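Put another way, before kicking off a big batch I could estimate how many subs the free space can accommodate. A hypothetical helper (the names and the 40 GB reserve are my own guesses, to be tuned):

```python
import shutil

def max_subs(free_bytes, per_sub_mb=480, reserve_gb=40):
    """Rough count of subs that fit in the given free disk space.

    per_sub_mb=480 assumes a 240 MB input plus a 240 MB output per stage;
    reserve_gb keeps headroom for macOS swap and PI's temporary files.
    """
    usable = free_bytes - reserve_gb * 1024**3
    return max(0, usable // (per_sub_mb * 1024**2))

# e.g. with 100 GB free on the SSD:
print(max_subs(100 * 1024**3))  # 128
```

`shutil.disk_usage("/").free` supplies the live figure for the boot volume.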
I consider it a bug that, rather than hitting an "Out of Memory" error and crashing (just the PI process), PI seems to keep trying to run at all costs, to the extent of crashing the whole machine. I don't know if that's a "product bug", or just an artefact of how macOS works, or if by being storage constrained rather than physical-memory constrained I'm just an edge case, or what, but clearly this behaviour is wrong.
So, what I am looking for is some kind of explanation of how I can get around these crashes. I need a solution that is workable within my constraints (in particular, one that doesn't require adding more storage to the machine, which I cannot do). Is there a way of limiting the memory size of the PI process, either technically (by somehow telling it "don't use more than X GB") or through configuration ("when you do your thing, do less of Y", which will be more CPU-intensive but use less memory)? Is there a way of telling macOS to just fail a memory allocation when it's running low on space for virtual memory, rather than letting the process grow to 80 GB or more? Is there a way to stop the file sizes growing so much? (I don't understand much about storage of astronomical data; I use compressed XISF because NINA supports it natively, but 40 MB of data turning into 240 MB files seems somewhat excessive!) Can I tell PI not to use "temporary files" at all? How can I lower memory (and thereby disk) usage at the cost of more CPU usage? Can I make this fail in a better way (e.g. if this situation happens, PI stops processing and calmly tells me "get more disk space")? What other suggestions or solutions can you offer?
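For what it's worth, as a crude stopgap I've been considering scripting a little watchdog myself, so I get warned before the virtual memory runtime does. A sketch (the threshold and the notification text are placeholders, and the `osascript` notification is macOS-only):

```python
import shutil
import subprocess
import sys

LOW_WATER_GB = 25   # hypothetical threshold; I'd tune this to my workflow

def free_gb(path="/"):
    """Free space on the volume containing `path`, in GB."""
    return shutil.disk_usage(path).free / 1024**3

def check_disk(path="/", low_water_gb=LOW_WATER_GB, notify=True):
    """One watchdog pass: warn (and optionally pop a macOS notification)
    when free space drops below the threshold. Returns True if low."""
    low = free_gb(path) < low_water_gb
    if low:
        print(f"WARNING: only {free_gb(path):.1f} GB free", file=sys.stderr)
        if notify:
            # macOS-only; relies on the stock osascript tool
            subprocess.run(["osascript", "-e",
                'display notification "Disk nearly full" with title "PI watchdog"'])
    return low

# Run from launchd/cron every minute, or wrap in a sleep loop while PI works.
```

That doesn't fix anything, of course; it just turns case 2/3 back into case 1, where I can intervene.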
Many thanks,
Frustrated of London.