Experimental TensorFlow GPU acceleration repository

rcroman

I finally got around to preparing a repository containing the CUDA/cuDNN software libraries needed to enable GPU acceleration of AI-based tools. This didn't use to be possible due to onerous license restrictions, but those have since been relaxed a bit.

This is for Windows only at the moment: if this goes well I'll work on the Linux version. This isn't needed for macOS users – the "CoreML" library provided by Apple is used, so Mac users with capable hardware get GPU acceleration of RC Astro tools out of the box.

THIS IS EXPERIMENTAL – please READ this post completely, and proceed at your own risk. I can only test on a small number of hardware configs. I recommend backing up your PixInsight installation so you can roll back if it goes sideways. At the very least make a copy of the existing tensorflow.dll file in PI's bin directory – putting that back and restarting PI should effectively revert.
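
A minimal sketch of that tensorflow.dll backup step, for anyone who prefers a script to a manual copy. The install path `C:\Program Files\PixInsight\bin`, the backup folder name, and the helper names `backup_file` and `restore_file` are assumptions for illustration; adjust them to your own system.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Assumption: default PixInsight install location. Change this if yours differs.
PI_BIN = Path(r"C:\Program Files\PixInsight\bin")


def backup_file(src: Path, backup_dir: Path) -> Path:
    """Copy src into backup_dir under a timestamped name and return the new path."""
    if not src.is_file():
        raise FileNotFoundError(f"Nothing to back up: {src}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = backup_dir / f"{src.stem}.{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest


def restore_file(backup: Path, target: Path) -> None:
    """Put a saved copy back in place. Do this with PixInsight closed."""
    shutil.copy2(backup, target)


if __name__ == "__main__":
    dll = PI_BIN / "tensorflow.dll"
    if dll.is_file():
        saved = backup_file(dll, Path.home() / "pi_tensorflow_backup")
        print(f"Saved {dll} -> {saved}")
    else:
        print(f"{dll} not found; edit PI_BIN to match your installation.")
```

Writing into `C:\Program Files` usually needs an elevated prompt, so a restore may have to be run as administrator.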

There is a second repository to revert to CPU-only operation if something goes wrong. In the worst case, you may have to re-install PixInsight. Please roll with this if it happens, and provide kind feedback so I can try to fix it.

You need a capable NVIDIA GPU. This means one with ≥ 2GB of RAM and compute capability ≥ 3.5. Check this NVIDIA page if in doubt about your GPU's capabilities. You'll need at least several GB of free disk space.
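
One way to check those two thresholds from a script is sketched below. It assumes `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field (older drivers may not). The helper names are hypothetical; the 2 GB and 3.5 limits are the ones quoted above.

```python
import csv
import io
import subprocess

MIN_MEMORY_MIB = 2 * 1024  # ">= 2GB of RAM" from the post
MIN_COMPUTE_CAP = (3, 5)   # "compute capability >= 3.5" from the post


def parse_gpu_rows(text):
    """Parse 'name, memory_mib, compute_cap' CSV rows into a list of dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 3:
            continue  # blank or unexpected line
        name, mem, cap = (field.strip() for field in row)
        try:
            major, minor = cap.split(".")
            gpus.append({
                "name": name,
                "memory_mib": int(mem),
                "compute_cap": (int(major), int(minor)),
            })
        except ValueError:
            continue  # a field was not numeric (for example "[N/A]")
    return gpus


def meets_requirements(gpu):
    """True if the GPU satisfies both the memory and compute-capability limits."""
    return (gpu["memory_mib"] >= MIN_MEMORY_MIB
            and gpu["compute_cap"] >= MIN_COMPUTE_CAP)


def query_gpus():
    """Ask nvidia-smi for name, total memory (MiB), and compute capability."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,memory.total,compute_cap",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            verdict = "OK" if meets_requirements(gpu) else "below the stated minimum"
            print(f"{gpu['name']}: {verdict}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
```

Compute capability is compared as a `(major, minor)` tuple rather than a float so that values such as 10.0 sort correctly.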

The GPU repository installs version 2.10 of the "GPU-only" version of tensorflow.dll. On the TensorFlow project page, it clearly says "GPU only," but in my testing it ran in CPU-only mode just fine if there was no GPU installed. I have no idea if this will be the case on every machine. Along with the TensorFlow library, the necessary CUDA/cuDNN libraries (version 11.8) and the zlib compression library (version 1.2.3.0) are installed.

The CPU repository installs version 2.10 of the CPU-only build of tensorflow.dll.

These libraries are all the property of the respective organizations that created them. They are distributed via these repositories for use with PixInsight/RC Astro tools only in accordance with relevant license agreements. You can find these agreements in PI's etc/legal/licenses directory after installation.

To proceed, add the following to your PI repository list using Resources -> Updates -> Manage Repositories:


Then run Resources -> Updates -> Check for Updates, proceed with the download (be patient – it's a 1.5GB package), and then quit PI to complete the update. You should now enjoy GPU acceleration of RC Astro tools. This should "just work." No environment variables to set unless you want to constrain TensorFlow to only use as much GPU memory as needed (see the notes about TF_FORCE_GPU_ALLOW_GROWTH here). I've had limited success getting that setting to actually work – TensorFlow is very GPU-memory-grabby.
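
For those who do want to try TF_FORCE_GPU_ALLOW_GROWTH, here is a small sketch that sets the variable only for a PixInsight process launched from the script, leaving the rest of the system untouched. The executable path and the helper name `env_with_gpu_growth` are assumptions; the variable name and the value `true` come from the TensorFlow GPU guide.

```python
import os
import subprocess
from pathlib import Path

# Assumption: default install location of the PixInsight executable.
PIXINSIGHT_EXE = Path(r"C:\Program Files\PixInsight\bin\PixInsight.exe")


def env_with_gpu_growth(base=None):
    """Return a copy of the environment with TF_FORCE_GPU_ALLOW_GROWTH enabled."""
    env = dict(os.environ if base is None else base)
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env


if __name__ == "__main__":
    if PIXINSIGHT_EXE.is_file():
        # The child process inherits the modified environment; nothing is
        # changed globally, so closing PixInsight undoes the setting.
        subprocess.Popen([str(PIXINSIGHT_EXE)], env=env_with_gpu_growth())
    else:
        print(f"{PIXINSIGHT_EXE} not found; edit PIXINSIGHT_EXE to match your installation.")
```

Setting the variable through the Windows system settings would have the same effect for every launch. As noted above, the setting does not always appear to take hold.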

If it doesn't work – running BXT/NXT/SXT causes a crash or doesn't use the GPU – let me know your hardware details, PI version, OS version, etc., and the full text of any error messages. If you manage to get things working where others have not, please share.

You can revert to the CPU-only tensorflow.dll by changing the repository entry to


and going through the update process again. If you want to revert yet again to the GPU version, you'll need to change the repo entry again AND do a Resources -> Updates -> Reset Updates.

Thanks for testing... let me know how it goes!
 
Russ...truly appreciate your work!

I can confirm that the process described above worked for me: Win 11 Pro 23H2 22631.2861, Ryzen 9 5950X, GTX 1080Ti 546.17
 
This is excellent! Are there any plans for AMD in the future? Or is this entirely up to the tensorflow devs?
 
This unfortunately is either up to the TF folks or Microsoft's DirectML effort, and/or maybe OpenCL. TF seems thoroughly wedded to NVIDIA at the moment, and Microsoft hasn't updated DirectML in over a year. For better or worse, NVIDIA is basically the only game in town when it comes to AI on Windows and Linux boxes for the time being. They basically deserve it: they made a massive investment in CUDA starting many years ago to enable devs to do general-purpose compute on their hardware (very smart move), and now they're reaping the rewards. Can't just be a chip company anymore and build great hardware – gotta have the software stack to go with it. AMD is playing catch-up.

Apple has their own (very good) thing going GPU-wise and made the smart move to make it relatively easy to port an AI model to their platform.

It'll eventually settle out – NVIDIA's competitors aren't going to let them keep making 70% gross margin (!) forever.
 
From what I learned in my cursory dive down this rabbit hole, there is AMD support on Linux using ROCm in TensorFlow. The word is it's "coming soon" for Windows. Which obviously means nothing, but I guess I just wanted to clarify this didn't require any effort on your part and we can just wait and see.

Seems to me that if the day does come, we'll get a tensorflow update via this new repo, and it'll all just magically work!
 
Hi Russ

Worked perfectly on my Dell laptop with Windows 11, an RTX 3050, and an 11th-gen i7 processor.

As usual, excellent work.

many thanks

Harry
 
Omg, what an update! It probably sped up BlurXTerminator and Starnet++ v2 by up to 20x (I didn't measure, but that is how it feels). Previously it took minutes (up to 5 minutes for BXT, depending on file resolution), and now it takes 30 seconds!

And it is now possible to use BXT real-time preview mode without waiting forever even for a small portion of the source file!

Ryzen 9 3900X
64 GB DDR4
RTX 3090
 
Rip Snorting Fast
Installed using the update repositories described by Russell above...
Still testing all of the processes and scripts since installing latest version of PI and completing the updates to the repositories...

Tested BXT on a 2x Drizzle image of the Horsehead, 368 MB (386,655,056 bytes).
The image had lots of issues, and BXT made dramatic improvements. The issues came from the operator and the data collection, not from BXT. I will re-run it on some better data...

Here's the process log and speed results
BlurXTerminator: Processing view: BXT_Test2
Writing swap files...
348.114 MiB/s
RC-Astro BlurXTerminator, version 2.0.0, BlurXTerminator.4.pb
Initializing...
Processing: done
33.096 s

Device name: XPS8940_HVSYQJ3
Processor: 11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz 2.50 GHz
Installed RAM: 80.0 GB (79.6 GB usable)
System type: 64-bit operating system, x64-based processor
Pen and touch: No pen or touch input is available for this display
See attached image for GPU information and utilization with test run.
GPU: NVIDIA GeForce RTX 3060, 12 GB
Driver version: 30.0.14.7212 (9/13/2021)

Edition: Windows 11 Pro
Version: 22H2
Installed on: 7/22/2023
OS build: 22621.2861
Experience: Windows Feature Experience Pack 1000.22681.1000.0
 

Attachment: GPU Screenshot 2023-12-18 171929.png (11.3 KB)

Hi Russ.....

I was just beginning to go through the process of doing this manually a while back and resumed the endeavor yesterday, when I saw your recent note at the top of your instruction list and said to myself, "Thank you again, Russ." So I added the URL, everything worked beautifully, and the speed increased enormously. I am using Windows 10 with an NVIDIA GeForce RTX 3070.

Also, the update on BXT is awesome and I just did a wide field of the Orion Complex with a Rokinon 135 lens and had some star coma in corners. I used the "correct only" mode in BXT and those stars went from elongated oblongs to circular pinpoints. Wonderful! I watched the video Adam Block made of his interview with you and that's where I found how to correct these stars by using BXT very early on in the post processing.....even before DBE and SPCC.

Also, I watched a video of HDRMT regarding color correction on Adam Block's Studio. I recall it was regarding OSC?.... but you had written a script file and I downloaded it, but it wouldn't run. Then Adam informed me this script had been incorporated in PI's HDRMT in a box called Intensity. I just tried that last night and it worked beautifully. I was able to bring out the hourglass in M8 from a very short integration time of a wide field session of M8 and M20 shot with my Canon Ra.

Awesome work Russ....thank you so much.
 
Hello Russ,
I have had this set up the old-fashioned way, and it was working great. I updated to the newest version of PI, 1.8.9-2, and lost my GPU acceleration. I gave your suggestion a try with my ASUS i5 6400 with 16 GB RAM, and it is still a no-go. I don't know what the next step would be from your standpoint, so I'm asking for a bit of guidance before I start the process of doing it the "OLD" way.

Here is a screenshot of my system properties. If you need more info please let me know and I'll do my best to get it to you.

Full disclosure: I'm not the most tech-savvy person, that's for sure, but I do try my best to understand how and what to do with guidance.

I look forward to hearing your response, and I hope we can help others as well as myself that may have the same issue I'm having.

Thanks!
Dale
 

Attachment: PI GPU Issue.JPG (42.9 KB)

A new install of PI will not alter your CUDA installation at all. It only replaces the GPU tensorflow dll with the CPU one. So only that one file needs to be restored. I also have had CUDA acceleration enabled for a long time, so haven't loaded this repository. I assume that doing so is safe in an environment where CUDA is already installed? That is, I assume this approach isn't just for clean installs, but doesn't mess up existing ones?
 
Which GPU do you have?
 
If you have GPU acceleration working, I would leave it alone. It should be okay, but you will end up with two copies of the CUDA/cuDNN libraries on your machine, and any version conflicts could cause some ugliness.
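
To see whether a machine already carries more than one copy of these libraries, a rough scan along these lines may help. It is only a sketch: the file-name patterns are the usual Windows names for the CUDA runtime, cuBLAS, and cuDNN, the PixInsight bin path is an assumed default, and the helper names are hypothetical.

```python
import os
from collections import defaultdict
from pathlib import Path

# Assumption: common Windows file-name patterns for the CUDA runtime,
# cuBLAS, and cuDNN libraries. Adjust if your install names them differently.
PATTERNS = ("cudart64_*.dll", "cublas64_*.dll", "cudnn64_*.dll")


def find_cuda_dlls(directories, patterns=PATTERNS):
    """Map each matching DLL name to the list of directories that contain it."""
    found = defaultdict(list)
    seen = set()
    for entry in directories:
        if not entry:
            continue  # skip empty PATH entries
        directory = Path(entry)
        if not directory.is_dir():
            continue
        key = directory.resolve()
        if key in seen:  # PATH often lists the same folder more than once
            continue
        seen.add(key)
        for pattern in patterns:
            for dll in sorted(directory.glob(pattern)):
                found[dll.name.lower()].append(directory)
    return dict(found)


def multiple_copies(found):
    """Keep only the DLL names that turned up in more than one directory."""
    return {name: dirs for name, dirs in found.items() if len(dirs) > 1}


if __name__ == "__main__":
    search = os.environ.get("PATH", "").split(os.pathsep)
    # Assumption: PixInsight's bin folder at its default location.
    search.append(r"C:\Program Files\PixInsight\bin")
    found = find_cuda_dlls(search)
    for name, dirs in sorted(found.items()):
        print(name)
        for directory in dirs:
            print("   ", directory)
    repeated = multiple_copies(found)
    if repeated:
        print("Found in more than one place:", ", ".join(sorted(repeated)))
```

Libraries with different version suffixes in their names show up as separate entries, so the full listing is worth reading even when nothing is reported as repeated.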
 