Procedure to enable GPU acceleration for BXT, StarNet, etc within Linux Mint with a supported nvidia graphics card

twivel

My goals with this process were:
  1. Obviously, enable GPU acceleration for tensorflow
  2. Simplify the process by installing the nvidia libraries, using nvidia documented processes.
  3. Reduce modification of PixInsight install to just editing the LD_LIBRARY_PATH (no need to move/delete libraries in PixInsight/bin/lib)
  4. Use the native package manager where possible and avoid overriding package manager installed drivers with manually installed ones.
  5. Explore whether or not newer versions of tensorflow, cuda and cudnn will work. It turns out, yes!
I am using Linux Mint 21 (based on Ubuntu 22.04, so this should work on Ubuntu as well).
The reported video card is the NVIDIA GeForce GTX 1650.

Process Summary:

  1. Ensure prerequisites are in place (build essentials, kernel headers, etc)
  2. Download and install the latest version of cuda from nvidia, following nvidia instructions to install. (Use the Ubuntu 22.04 local install deb version)
  3. Download and install the 8.9.7 version of cudnn from the nvidia website, following nvidia instructions to install. (Use the Ubuntu 22.04 local install deb version)
  4. Download and extract the latest pre-built version of tensorflow from the nightly builds site.
  5. Add the new tensorflow to the start of the PixInsight.sh LD_LIBRARY_PATH
  6. Start up PixInsight.
Results:
  1. BXT process runs much faster, while spiking GPU utilization to 100%
  2. StarNet2 runs noticeably faster, but only spikes GPU utilization to about 50% (up from around 12% at the time)

Detailed Instructions:
I initially started out by trying to follow some of the processes reported previously. Some of those instructions suggest using "pip" for parts of the install. That is an alternate approach, and it is entirely unnecessary with the procedure I am using. If you see instructions that suggest using pip or python3, you are venturing down your own path.

Ultimately, I tried a few approaches until I found something that worked. This means I may have missed a prerequisite or two in my detailed instructions, but hopefully these work for you.

One thing to keep in mind: If you do something that results in BXT or StarNet going away, you will need to re-install those process modules before they work again. (PROCESS->Modules->Install Modules, Search, Install). Just fixing the missing libraries will not make them come back again.

Install prerequisites. (I hope I didn't miss any that were already on my computer. I already had the kernel headers, for example, so you may need to check that.)
$ sudo apt install build-essential
$ sudo apt install nvidia-driver-550
[If using ubuntu 24.04, see the comment after this by @taurici about adding the prior version of libtinfo. Perform those steps here]
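
Before moving on, it can help to confirm that the prerequisites are really present. The following is only a sketch and not part of the original steps: the helper name check_tools is made up for illustration, and the header check assumes the Ubuntu/Mint package naming of linux-headers-<kernel release>.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named tool is on PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# gcc and make come from build-essential; nvidia-smi comes with the driver.
check_tools gcc make nvidia-smi || echo "Install the missing pieces first."

# Assumption: Ubuntu/Mint name the header package after the running kernel.
if command -v dpkg >/dev/null 2>&1; then
    if dpkg -s "linux-headers-$(uname -r)" >/dev/null 2>&1; then
        echo "kernel headers installed"
    else
        echo "kernel headers not found for $(uname -r)"
    fi
fi
```

If nvidia-smi is reported as missing right after installing the driver, a reboot may be needed before the driver is usable.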

Install the latest version of nvidia-cuda (cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb) by following the instructions at the nvidia site:
Download the local repository version, then follow these steps:
  1. $ sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
  2. $ sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
  3. $ sudo apt-get update
  4. $ sudo apt-get -y install cuda-toolkit-12-4
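
A quick way to confirm the toolkit landed where expected is to ask nvcc for its version. This is a sketch under the assumption that the 12.4 packages install to /usr/local/cuda-12.4; the helper name report_cuda is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show the nvcc version if the toolkit is in the given directory.
report_cuda() {
    cuda_dir=$1
    if [ -x "$cuda_dir/bin/nvcc" ]; then
        "$cuda_dir/bin/nvcc" --version
    else
        echo "nvcc not found under $cuda_dir"
    fi
}

# Assumption: the 12.4 toolkit packages install to /usr/local/cuda-12.4.
report_cuda /usr/local/cuda-12.4
```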

Install the prior version of nvidia-cudnn from the archive (cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb) for cuda 12.x
Downloaded from the archive listed here: https://developer.nvidia.com/cudnn-downloads
I had to use the older version, because the nightly builds of tensorflow do not yet support cudnn 9.x
Steps:
  1. $ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb
  2. $ sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.7.29/cudnn-*-keyring.gpg /usr/share/keyrings/
  3. $ sudo apt-get update
  4. $ sudo apt-get -y install libcudnn8
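
To check that the cudnn library is visible to the dynamic linker, the linker cache can be searched. This is a sketch, assuming the libcudnn8 package registers a library named libcudnn.so.8; the helper name filter_lib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ldconfig -p` style text on stdin and
# print the entries that mention the given library name.
filter_lib() {
    grep -F "$1" || echo "not in linker cache: $1"
}

# Only run the real query when ldconfig is available on PATH.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | filter_lib libcudnn.so.8
fi
```
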
Install the pre-compiled tensorflow with GPU support
I picked one that supports the versions of cuda and cudnn I chose to install. In my case, it was: libtensorflow-gpu-linux-x86_64-2.15.0.tar.gz
Steps:
  1. Downloaded the archive to my Linux home directory
  2. $ mkdir tensorflow
  3. $ cd tensorflow
  4. $ tar -xvf ../libtensorflow-gpu-linux-x86_64-2.15.0.tar.gz
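
After extracting, it is worth checking that the lib folder contains the libraries PixInsight will pick up. The sketch below assumes the archive unpacks a lib folder holding libtensorflow.so.2 and libtensorflow_framework.so.2; the helper name check_tf_dir is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: confirm the extracted archive has the expected libraries.
check_tf_dir() {
    tf_lib=$1/lib
    for name in libtensorflow.so.2 libtensorflow_framework.so.2; do
        if [ -e "$tf_lib/$name" ]; then
            echo "ok: $tf_lib/$name"
        else
            echo "missing: $tf_lib/$name"
        fi
    done
}

# The guide extracts into a "tensorflow" folder in the home directory.
check_tf_dir "$HOME/tensorflow"
```
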
Edit the PixInsight.sh and add the tensorflow/lib folder to the library path.
NOTE: If you ensure that the new tensorflow is at the start of the LD_LIBRARY_PATH, you do not need to move the PixInsight provided libtensorflow files out of its lib folder.
For me, it looks like this:
#!/bin/bash
appname=`basename $0 | sed s,\.sh$,,`
dirname=`dirname $0`
if [ "${dirname:0:1}" != "/" ]; then
dirname=$PWD/$dirname
fi
LD_LIBRARY_PATH=$HOME/tensorflow/lib:$dirname/lib:$dirname
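
The dynamic linker searches the LD_LIBRARY_PATH entries from left to right, and an entry that does not exist is skipped without any warning, so a misspelled folder silently falls back to the PixInsight provided libraries. A small sketch for checking each entry follows; the helper name check_path_dirs is made up, and /opt/PixInsight/bin is an assumed install location.

```shell
#!/bin/sh
# Hypothetical helper: report which entries of a colon-separated path exist.
check_path_dirs() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        if [ -d "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
        fi
    done
    IFS=$old_ifs
}

# Example value mirroring the edited line above.
# Assumption: PixInsight is installed under /opt/PixInsight/bin.
check_path_dirs "$HOME/tensorflow/lib:/opt/PixInsight/bin/lib:/opt/PixInsight/bin"
```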
 
Thanks for the nice guide :)

If someone, like me, is following this guide on a fresh Ubuntu 24.04 install, they will likely get an error while installing CUDA 12.4.
The problem is that this version of CUDA requires the "libtinfo5" package, while Ubuntu 24.04 ships "libtinfo6".

You can install this older version following this link: https://packages.ubuntu.com/jammy-updates/amd64/libtinfo5/download

You will have to edit your sources file like this:

sudo nano /etc/apt/sources.list.d/ubuntu.sources

And add these lines:

Types: deb
URIs: http://cz.archive.ubuntu.com/ubuntu
Suites: jammy-updates
Components: main universe


CTRL+O to save
CTRL+X to exit

After that just type

sudo apt-get update
sudo apt-get install libtinfo5


Now you should be able to proceed with CUDA install ;)
 
Thanks for the guide.

I followed the directions exactly, including the specific versions that you installed. I am running on Kubuntu 24.04LTS, so I also added the ubuntu.sources lines to be able to install libtinfo5.

The install appeared to work fine. I did make a few notes for things that weren't mentioned in the instructions:

After adding the sources line for http://cz.archive.ubuntu.com/ubuntu, I now get the following error message. This did not prevent me from being able to install libtinfo5, but is a bit of a nuisance. I'm not sure where to find the key to clear it up:

N: Missing Signed-By in the sources.list(5) entry for 'http://cz.archive.ubuntu.com/ubuntu'
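
One possible way to clear this notice is to add a Signed-By field to the stanza, pointing at the keyring that Ubuntu's own archive entries use. This is a sketch, assuming the keyring lives at /usr/share/keyrings/ubuntu-archive-keyring.gpg on the system; the helper name jammy_stanza is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the jammy-updates stanza with a Signed-By field added.
jammy_stanza() {
    cat <<'EOF'
Types: deb
URIs: http://cz.archive.ubuntu.com/ubuntu
Suites: jammy-updates
Components: main universe
Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg
EOF
}

# Print the stanza so it can be compared with what is in ubuntu.sources.
jammy_stanza
```

The printed stanza would replace the four lines added earlier in /etc/apt/sources.list.d/ubuntu.sources, followed by another sudo apt-get update.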

Downloading cudnn requires an account with NVidia. I know that this is unrelated to what's written here, but it is annoying. This is especially true, since NVidia has an excessive number of required pieces of information to create the account.

Unfortunately, I am not able to make it work. I believe that the issue is the pre-compiled tensorflow library that I downloaded. The instructions above don't include a link to the download used. I found the download for libtensorflow-gpu-linux-x86_64-2.15.0.tar.gz at https://www.tensorflow.org/install/lang_c. I may have downloaded the wrong thing...

The reason that I think that it's the tensorflow library is because when I start PixInsight from the console using the PixInsight.sh script (with the LD_LIBRARY_PATH updated to include the downloaded library), I get the following error message in the console when I apply BlurXTerminator:

2024-05-27 08:49:08.400882: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library
(oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-27 08:49:08.422282: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled

The BlurXTerminator process then uses the CPU and not the GPU.

I will toss out the caveat that I did alter one aspect of the directions. I have a .pixinsight folder under my home directory that I use for things like swap directories and storage for things that I want to survive a PI uninstall. So my updated PixInsight.sh file contains the following for the library location:
LD_LIBRARY_PATH=$HOME/.pixinsight/tensorflow/lib:$dirname/lib:$dirname

Do you have any suggestions? In particular, is there a way to enable tracing or logging to see what it's trying to do?

Thanks,
-Wade
 
Never mind the above part about not working. It was an incredibly dumb user error. My .pixinsight folder is actually called .PixInsight. Once I corrected the path name, it all works fine.

The time for BlurXTerminator to run on one of my images dropped from about a minute, to just over 25 seconds, and all the processing was on the GPU.
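
On the earlier question about tracing: the glibc loader can print its library search when LD_DEBUG=libs is set, which shows which libtensorflow file actually gets loaded. The following is a sketch only; the helper name tf_trace_lines is made up, and the script path /opt/PixInsight/bin/PixInsight.sh is an assumption about the install location.

```shell
#!/bin/sh
# Hypothetical helper: keep only loader trace lines that mention tensorflow.
tf_trace_lines() {
    grep -i 'tensorflow' || echo "no tensorflow lines in trace"
}

# LD_DEBUG=libs makes the glibc loader write its library search to stderr.
# Assumption: PixInsight is started via the script at this path.
if [ -x /opt/PixInsight/bin/PixInsight.sh ]; then
    LD_DEBUG=libs /opt/PixInsight/bin/PixInsight.sh 2>&1 | tf_trace_lines
fi
```

Lines containing "trying file=" show each location the loader attempts, so a path under the PixInsight lib folder instead of the new tensorflow folder points to a problem with LD_LIBRARY_PATH.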
 
Nice! I'm happy you managed to get it working correctly ;)
 