How To GPU Accelerate StarXTerminator and StarNet2 on Linux

SteveD

Member
If you have an Nvidia card, you can greatly improve the performance of StarXTerminator and StarNet2. These instructions are for Linux.

This is an update to a post by Ajay Narayanan.
https://pixinsight.com/forum/index....cuda-and-libtensorflow-gpu-under-linux.18180/

Most of the credit should go to Ajay. I am just updating his instructions and providing troubleshooting tips.

A summary of the key commands is provided at the end of this post.

System Used to Test Instructions
Ubuntu 22.04 with KDE desktop
Nvidia GTX Titan X (Compute Capability 5.2)
PixInsight 1.8.9-1
StarXTerminator 2.0.5 AI Version 11
StarNet2

Speed improvement is roughly 10x.

Mostly complete list of compatible GPUs
https://developer.nvidia.com/cuda-gpus

apt
If apt install fails at any point, you can try
$ sudo apt --fix-broken install

apt and apt-get are essentially the same in modern versions of Debian-based Linux distributions.

Install CUDA
If you do not already have build-essential
$ sudo apt install build-essential

At https://developer.nvidia.com/cuda-toolkit-archive click the links for your system. For me, this was CUDA Toolkit 11.8.0 -> Linux -> x86_64 -> Ubuntu -> 22.04 -> deb (local)
Follow the instructions presented by Nvidia. There are about 7 commands beginning with wget.

Install libcudnn8
$ sudo apt install libcudnn8

CUDA will be installed under /usr/local

libcudnn.so.8 will be put in a system directory such as /usr/lib/x86_64-linux-gnu/

Reboot
Installing CUDA installs kernel modules, which may not take effect without a reboot.
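If you want to confirm the new modules actually loaded after the reboot, a quick check is the following sketch (it only assumes a Linux /proc filesystem; nvidia-smi ships with the driver packages):

```shell
# Quick post-reboot check: is the Nvidia kernel module actually loaded?
if grep -qs '^nvidia' /proc/modules; then
  echo "nvidia kernel module loaded"
  # nvidia-smi prints driver version, CUDA version, and GPU name
  command -v nvidia-smi >/dev/null && nvidia-smi || echo "nvidia-smi not found"
else
  echo "nvidia kernel module NOT loaded (reboot, or reinstall the driver)"
fi
```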

Install TensorFlow
Download the tar file for Linux GPU support at
https://www.tensorflow.org/install/lang_c

Extract to /usr/local
$ sudo tar -C /usr/local -xzf libtensorflow-gpu-linux-x86_64-2.11.0.tar.gz
$ sudo ldconfig /usr/local/lib

The above is the current tar file as of this writing. It will change.

ldconfig tells your system where to look for the tensorflow libraries.
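To confirm the linker cache actually picked the libraries up, you can list its TensorFlow entries (a quick sketch using ldconfig's -p listing; no root required):

```shell
# Show the TensorFlow entries the dynamic linker cache currently contains.
ldconfig -p | grep libtensorflow \
  || echo "libtensorflow not in linker cache; re-run: sudo ldconfig /usr/local/lib"
```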

I highly recommend creating, compiling, linking, and executing the very short C program described at https://www.tensorflow.org/install/lang_c. It will only take a few minutes, and you do not need to understand the program; just follow TensorFlow's instructions. This verifies that TensorFlow was properly installed. The program should respond with something like: Hello from TensorFlow C library version 2.11.0
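That smoke test can be scripted end to end. A sketch (assuming the tar was extracted to /usr/local as above and build-essential is installed; the temp-file path is my choice):

```shell
# Write, compile, and run TensorFlow's hello-world C smoke test.
cat > /tmp/hello_tf.c <<'EOF'
#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    printf("Hello from TensorFlow C library version %s\n", TF_Version());
    return 0;
}
EOF
if [ -f /usr/local/include/tensorflow/c/c_api.h ]; then
    gcc /tmp/hello_tf.c -I/usr/local/include -L/usr/local/lib \
        -ltensorflow -o /tmp/hello_tf
    /tmp/hello_tf
else
    echo "TensorFlow C headers not found under /usr/local/include"
fi
```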

Remove PixInsight's TensorFlow Libraries
This will ensure that PI picks up the new TensorFlow libraries. Create a directory to save the libraries and move them. Be careful not to move any other files.
$ sudo mkdir /opt/pi_lib_tmp
$ sudo mv /opt/PixInsight/bin/lib/libtensor* /opt/pi_lib_tmp

The above commands moved 6 files (2 TensorFlow libraries and 4 soft links). The updated GPU version of these same 6 files will be in /usr/local/lib

Update ~/.bashrc
Add the following lines:
export TF_FORCE_GPU_ALLOW_GROWTH="true"
export CUDA_VISIBLE_DEVICES="0"

If you have more than 1 GPU, you may want a different CUDA_VISIBLE_DEVICES value. See
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars

Start a new bash shell and verify your .bashrc changes worked with commands such as:
$ env | grep TF_FORCE
$ env | grep CUDA

Verify PixInsight is now using the GPU
Install nvtop, a GPU performance meter, and run it in a command terminal.
$ sudo apt install nvtop
$ nvtop

Run PixInsight from the command line to see helpful messages
$ PixInsight

In PixInsight, bring up an image and try StarXTerminator or StarNet2.

nvtop should show the GPU's compute and memory usage, both graphically and on the text line where type=compute

If PixInsight cannot run star removal, it will output text to the command terminal you started it from. This text will give you a clue, such as missing a library associated with cuda, or missing libcudnn, or finding the CPU version of TensorFlow instead of the GPU version.
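A quick way to chase those clues is to check each required library in the linker cache. A hedged sketch (exact library versions depend on your CUDA/TensorFlow combination; these names fit the TensorFlow 2.11 / CUDA 11.x / cuDNN 8 stack used in this thread):

```shell
# For a TensorFlow 2.11 GPU build, the usual suspects are the TF
# libraries themselves, the CUDA runtime, and cuDNN 8.
for lib in libtensorflow.so libtensorflow_framework.so libcudart.so libcudnn.so.8; do
  if ldconfig -p | grep -q "$lib"; then
    echo "$lib: found"
  else
    echo "$lib: MISSING"
  fi
done
```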

If PixInsight can run star removal with your GPU, it will output a line in the command terminal that includes the name of your GPU, its memory, and compute capability.
For my system, PixInsight outputs
Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10641 MB memory: -> device: 0, name: NVIDIA GeForce GTX TITAN X, pci bus id: 0000:05:00.0, compute capability: 5.2

List of Essential Commands from above
$ sudo apt install build-essential

# Follow Nvidia instructions at https://developer.nvidia.com/cuda-toolkit-archive
$ sudo apt install libcudnn8

# reboot

# download tar file from https://www.tensorflow.org/install/lang_c
$ sudo tar -C /usr/local -xzf libtensorflow-gpu-linux-x86_64-2.11.0.tar.gz
$ sudo ldconfig /usr/local/lib

# move PI's TensorFlow libraries to another directory so PI won't find them
$ sudo mkdir /opt/pi_lib_tmp
$ sudo mv /opt/PixInsight/bin/lib/libtensor* /opt/pi_lib_tmp

# add to .bashrc
export TF_FORCE_GPU_ALLOW_GROWTH="true"
export CUDA_VISIBLE_DEVICES="0"

# start PixInsight and verify GPU is being used

Clear Skies!
Steve
 
Hi Steve,

Thanks a lot for the howto.

I tried to install and configure my system but faced a lot of issues; mostly related to my particular setup and/or drivers version.

I finally opted for an anaconda approach. You need to have, obviously, anaconda (miniconda in my case) installed. Once installed:

- create a dedicated env, if you wish: conda create -n nvidia && conda activate nvidia
- install cudnn from anaconda channel: conda install -c anaconda cudnn
- adjust LD_LIBRARY_PATH within the PixInsight.sh boot script:
...
LD_LIBRARY_PATH=/PATH/WHERE/YOUR/ANACONDA/IS/envs/nvidia/lib:${LD_LIBRARY_PATH}
...

And start PI within this conda env.
Honestly, I'm not sure whether some of the modifications I made at the system level ultimately contributed to getting this to work. This also probably works only for my particular setup; YMMV

m
 
hi again,

quick update. Even easier. All installation done by anaconda:


conda create -n tensorflow-gpu
conda activate tensorflow-gpu
conda install -c conda-forge mamba
mamba install -c conda-forge libtensorflow libtensorflow_cc tensorflow tensorflow-base tensorflow-gpu


cat PixInsight.sh
[...]
# edit accordingly:
LD_LIBRARY_PATH=/home/user/miniconda2/envs/tensorflow-gpu/lib:${LD_LIBRARY_PATH}
[...]


Still testing this installation
 
Here is how I did it

Screenshot from 2023-03-24 10-40-52.png
 
Hello Steve

I have Kubuntu and have followed your instructions to the best of my ability, but if I remove the libtensor files from /opt/PixInsight/bin/lib then StarNet and StarNet2 no longer show in the process list, and Process > Modules does not find them. If I put the files back they reappear, but still no GPU acceleration.

You mentioned above that the updated GPU version of these same 6 files would be in /usr/local/lib, but I seem to have a libtensorflow-gpu-linux-x86_64-2.11.0 directory in /usr/local, not inside /usr/local/lib. I can see 6 files inside libtensorflow-gpu-linux-x86_64-2.11.0/lib which I believe are the required files. Do you think this misplaced directory could be the problem, and if so can you please explain how to move forward from here? I am a complete novice at system-level commands, so any help would be greatly appreciated.

Thanks
Ken
 
I made one attempt at the anaconda way. It did not work for me. I believe the Ubuntu system drivers must also be compatible. For example, nvidia-driver-515 and nvidia-utils-515 or similar are probably required.

I identified the following as compatible as of the last week of April 2023. These are Ubuntu 22.04 packages available in one of the standard Ubuntu repositories unless otherwise noted.

I installed the above without visiting the Nvidia site at all. Before doing so, I removed everything CUDA-related and nvidia-driver-related. Do not blacklist nouveau as some sites recommend when installing Nvidia drivers. Do not use the latest nvidia-driver metapackage, as that requires CUDA 12.x

If you just update libcudnn8 with apt, without specifying a version, you get the CUDA 12.1 build, which is not compatible with the TensorFlow 2.11 library, which uses CUDA 11.x.

I do not have my LD_LIBRARY_PATH set, so Ubuntu searches where it normally searches for libraries. You can view this list of directories using: cat /etc/ld.so.conf.d/* This includes /usr/local/lib/, where I have my TensorFlow libraries.

To Ken: not having the libraries (shared objects) in a searched directory will be a problem. They are owned by root, so you need to use sudo to move them, for example: sudo mv /usr/local/libtensorflow-gpu-linux-x86_64-2.11.0/lib/libtensor* /usr/local/lib/

Good luck!
Steve
 
Steve,

I guess I don't have any good luck. I moved the libtensorflow* files to /usr/local/lib/ and removed the original PI TensorFlow libraries, and again PixInsight no longer shows StarNet or StarXTerminator. I also tried copying the new TensorFlow files to /opt/PixInsight/bin/lib/ with no improvement.

nvidia-smi shows the following: NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1, higher than the 11.x you mentioned.

You also mentioned you removed everything CUDA-related and nvidia-driver-related. I was wondering how you did this; when I tried removing CUDA a few days ago I ended up having to reinstall Kubuntu and PixInsight etc., etc.! I don't think I can survive going through that experience again.

Is there a method to remove and reinstall TensorFlow so I can install it directly into your suggested directory?

Sorry for all the questions. It seems very disappointing that one can purchase StarXTerminator and then have to go through all these steps in order for it to work at a better speed.

Ken
 
Corrected the driver, CUDA, and TensorFlow levels and ran PixInsight from the console, which produced the following:

2023-05-02 17:00:01.688205: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2023-05-02 17:00:01.690364: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2023-05-02 17:00:01.743447: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/PixInsight/bin/lib:/opt/PixInsight/bin

2023-05-02 17:00:01.743472: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.

Skipping registering GPU devices...
2023-05-02 17:00:01.780386: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled

I tried to install a missing libcudnn8 with the following
sudo apt-get install libcudnn8
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package libcudnn8

Was hoping someone might know what any of the above means before I give up!
 
Kubuntu is Ubuntu with a different desktop environment, so any library (*.so) that works for Ubuntu will work for Kubuntu (aside from desktop-environment libraries).

As mentioned above, the libcudnn8 that I installed, which worked for my installation of Kubuntu 22.04, is:
libcudnn8_8.9.0.131-1+cuda11.8_amd64.deb - from https://pkgs.org/download/libcudnn8 for Ubuntu 22.04
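To check which build apt actually installed, you can query dpkg (a sketch; the version string shown is what worked for my Kubuntu 22.04 setup, yours may differ):

```shell
# Show which libcudnn8 build is installed; for TensorFlow 2.11 it should
# be a +cuda11.x build (e.g. 8.9.0.131-1+cuda11.8), not +cuda12.x.
dpkg -l libcudnn8 2>/dev/null | grep '^ii' || echo "libcudnn8 not installed"
```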

Good Luck!
 
Ok, I will give it a try. I think my problems are because I came from the Mac world, where everyone has the latest software levels and everything works. It seems with Linux it's not so coordinated, so I need to overcome my instinct to install the latest levels. I saw an article on the web about installing two different CUDA levels, so I will also try to get CUDA, TensorFlow, and the Nvidia driver to compatible levels.

Thanks
 
Finally got it working. I have not timed the RC-Astro apps, but they are significantly faster. I have no Linux command experience, but for those of us who may benefit from slightly more detailed instructions, the following worked on my computer.

Compatibility requirements for latest tensorflow gpu
https://www.tensorflow.org/install/source
Version tensorflow 2.12.0
Python version 3.8-3.11
Compiler GCC 9.3.1
Build tools Bazel 5.3.0
cuDNN 8.6
CUDA 11.8

My Linux installation is Ubuntu 22.04.2 LTS

Each time Konsole is opened, go to the root directory with cd / (for me the prompt then shows ken@linux:/$)
On my first try some new directories went to the wrong location, so remember to go to the root directory each time Konsole is opened

Install build essential, this will include gcc
sudo apt update
sudo apt install build-essential

Check gcc install
ken@linux:/$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Note: gcc is backward compatible

Check the recommended GPU driver at
https://www.nvidia.com/download/index.aspx; for my card it's version 525.116.04 (release date 2023.5.9)
I continued with my existing driver 525.105.17
Note: drivers are backward compatible

To see your current driver
nvidia-smi
NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0
Do not be concerned if it currently shows CUDA Version: 12.0; nvidia-smi reports the highest CUDA version the driver supports, not the toolkit version installed

To change your current driver, go to System Settings, select Driver Manager, enter your password, and make a selection

Install cuda 11.8
Goto https://developer.nvidia.com/cuda-toolkit-archive
Select Linux, x86_64, Ubuntu, 22.04, deb(local)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.c...u2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
CUDA will be installed under /usr/local
reboot

Install libcudnn8
https://pkgs.org/download/libcudnn8 and look under Ubuntu 22.04
select libcudnn8_8.9.0.131-1+cuda11.8_amd64.deb
Highlight binary package https://developer.download.nvidia.c...6_64/libcudnn8_8.9.0.131-1+cuda11.8_amd64.deb
Right-click and select Open Link
Check for the completed download
Double-click the package in your download directory, wait for the package installer to open, and select Install
sudo apt-get update
libcudnn.so.8 will be installed to /usr/lib/x86_64-linux-gnu

Install Tensorflow
Download the tar file for Linux GPU support at https://www.tensorflow.org/install/lang_c
sudo tar -C /usr/local -xvzf /home/ken/Downloads/libtensorflow-gpu-linux-x86_64-2.11.0.tar.gz (adjust /home/ken/Downloads to your download location)
sudo ldconfig /usr/local/lib

sudo mkdir /opt/pi_lib_tmp
sudo mv /opt/PixInsight/bin/lib/libtensor* /opt/pi_lib_tmp/

These notes are based on SteveD's post. I'm sure they are very basic for some, but they may be useful for non-programmers.
 
Forgot to add last step
sudo nano ~/.bashrc
Go to end of file and enter
export TF_FORCE_GPU_ALLOW_GROWTH="true"
export CUDA_VISIBLE_DEVICES="0"
Then Ctrl-X, then y, then Enter
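For reference, the same .bashrc edit can be done non-interactively and idempotently (a sketch; BASHRC is a variable I introduce here so you can point it at a scratch file before touching your real ~/.bashrc):

```shell
# Append the two exports to ~/.bashrc, but only once.
BASHRC="${BASHRC:-$HOME/.bashrc}"
if ! grep -qs 'TF_FORCE_GPU_ALLOW_GROWTH' "$BASHRC"; then
  printf '%s\n' 'export TF_FORCE_GPU_ALLOW_GROWTH="true"' \
                'export CUDA_VISIBLE_DEVICES="0"' >> "$BASHRC"
fi
grep 'TF_FORCE' "$BASHRC"   # verify the line is present
```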
 
SteveD and Ken, thank you VERY MUCH for providing such great detailed instructions!

I believe I have followed all of your instructions and got exactly the expected output at each step.

Unfortunately, there's no effect. In PixInsight with BlurXTerminator, StarXTerminator, and NoiseXTerminator, the GPU/CUDA doesn't get used. I get massive CPU activity but the GPU utilization and memory stay near zero.

I'm running PI 1.8.9-1 under kubuntu 22.04 on a brand-new AMD 7950x with 128 GB of RAM and fast SSDs. I get pretty decent performance from the RC-Astro "X" utilities using just the CPU (a minute or two to process a big image). But I'd really like to figure out how to get the software to use the GPU and get the speedup you're seeing.

When I start PixInsight, I get the following console messages regarding the hardware:
PixInsight Core 1.8.9-1 Ripley (x64) (build 1556 | 2022-05-18)
Copyright (c) 2003-2022 Pleiades Astrophoto
----------------------------------------------------------------------
Welcome to PixInsight. Started 2023-06-18T15:46:25.000Z

* Parallel processing enabled: Using 32 logical processors.
* CUDA device available: NVIDIA GeForce RTX 3060

However I do NOT get the expected "device created" console message about the GPU (quoted from SteveD above):
Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10641 MB memory: -> device: 0, name: NVIDIA GeForce GTX TITAN X, pci bus id: 0000:05:00.0, compute capability: 5.2

nvidia-smi shows I'm running driver 530.30.02 and CUDA v 12.1:
denning@astro:~$ nvidia-smi
Sun Jun 18 09:29:06 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |

Here are the contents of the relevant directories where tensorflow libraries are supposed to be:
denning@astro:~$ ls -l /opt/PixInsight/bin/lib | grep tensor
denning@astro:~$
denning@astro:~$ ls -l /usr/local/lib | grep tensor
lrwxrwxrwx 1 root root 28 Dec 31 1999 libtensorflow_framework.so -> libtensorflow_framework.so.2
lrwxrwxrwx 1 root root 33 Dec 31 1999 libtensorflow_framework.so.2 -> libtensorflow_framework.so.2.11.0
-r-xr-xr-x 1 root root 46207600 Dec 31 1999 libtensorflow_framework.so.2.11.0
lrwxrwxrwx 1 root root 18 Dec 31 1999 libtensorflow.so -> libtensorflow.so.2
lrwxrwxrwx 1 root root 23 Dec 31 1999 libtensorflow.so.2 -> libtensorflow.so.2.11.0
-r-xr-xr-x 1 root root 923756824 Dec 31 1999 libtensorflow.so.2.11.0
denning@astro:~$

My /usr/local/lib is indeed included in my /etc/ld.so.conf.d* files as shown here:
denning@astro:~$ cat /etc/ld.so.conf.d/* | grep usr/local/lib
/usr/local/lib/i386-linux-gnu
/usr/local/lib/i686-linux-gnu
/usr/local/lib
/usr/local/lib/x86_64-linux-gnu
denning@astro:~$

My bash environment does have the relevant variables set correctly:
denning@astro:~$ env | grep TF_FORCE
TF_FORCE_GPU_ALLOW_GROWTH=true
denning@astro:~$ env | grep CUDA
CUDA_VISIBLE_DEVICES=0
denning@astro:~$

Any hints for helping me diagnose what's wrong?

Thanks again,
Scott
 
Scott, unless it's changed recently, TF is not yet compatible with CUDA 12.x. You need to remove & purge CUDA 12.1 (which nvidia-smi shows you have) and install 11.8. It happens often that just installing CUDA will install the latest version, i.e. 12.x.

I've recently installed & used GPU support on Ubuntu 22.04 LTS for two fresh EC2 instances, the relevant commands for this part are (note: the commands to remove CUDA 12.1 are not included, make sure you do that part first):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.c...u2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt-get -y install cuda


If it still doesn't work, I'll post the full bash history but give the above a try,
Razvan
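Since the removal commands are not included above, it is worth listing what is actually installed before purging anything. A hedged sketch (the purge commands are commented out so you can review the list first; package globs are typical but your names may vary):

```shell
# List installed CUDA / cuDNN / Nvidia driver packages before purging.
dpkg -l 2>/dev/null | grep -Ei 'cuda|cudnn|nvidia-driver' \
  || echo "no CUDA/cuDNN/Nvidia driver packages found"
# Once you have reviewed the list, a typical purge (use with care) is:
#   sudo apt-get --purge remove "cuda*" "libcudnn8*"
#   sudo apt-get autoremove --purge
```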
 
Hello Scott

I followed your screen shots with the following;
PixInsight Core 1.8.9-1 Ripley (x64) (build 1556 | 2022-05-18)
Copyright (c) 2003-2022 Pleiades Astrophoto
----------------------------------------------------------------------
Welcome to PixInsight. Started 2023-06-19T00:19:41.817Z
* Parallel processing enabled: Using 64 logical processors.
* Thread CPU affinity control enabled.
* CUDA device available: Quadro RTX 4000
* Maximum number of simultaneous open files: 8192.
* Using desktop OpenGL graphics acceleration.

nvidia-smi shows;
NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8

ls -l /usr/local/lib | grep tensor shows;
lrwxrwxrwx 1 root root 28 Dec 31 1999 libtensorflow_framework.so -> libtensorflow_framework.so.2
lrwxrwxrwx 1 root root 33 Dec 31 1999 libtensorflow_framework.so.2 -> libtensorflow_framework.so.2.11.0
-r-xr-xr-x 1 root root 46207600 Dec 31 1999 libtensorflow_framework.so.2.11.0
lrwxrwxrwx 1 root root 18 Dec 31 1999 libtensorflow.so -> libtensorflow.so.2
lrwxrwxrwx 1 root root 23 Dec 31 1999 libtensorflow.so.2 -> libtensorflow.so.2.11.0
-r-xr-xr-x 1 root root 923756824 Dec 31 1999 libtensorflow.so.2.11.0

cat /etc/ld.so.conf.d/* | grep usr/local/lib shows;
/usr/local/lib/i386-linux-gnu
/usr/local/lib/i686-linux-gnu
/usr/local/lib
/usr/local/lib/x86_64-linux-gnu

I only added 2 lines to bash;
export TF_FORCE_GPU_ALLOW_GROWTH="true"
export CUDA_VISIBLE_DEVICES="0"

gcc --version shows;
gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.

Just in case Razvan comments don't work.
It is much faster with the GPU acceleration so hope you find a solution.
Ken
 
Razvan,

Fantastic -- sure enough I still had a bunch of residual CUDA 12.1 stuff that had to be purged before this would work.

All's well now and I am getting much faster performance with the RC-astro X-utilities!

Thank you VERY MUCH!

Scott
 
Great to hear! And just a minor correction to my own post, I meant to write that cuDNN (not TF) was not compatible with CUDA 12.x.
 