document version 0.0.4, vep2, 2024-12-07

GPU System Setup

0) OS Installation

Update

whatever floats your ⛵

Upgrade

whatever floats your ⛵

Back up `/etc`, change default shell

# cd /etc

# git config --global user.email root@vertex
# git config --global user.name root
# git init .
# git add .
# git commit -m "[+] init"

# vim /etc/default/useradd # change default shell to bash
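
the git steps above can be wrapped in a small function. This is only a sketch: `snapshot_dir` is a made-up name, and the identity is passed per command with `-c` so that no global git config is touched.

```shell
# Hypothetical helper: put a directory under git and record a first commit.
# The identity (root / root@vertex, as in the notes) is passed with -c,
# so no global git configuration is needed.
snapshot_dir() {
    dir=${1:-/etc}
    git -C "$dir" init --quiet . || return 1
    git -C "$dir" add --all . || return 1
    git -C "$dir" -c user.name=root -c user.email=root@vertex \
        commit --quiet -m "[+] init"
}
```

run it as root with `snapshot_dir /etc`. For the default shell, `useradd -D -s /bin/bash` should update `/etc/default/useradd` without opening an editor.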

Add scratch disk

if available:

# ln -s /scratch /srv/data
# mkdir /srv/data/extended-local-storage
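
the "if available" condition can be made explicit. A minimal sketch, assuming the scratch disk is mounted at `/scratch`; `link_scratch` is a hypothetical name:

```shell
# Hypothetical helper: link a scratch disk into place only if it exists.
link_scratch() {
    src=${1:-/scratch}
    dst=${2:-/srv/data}
    if [ ! -d "$src" ]; then
        echo "no scratch disk at $src, skipping"
        return 0
    fi
    # Create the symlink once; leave an existing path untouched.
    if [ ! -e "$dst" ]; then
        ln -s "$src" "$dst" || return 1
    fi
    mkdir -p "$dst/extended-local-storage"
}
```

called without arguments (as root) it uses the paths from the notes; it can be re-run safely because the symlink is only created when `/srv/data` does not exist yet.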

Additional packages

#apt install restic # maybe get it from GitHub (?)
apt install vim tree ncdu htop btop nvitop nvtop

1) LDAP

follow the instructions given here:

https://ubuntu.com/server/docs/how-to/sssd/with-ldap/

then read the document sssd-ldap-setup.md:

working BFH /etc/sssd/sssd.conf file: ...
working BFH /etc/ldap.conf file: ...
working BFH /etc/pam.d/sshd file: ...

2) CUDA

General

ubuntu-drivers devices to get an overview of installed/supported GPU devices

ubuntu-drivers --gpgpu list

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu

https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

Detailed

adapt the links below (the ubuntu2404 path component and target_version=...) acc. to your Ubuntu release

https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/
(for Ubuntu 24.04)
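
the release part of the URL can be derived from the `VERSION_ID` field in `/etc/os-release`. A sketch only; both function names are made up, and x86_64 is hard-coded as in the link above:

```shell
# Hypothetical helper: turn an Ubuntu VERSION_ID such as "24.04" into the
# directory name used under .../compute/cuda/repos/, e.g. "ubuntu2404".
cuda_repo_tag() {
    version_id=$1
    printf 'ubuntu%s\n' "$(printf '%s' "$version_id" | tr -d '.')"
}

# Build the repository URL for a given VERSION_ID (x86_64 only in this sketch).
cuda_repo_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/\n' \
        "$(cuda_repo_tag "$1")"
}
```

usage on the target machine: `. /etc/os-release; cuda_repo_url "$VERSION_ID"`. Check in a browser that the printed directory exists before relying on it.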

follow these steps for the straightforward approach

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network
(read through this)

add some tools

apt install nvitop && apt install nvtop

after installation, reboot the system

check w/ nvcc --version (might not work, as this has not been added to your $PATH),
nvidia-smi, nvitop, nvtop, cat /proc/driver/nvidia/version
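
to see at a glance which of these tools are actually on `$PATH`, something like the following may help. `check_tools` is a hypothetical helper, not part of any NVIDIA package:

```shell
# Hypothetical helper: report which of the given commands are on $PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok       $tool"
        else
            echo "missing  $tool"
            missing=$((missing + 1))
        fi
    done
    # Non-zero exit status if anything was missing.
    [ "$missing" -eq 0 ]
}
```

usage: `check_tools nvcc nvidia-smi nvitop nvtop`. A "missing nvcc" line usually points to the `$PATH` issue mentioned above rather than to a failed installation.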

if the above does not work (!)

1) intro: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu

1 a) https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#prepare-ubuntu

remove old keyring: sudo apt-key del 7fa2af80

wget https://developer.download.nvidia.com/compute/cuda/repos/<release>/x86_64/cuda-keyring_1.1-1_all.deb # placeholder <release> means `ubuntu2x04`
sudo dpkg -i cuda-keyring_1.1-1_all.deb
rm cuda-keyring_1.1-1_all.deb
sudo apt update

2) https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-ubuntu

    wget https://developer.download.nvidia.com/compute/cuda/repos/<release>/x86_64/cuda-<release>.pin # placeholder <release> means `ubuntu2x04`
    sudo mv cuda-<release>.pin /etc/apt/preferences.d/cuda-repository-pin-600

4) driver: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#driver-installation

4 a) https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#additional-package-manager-capabilities

see Section 3.12.2, Meta Packages, for the newest driver branch; this value replaces the placeholder <driver_branch>

sudo apt-get install cuda-drivers-<driver_branch>, e.g., sudo apt-get install cuda-drivers-555
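
once the CUDA repository is configured, the available branches can also be read from apt instead of the documentation. A sketch, assuming the meta packages are named `cuda-drivers-<N>` as in the example above; `latest_driver_branch` is a made-up name:

```shell
# Hypothetical helper: read package names on stdin (one per line, optionally
# followed by a description) and print the highest cuda-drivers-<N> branch.
latest_driver_branch() {
    sed -n 's/^cuda-drivers-\([0-9][0-9]*\).*$/\1/p' | sort -n | tail -n 1
}
```

usage: `apt-cache search --names-only '^cuda-drivers-[0-9]+$' | latest_driver_branch`. The newest branch is not necessarily the right one for older GPUs, so compare the result with the NVIDIA documentation before installing.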

5) https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions
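
the post-installation actions include adding the toolkit's bin directory to `$PATH`. A minimal sketch, assuming the toolkit is reachable at `/usr/local/cuda/bin` (adjust if your installation uses a versioned directory); `add_cuda_to_path` is a hypothetical name:

```shell
# Hypothetical helper: prepend a CUDA bin directory to PATH if it exists and
# is not already listed. /usr/local/cuda/bin is an assumed default location.
add_cuda_to_path() {
    cuda_bin=${1:-/usr/local/cuda/bin}
    [ -d "$cuda_bin" ] || return 1
    case ":$PATH:" in
        *":$cuda_bin:"*) ;;                 # already present, nothing to do
        *) PATH="$cuda_bin:$PATH"; export PATH ;;
    esac
}
```

put the function and a call to it into a shell profile (e.g. `~/.bashrc`) to make the change permanent; it returns non-zero when the directory does not exist, so a missing toolkit is easy to spot.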

6) check w/ nvcc --version (might not work, as this has not been added to your $PATH),
nvidia-smi, nvitop, cat /proc/driver/nvidia/version

for (un)-supported platforms (?)

1) Check GPU/CUDA capabilities

https://docs.nvidia.com/dgx/dgx-os-5-user-guide/installing_on_ubuntu.html#installing-the-dgx-software-stack

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=deb_network

3) `docker`

NOTE: do not apt install docker.io. To fully profit from the GPU hardware, do not rely on the stock docker service; follow the guidelines for NVIDIA's docker runtime:

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
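
after following that guide, it is worth checking that docker actually knows about the NVIDIA runtime. A sketch that inspects the output of `docker info`; `has_nvidia_runtime` is a made-up helper and assumes the runtime shows up under the name `nvidia`:

```shell
# Hypothetical helper: read `docker info` output on stdin and succeed if the
# "Runtimes:" line lists an nvidia runtime.
has_nvidia_runtime() {
    grep -i 'runtimes:' | grep -qw 'nvidia'
}
```

usage: `docker info | has_nvidia_runtime && docker run --rm --gpus all ubuntu nvidia-smi`. If the second command prints the usual nvidia-smi table from inside the container, GPU access works.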

# su sysadmin
$ cd ~ # change to /home/sysadmin
$ mkdir -p workspaces/docker
$ cd ~/workspaces/docker

$ scp -r sysadmin@peak.ti.bfh.ch:workspaces/docker/pyenv . # creates ./pyenv
$ cd pyenv
# check `Dockerfile`
# docker build . -t local:pyenv

Others

# pull a docker image for `gpu_burn`

Test docker `pyenv`

pyenv install 3.13.1
pyenv global 3.13.1