This snap includes NVIDIA DCGM and DCGM-Exporter to manage and monitor NVIDIA GPUs via the CLI or via Prometheus metrics.
Grafana dashboards can then be used to visualize the exported metrics; see, for example:
https://grafana.com/grafana/dashboards/12239-nvidia-dcgm-exporter-dashboard/
The snap includes the following components:
Please see the links at the bottom of the page for more details about the included components and their purpose.
How-To
How to install the snap:
sudo snap install dcgm
How to enable metrics collection:
# Start the DCGM-Exporter service (disabled by default)
sudo snap start dcgm.dcgm-exporter
# Get the metrics
curl -s localhost:9400/metrics
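The exporter returns every metric in Prometheus text format, so it can help to narrow the output to a single metric. The following is a minimal sketch, not part of the snap: `metric_samples` is a hypothetical helper name, and `DCGM_FI_DEV_GPU_TEMP` is assumed to be among the exported counters.

```shell
# Print the sample lines for one metric from Prometheus text read on stdin.
# "# HELP" and "# TYPE" comment lines are skipped because they start with "#".
# metric_samples is a hypothetical helper, not a command shipped by the snap.
metric_samples() {
  # $1: metric name, for example DCGM_FI_DEV_GPU_TEMP
  grep -E "^$1[{ ]"
}

# Usage (requires the DCGM-Exporter service to be running):
#   curl -s localhost:9400/metrics | metric_samples DCGM_FI_DEV_GPU_TEMP
```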
How to configure the snap services:
The NV-Hostengine and DCGM-Exporter services can be configured via the snap CLI.
For example:
# Get all the configuration options
sudo snap get dcgm
# Set the NV-Hostengine port
sudo snap set dcgm nv-hostengine-port=5577
# Restart the NV-Hostengine service to apply the changes
sudo snap restart dcgm.nv-hostengine
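The same mechanism can point DCGM-Exporter at a custom metrics file through the dcgm-exporter-metrics-file option listed in the Reference section. The sketch below writes a small example file; `write_custom_counters` and the file name `custom-counters.csv` are hypothetical, the two fields are only an illustration, and restarting dcgm.dcgm-exporter to apply the change is an assumption made by analogy with the NV-Hostengine example.

```shell
# Write a minimal custom counters file to the path given as $1.
# Each data line follows the DCGM-Exporter CSV layout:
#   DCGM field, Prometheus metric type, help message
# write_custom_counters is a hypothetical helper, not part of the snap.
write_custom_counters() {
  cat > "$1" <<'EOF'
# Lines starting with '#' are comments
DCGM_FI_DEV_GPU_TEMP, gauge, GPU temperature (in C).
DCGM_FI_DEV_POWER_USAGE, gauge, Power draw (in W).
EOF
}

# Usage (assumed workflow; the option takes a file name, not a full path):
#   write_custom_counters /tmp/custom-counters.csv
#   sudo cp /tmp/custom-counters.csv /var/snap/dcgm/common/
#   sudo snap set dcgm dcgm-exporter-metrics-file=custom-counters.csv
#   sudo snap restart dcgm.dcgm-exporter
```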
Reference
Available configuration options:
nv-hostengine-port: the port on which the NV-Hostengine listens. The default is 5555.
dcgm-exporter-address: the address DCGM-Exporter binds to. The default is :9400.
dcgm-exporter-metrics-file: the name of a custom CSV metrics file to be loaded by the exporter. The path is assumed to be /var/snap/dcgm/common/. The default metrics are located in /snap/dcgm/current/etc/dcgm-exporter/default-counters.csv.
Please refer to the DCGM-Exporter repository link at the bottom of the page for more information on the CSV file format.
Cryptography
During the snap build process, snapcraft downloads the CUDA keyring deb package using curl over HTTPS and verifies its integrity using SHA256 checksums.
The CUDA keyring deb package is then used to set up the appropriate source for the DCGM deb package, whose signature is verified using the keyring.
For more information, see the CUDA keyring repository link and curl documentation at the bottom of the page.
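The integrity check described above can be illustrated with a short sketch. This is not the snap's actual build recipe: `verify_sha256` is a hypothetical helper, and both the file name and the expected checksum are placeholders to be supplied by the reader.

```shell
# Succeed only if the SHA256 digest of a file matches an expected value.
# verify_sha256 is a hypothetical helper; the real build steps may differ.
verify_sha256() {
  # $1: path to the downloaded file
  # $2: expected SHA256 digest as a lowercase hex string
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Usage (placeholders, not real values):
#   curl -fsSLO "<keyring-deb-url>"
#   verify_sha256 "<keyring-deb-file>" "<expected-sha256>" || echo "checksum mismatch" >&2
```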
Links
Upstream DCGM-Exporter repository
https://github.com/NVIDIA/dcgm-exporter
Upstream DCGM repository
https://github.com/NVIDIA/DCGM
DCGM Documentation
https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/index.html
Available NVIDIA GPU metrics
https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-api/dcgm-api-field-ids.html
Repository for the CUDA keyring and DCGM deb package
https://developer.download.nvidia.com/compute/cuda/repos/
curl Documentation
https://curl.se/docs/manpage.html
Snaps are applications packaged with all their dependencies to run on all popular Linux distributions from a single build. They update automatically and roll back gracefully.
Snaps are discoverable and installable from the Snap Store, an app store with an audience of millions.
Snap can be installed from the command line on openSUSE Leap 15.x and Tumbleweed.
You first need to add the snappy repository from the terminal. Leap 15.5 users, for example, can do this with the following command:
sudo zypper addrepo --refresh https://download.opensuse.org/repositories/system:/snappy/openSUSE_Leap_15.5 snappy
Swap out openSUSE_Leap_15.5 for openSUSE_Leap_15.4 or openSUSE_Tumbleweed if you’re using a different version of openSUSE.
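The substitution can be sketched as a small helper that builds the repository URL from the release name. `snappy_repo_url` is a hypothetical function introduced only for illustration; the URL pattern is taken from the command above.

```shell
# Build the snappy repository URL for a given openSUSE release directory.
# snappy_repo_url is a hypothetical helper, not part of zypper or snapd.
snappy_repo_url() {
  # $1: release name, e.g. openSUSE_Leap_15.5 or openSUSE_Tumbleweed
  printf 'https://download.opensuse.org/repositories/system:/snappy/%s\n' "$1"
}

# Usage:
#   sudo zypper addrepo --refresh "$(snappy_repo_url openSUSE_Tumbleweed)" snappy
```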
With the repository added, import its GPG key:
sudo zypper --gpg-auto-import-keys refresh
Finally, upgrade the package cache to include the new snappy repository:
sudo zypper dup --from snappy
Snap can now be installed with the following:
sudo zypper install snapd
You then need to either reboot, log out and back in, or source /etc/profile to have /snap/bin added to PATH.
Additionally, enable and start both the snapd and the snapd.apparmor services with the following commands:
sudo systemctl enable --now snapd
sudo systemctl enable --now snapd.apparmor
To install NVIDIA DCGM, simply use the following command:
sudo snap install dcgm