Installation
GPUFlight supports both NVIDIA CUDA and AMD ROCm backends. You need the C++ library for integration into your GPU application, and optionally the Python library for analysis and visualization.
C++ Library (Integration)
The recommended way to integrate gpufl into your C++ project is via CMake's FetchContent.
NVIDIA Prerequisites
- CMake 3.31 or higher
- CUDA Toolkit 13.x or later (including CUPTI)
- A C++17 compatible compiler
AMD Prerequisites
- CMake 3.31 or higher
- ROCm 6.x with HIP runtime
- ROCm SMI library
- rocprofiler-sdk
- A C++17 compatible compiler
CMake Integration
Add the following to your CMakeLists.txt:
include(FetchContent)
FetchContent_Declare(
  gpufl
  GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
  GIT_TAG        main
)
FetchContent_MakeAvailable(gpufl)
For NVIDIA targets:
target_link_libraries(my_app PRIVATE gpufl::gpufl CUDA::cudart)
For AMD/HIP targets:
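# Enable the AMD backend. Place these two set() calls before
# FetchContent_MakeAvailable(gpufl) so gpufl is configured with them.
set(GPUFL_ENABLE_AMD ON CACHE BOOL "" FORCE)
set(GPUFL_ENABLE_NVIDIA OFF CACHE BOOL "" FORCE)
# After FetchContent_MakeAvailable(gpufl):
target_link_libraries(my_app PRIVATE gpufl::gpufl hip::host)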
Build Options
| Option | Default | Description |
|---|---|---|
| GPUFL_ENABLE_NVIDIA | ON | Enable NVIDIA backends (CUDA + NVML) |
| GPUFL_ENABLE_AMD | OFF | Enable AMD backends (ROCm + HIP) |
| BUILD_TESTING | ON | Build test suite |
| BUILD_PYTHON | OFF | Build Python bindings |
Python Library (Analysis)
The Python library provides tools for analyzing, reporting, and visualizing the logs generated by the C++ library.
Basic Installation
pip install gpufl
The Linux wheels published for 0.1.0, 0.1.1, and 0.1.2 shipped
without NVML linked in. As a result, GPU telemetry (the devices
array in job_start) silently came up empty even on a working CUDA
system. Use v0.1.3 or later. A plain pip install gpufl (with
no version pin) resolves to the latest release, which is what you want.
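To confirm that an existing environment is not still on one of the affected releases, a quick check along the following lines may help. It is only a sketch: it uses the standard library's importlib.metadata rather than any gpufl API, the helper names (parse_release, has_nvml_fix, check_installed) are made up for illustration, and the version parsing is deliberately simplistic (it compares only the leading numeric components and ignores pre-release suffixes).

```python
from importlib import metadata

# First release whose Linux wheels link NVML, per the note above.
MIN_SAFE_VERSION = (0, 1, 3)


def parse_release(version: str) -> tuple:
    """Turn '0.1.3' or '0.1.3.post1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def has_nvml_fix(version: str) -> bool:
    """True if the version is 0.1.3 or newer."""
    return parse_release(version) >= MIN_SAFE_VERSION


def check_installed(dist_name: str = "gpufl") -> str:
    """Report whether the installed distribution is new enough."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return f"{dist_name} is not installed"
    if has_nvml_fix(installed):
        return f"{dist_name} {installed} is 0.1.3 or newer"
    return f"{dist_name} {installed} predates 0.1.3; upgrade with: pip install -U {dist_name}"


if __name__ == "__main__":
    print(check_installed())
```

If the check reports an older version, upgrading with pip install -U gpufl should pick up a fixed wheel.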
Full Installation (with Analyzer)
pip install "gpufl[numba,analyzer]"
The analyzer extra pulls in pandas + rich for the
GpuFlightSession dashboard and report generation.
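After installing with the analyzer extra, it can be useful to verify that its dependencies actually landed in the active environment. The sketch below assumes only what is stated above (that the extra pulls in pandas and rich) and checks for them with the standard library's importlib.util.find_spec; the helper name missing_modules is hypothetical and not part of gpufl.

```python
from importlib.util import find_spec

# Top-level modules the analyzer extra is described as pulling in.
ANALYZER_MODULES = ("pandas", "rich")


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ANALYZER_MODULES)
    if missing:
        print("Missing analyzer dependencies:", ", ".join(missing))
        print('Try: pip install "gpufl[analyzer]"')
    else:
        print("Analyzer dependencies are available")
```

This only confirms that the modules are importable; it does not check their versions.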
A viz extra exists (for matplotlib timeline plots), but the
underlying gpufl.viz module is broken in 0.1.x: it never
learned to decode the columnar batch wire format and produces an
empty chart. The rewrite lands in v1.0.0. Until then, leave
viz out of your install line and use the analyzer, which works
correctly, for visualization.
The Python library works with logs from both NVIDIA and AMD sessions — no backend-specific installation is needed for analysis.