Installation

GPUFlight supports both NVIDIA CUDA and AMD ROCm backends. You need the C++ library for integration into your GPU application, and optionally the Python library for analysis and visualization.

C++ Library (Integration)

The recommended way to integrate gpufl into your C++ project is via CMake's FetchContent.

NVIDIA Prerequisites

  • CMake 3.31 or higher
  • CUDA Toolkit 13.x or later (including CUPTI)
  • A C++17 compatible compiler

AMD Prerequisites

  • CMake 3.31 or higher
  • ROCm 6.x with HIP runtime
  • ROCm SMI library
  • rocprofiler-sdk
  • A C++17 compatible compiler

CMake Integration

Add the following to your CMakeLists.txt:

include(FetchContent)

FetchContent_Declare(
  gpufl
  GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
  # main tracks the latest commit; pin a release tag or commit hash for reproducible builds
  GIT_TAG        main
)

FetchContent_MakeAvailable(gpufl)

For NVIDIA targets:

target_link_libraries(my_app PRIVATE gpufl::gpufl CUDA::cudart)

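For AMD/HIP targets, set the backend options before the FetchContent_MakeAvailable(gpufl) call so that they are in the cache when gpufl is first configured:

# Enable the AMD backend and disable the NVIDIA backend
set(GPUFL_ENABLE_AMD ON CACHE BOOL "" FORCE)
set(GPUFL_ENABLE_NVIDIA OFF CACHE BOOL "" FORCE)

Then link your target:

target_link_libraries(my_app PRIVATE gpufl::gpufl hip::host)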

Build Options

Option               Default  Description
GPUFL_ENABLE_NVIDIA  ON       Enable NVIDIA backends (CUDA + NVML)
GPUFL_ENABLE_AMD     OFF      Enable AMD backends (ROCm + HIP)
BUILD_TESTING        ON       Build test suite
BUILD_PYTHON         OFF      Build Python bindings

Python Library (Analysis)

The Python library provides tools for analyzing, reporting, and visualizing the logs generated by the C++ library.

Basic Installation

pip install gpufl

Linux NVML in v0.1.0 – v0.1.2

The Linux wheels published for 0.1.0, 0.1.1, and 0.1.2 shipped without NVML linked in. As a result, GPU telemetry (the devices array in job_start) silently came up empty, even on a working CUDA system. Use v0.1.3 or later. A plain pip install gpufl with no version pin resolves to the latest release, which is what you want.
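To confirm that an existing environment is not on one of the affected releases, a small standard-library check can help. The following is a minimal sketch, assuming the installed distribution is named gpufl as in the pip command above. The helper names (parse_release, is_affected, check_installed) are hypothetical and are not part of the gpufl API.

```python
from importlib.metadata import PackageNotFoundError, version

# Releases whose Linux wheels shipped without NVML, per the note above.
AFFECTED = {(0, 1, 0), (0, 1, 1), (0, 1, 2)}


def parse_release(text):
    """Turn '0.1.3' (or '0.1.3.post1') into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(digits))
    return tuple(parts)


def is_affected(text):
    """True if the version string names one of the affected releases."""
    return parse_release(text)[:3] in AFFECTED


def check_installed(dist="gpufl"):
    """Return a one-line status message for the installed distribution."""
    try:
        installed = version(dist)
    except PackageNotFoundError:
        return f"{dist} is not installed"
    if is_affected(installed):
        return f"{dist} {installed} lacks NVML on Linux; upgrade to 0.1.3 or later"
    return f"{dist} {installed} is not one of the affected releases"


if __name__ == "__main__":
    print(check_installed())
```

If the check reports an affected release, pip install --upgrade gpufl moves the environment to the latest version.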

Full Installation (with Analyzer)

pip install "gpufl[numba,analyzer]"

The analyzer extra pulls in pandas + rich for the GpuFlightSession dashboard and report generation.
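To verify that the analyzer dependencies are actually present after installation, one option is to look them up without importing them. This is a minimal sketch that relies only on the statement above that the analyzer extra pulls in pandas and rich. The missing_modules helper is hypothetical and is not part of gpufl.

```python
from importlib.util import find_spec

# Top-level modules the analyzer extra is described as pulling in.
ANALYZER_MODULES = ("pandas", "rich")


def missing_modules(names=ANALYZER_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
        print('Try: pip install "gpufl[numba,analyzer]"')
    else:
        print("pandas and rich are available")
```

Using find_spec avoids the cost of importing pandas just to check that it is installed.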

viz extra

A viz extra exists (for matplotlib timeline plots), but the underlying gpufl.viz module is broken in 0.1.x: it never learned to decode the columnar batch wire format and produces an empty chart. The rewrite lands in v1.0.0. Until then, leave viz out of your install line and use the analyzer, which works correctly, to inspect your sessions.

The Python library works with logs from both NVIDIA and AMD sessions, so no backend-specific installation is needed for analysis.