
The CUDA Compatibility Matrix Is a Build Problem, and Conan + CMake Can Solve It

Source: isocpp

Every year, the ISO C++ Foundation survey comes back with the same result: dependency management is the number one frustration for C++ developers. Not templates, not compile times, not the preprocessor. Dependency management. And once CUDA enters the picture, a second compatibility dimension sits on top of the first: nvcc wraps a host compiler and accepts only certain versions of it, GPU architectures must be targeted explicitly, and each runtime version has a strict minimum driver requirement that varies by OS.

A talk from using std::cpp 2026 takes a direct run at this problem: one source checkout, one command, identical builds everywhere. The solution is Conan and CMake doing more work than most projects ask of them. The interesting part isn’t the demo. It’s the specific CUDA compatibility data that Conan needs to encode, and the exact points in the CMake integration where things break without discipline.

Why CUDA Makes Dependency Management Hard

NVIDIA’s CUDA compatibility model has two independent version numbers that matter at different times. There’s the CUDA Toolkit version, which determines which headers you compile against, which PTX and cubin targets you can emit, and which SM architectures are supported. Then there’s the driver version, which determines the maximum CUDA runtime version the host can actually run. These two versions don’t move together.

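A binary compiled against CUDA 12.4 requires a driver of at least 550.54.14 on Linux. CUDA 12.0 needs 525.60.13. These aren’t recommendations. They’re minimums enforced at runtime. (CUDA’s minor-version compatibility can let a 12.x binary start on an older 12.x-era driver, but with a reduced feature set, so the toolkit’s listed driver is the safe floor.) If a developer builds against CUDA 12.4 on their workstation with a fresh driver and ships that binary into a CI environment running an older driver, the job will fail at runtime, not at compile time, typically with an error such as “CUDA driver version is insufficient for CUDA runtime version”.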
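
The driver floor is easy to check mechanically before a job starts. The following is a minimal sketch, not an official tool: the table holds only the two rows quoted above, and the names MIN_LINUX_DRIVER, parse_version, and driver_is_sufficient are hypothetical.

```python
# Minimum Linux driver per CUDA toolkit release. Only the two rows quoted in
# the text are listed; a real table would be filled in from NVIDIA's notes.
MIN_LINUX_DRIVER = {
    "12.0": "525.60.13",
    "12.4": "550.54.14",
}


def parse_version(text):
    """Turn '550.54.14' into (550, 54, 14) so tuples compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_is_sufficient(cuda_version, installed_driver):
    """Return True if installed_driver meets the minimum for cuda_version."""
    required = MIN_LINUX_DRIVER[cuda_version]  # KeyError for unlisted versions
    return parse_version(installed_driver) >= parse_version(required)
```

With this table, a 535-series driver passes the check for CUDA 12.0 and fails it for CUDA 12.4, which is exactly the workstation-versus-CI mismatch described above.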

The CUDA Toolkit release notes document this full matrix. It covers not just driver minimums but also which SM architectures were introduced in each CUDA version. SM 9.0 (Hopper) can be targeted starting with CUDA 11.8. SM 8.6 (Ampere GA10x) needs CUDA 11.1 minimum. If you set CMAKE_CUDA_ARCHITECTURES too ambitiously for the CUDA version you’re actually using, the build either silently skips an architecture or fails outright.
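
The same idea applies to architectures: a requested list can be filtered against the toolkit before it ever reaches CMake. This is a rough sketch under stated assumptions. The SM 8.6 row comes from the text, the SM 7.5 and 8.0 rows are added here for illustration, and FIRST_TOOLKIT_FOR_SM and supported_architectures are hypothetical names.

```python
# Toolkit version in which nvcc first accepted each SM target. The 8.6 row is
# from the text; the 7.5 and 8.0 rows are added here for illustration.
FIRST_TOOLKIT_FOR_SM = {
    "75": (10, 0),
    "80": (11, 0),
    "86": (11, 1),
}


def supported_architectures(toolkit_version, requested):
    """Split requested SM targets into (supported, unsupported) lists."""
    toolkit = tuple(int(p) for p in toolkit_version.split("."))
    supported, unsupported = [], []
    for sm in requested:
        first = FIRST_TOOLKIT_FOR_SM.get(sm)
        if first is not None and toolkit >= first:
            supported.append(sm)
        else:
            unsupported.append(sm)  # unknown targets are treated as unsupported
    return supported, unsupported
```

Reporting the unsupported list explicitly is the point: it turns a silent skip into a visible decision.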

What Conan 2.x Can Model (and What It Can’t)

Conan 2.x, a near-complete rewrite released in 2023, made CMakeDeps and CMakeToolchain the standard CMake integration, replacing the older monolithic cmake generator. The package model is better suited to cross-platform work than Conan 1.x was, but CUDA sits at the edge of what Conan can natively represent.

Conan’s settings.yml knows about operating systems, architectures, compilers, and build types. It does not have a built-in cuda_version field. That matters because Conan uses the settings tree to compute a package ID, the key that determines whether a prebuilt binary is a cache hit. Two builds compiled against CUDA 11.8 and CUDA 12.0 would get the same package ID unless you explicitly add cuda_version to settings.yml and declare it in your conanfile.py.
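
To see why an undeclared setting leads to false cache hits, here is a toy model of the idea. It is not Conan’s real package ID algorithm, only a sketch of the principle that the ID is derived from the settings a recipe declares; toy_package_id is a hypothetical name.

```python
import hashlib


def toy_package_id(settings, declared):
    """Hash only the settings a recipe declares (NOT Conan's real algorithm)."""
    visible = sorted((k, v) for k, v in settings.items() if k in declared)
    text = ";".join(f"{k}={v}" for k, v in visible)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


base = {"os": "Linux", "arch": "x86_64", "compiler": "gcc", "build_type": "Release"}
build_118 = {**base, "cuda_version": "11.8"}
build_120 = {**base, "cuda_version": "12.0"}

without_cuda = ("os", "arch", "compiler", "build_type")
with_cuda = without_cuda + ("cuda_version",)

# Undeclared setting: both builds collapse onto one ID (a false cache hit).
same = toy_package_id(build_118, without_cuda) == toy_package_id(build_120, without_cuda)
# Declared setting: the IDs diverge, so each toolkit gets its own binary.
differ = toy_package_id(build_118, with_cuda) != toy_package_id(build_120, with_cuda)
```

Both same and differ come out true in this model, which mirrors the behavior described above: the toolkit version only separates binaries once it is part of the declared settings.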

The fix is to extend the settings file and add the field:

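# ~/.conan2/settings.yml (addition; Conan 2 can also take this from a separate settings_user.yml)
cuda_version:
    - null    # lets recipes and profiles leave the setting undefined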
    - "11.8"
    - "12.0"
    - "12.1"
    - "12.2"
    - "12.3"
    - "12.4"

Then in conanfile.py:

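import os

from conan import ConanFile
from conan.tools.cmake import CMakeDeps, CMakeToolchain, cmake_layout
from conan.tools.env import VirtualBuildEnv

class MLProject(ConanFile):
    # cuda_version is the custom setting added to settings.yml above
    settings = "os", "arch", "compiler", "build_type", "cuda_version"

    def layout(self):
        # Places generated files under build/<build_type>/generators
        cmake_layout(self)

    def generate(self):
        # Profile [buildenv] values are not exported into os.environ here,
        # so read Conan's build environment first, then fall back to the shell.
        env = VirtualBuildEnv(self).vars()
        cuda_path = (env.get("CUDA_PATH") or env.get("CUDA_HOME")
                     or os.environ.get("CUDA_PATH") or os.environ.get("CUDA_HOME"))
        tc = CMakeToolchain(self)
        if cuda_path:
            tc.variables["CMAKE_CUDA_COMPILER"] = f"{cuda_path}/bin/nvcc"
            tc.variables["CUDAToolkit_ROOT"] = cuda_path
        tc.generate()
        deps = CMakeDeps(self)
        deps.generate()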

The Conan profiles system then handles the per-environment configuration without touching the source:

# ~/.conan2/profiles/cuda-12.4-linux
[settings]
os=Linux
arch=x86_64
compiler=gcc
compiler.version=12
compiler.libcxx=libstdc++11
build_type=Release
cuda_version=12.4

[buildenv]
CUDA_PATH=/usr/local/cuda
CUDA_HOME=/usr/local/cuda
PATH=+(path)/usr/local/cuda/bin

The [buildenv] block injects environment variables into the build environment Conan creates, so nvcc ends up on PATH without touching the shell profile. The resulting command is as clean as advertised: conan install . --profile=cuda-12.4-linux --build=missing.

The CMake Side: CUDAToolkit vs the Old Way

On the CMake side, the dividing line in the C++ AI/ML build ecosystem right now is between FindCUDA and FindCUDAToolkit. FindCUDA has been deprecated since CMake 3.10, and CMake 3.27 removed it behind policy CMP0146. Any CMakeLists.txt that still calls find_package(CUDA) instead of find_package(CUDAToolkit) is relying on backwards-compatibility shims or old installations.

FindCUDAToolkit, introduced in CMake 3.17, provides proper imported targets: CUDA::cudart, CUDA::cublas, CUDA::cufft, CUDA::cusolver, and the rest. You declare CUDA as a project language, which gives you nvcc as a recognized compiler, and you link against these targets instead of manually specifying library paths.

A minimal correct setup looks like this:

cmake_minimum_required(VERSION 3.18)
project(MLProject LANGUAGES CXX CUDA)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CUDA_STANDARD 17)

find_package(CUDAToolkit REQUIRED)

message(STATUS "CUDA version: ${CUDAToolkit_VERSION}")

# Conditional architecture selection based on detected toolkit
if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.0")
    set(CMAKE_CUDA_ARCHITECTURES "80;86;89;90")
elseif(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.1")
    # sm_86 needs CUDA 11.1, so 11.0 falls through to the older list
    set(CMAKE_CUDA_ARCHITECTURES "70;75;80;86")
else()
    set(CMAKE_CUDA_ARCHITECTURES "60;70;75")
endif()

add_executable(train src/train.cu src/model.cpp)
target_link_libraries(train PRIVATE CUDA::cudart CUDA::cublas)

CMAKE_CUDA_ARCHITECTURES deserves attention. CMake 3.24 added the native keyword, which detects the GPU on the current machine at configure time. That’s useful for development, but it breaks in CI environments where no GPU is present. CMake 3.23 added all-major, which compiles for one representative SM per major generation. For most ML projects that need broad deployment, all-major or an explicit list tied to your minimum supported CUDA version is the right call.
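
One way to keep that choice out of individual heads is to compute the value in a small wrapper script and pass it as -DCMAKE_CUDA_ARCHITECTURES. This is only a sketch: pick_cuda_architectures and its explicit fallback list are hypothetical, and the version thresholds are the ones stated above.

```python
def pick_cuda_architectures(cmake_version, gpu_present, explicit=("80", "86")):
    """Choose a value for -DCMAKE_CUDA_ARCHITECTURES.

    cmake_version is a (major, minor) tuple; explicit is a hypothetical
    fallback list for CMake releases that predate the keywords.
    """
    if gpu_present and cmake_version >= (3, 24):
        return "native"      # detect the local GPU; fine on a dev workstation
    if cmake_version >= (3, 23):
        return "all-major"   # no GPU needed, so safe for CI runners
    return ";".join(explicit)
```

For release builds that must be identical everywhere, passing gpu_present=False unconditionally keeps native out of the picture.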

Combining this with Conan: the CMakeToolchain generator produces conan_toolchain.cmake, which CMakePresets.json can reference directly:

{
  "version": 4,
  "configurePresets": [
    {
      "name": "conan-cuda-release",
      "binaryDir": "${sourceDir}/build/Release",
      "toolchainFile": "${sourceDir}/build/Release/generators/conan_toolchain.cmake",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "conan-cuda-release",
      "configurePreset": "conan-cuda-release"
    }
  ]
}

With this in place, the full flow is: conan install . --profile=cuda-12.4-linux --build=missing && cmake --preset conan-cuda-release && cmake --build --preset conan-cuda-release. Same commands on a developer machine and in CI.
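
If the three commands need to live in one place, a thin driver script is enough. The sketch below assumes conan and cmake are on PATH; build_commands and run_build are hypothetical helper names, and the profile and preset names are the ones used in this article.

```python
import subprocess


def build_commands(profile, preset):
    """Return the three commands from the text as argument lists."""
    return [
        ["conan", "install", ".", f"--profile={profile}", "--build=missing"],
        ["cmake", "--preset", preset],
        ["cmake", "--build", "--preset", preset],
    ]


def run_build(profile, preset):
    """Run each step in order; check=True stops at the first failure."""
    for command in build_commands(profile, preset):
        subprocess.run(command, check=True)
```

Calling run_build("cuda-12.4-linux", "conan-cuda-release") then reproduces the flow above, and keeping the commands as argument lists avoids shell-quoting differences between developer machines and CI.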

Where This Approach Has Limits

Conan doesn’t package the CUDA Toolkit itself. NVIDIA distributes CUDA through their own installer, and it doesn’t exist in ConanCenter. That means the “one command” story depends on CUDA being pre-installed in the environment, which for CI means Docker images from nvidia/cuda or a self-hosted runner with the toolkit present. The Conan profile abstracts the version selection and path configuration, but it doesn’t provision the toolkit.
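
Because the profile only names a toolkit version and does not install it, a pre-build sanity check can catch a mismatch early. This is a sketch, assuming the usual "release X.Y" line in the output of nvcc --version; parse_nvcc_release and toolkit_matches_profile are hypothetical names, and the sample string is abridged.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def toolkit_matches_profile(nvcc_output, profile_cuda_version):
    """True when the installed toolkit is the one the profile claims."""
    return parse_nvcc_release(nvcc_output) == profile_cuda_version


# Abridged sample of what nvcc prints; the build number is illustrative.
SAMPLE = "Cuda compilation tools, release 12.4, V12.4.131"
```

In practice the text would come from running nvcc --version inside the CI image, and the expected value from the cuda_version line of the active profile.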

The static runtime is also a persistent friction point. CUDA::cudart_static is the right choice for redistributable binaries because it removes the libcudart.so runtime dependency. But on Linux, static linking CUDA requires explicitly pulling in libdl, librt, and libpthread as transitive dependencies. Conan’s generated build info doesn’t always surface these automatically, and they need to be added manually in some project configurations.

Compare this to Spack, the HPC package manager, which treats CUDA architecture as a first-class variant: spack install mypackage +cuda cuda_arch=80,86,90. Spack can actually provision the CUDA Toolkit as a package, and it understands the full compatibility matrix at the package graph resolution level. For HPC and scientific computing, Spack is more complete. For product development targeting standard CI infrastructure, Conan is more practical.

vcpkg, Microsoft’s package manager, has less built-in CUDA awareness than Conan. CUDA version is still primarily an environment variable concern in vcpkg, handled through triplet files rather than a structured settings system. The gap in CUDA modeling between vcpkg and Conan is meaningful for projects with strict reproducibility requirements.

The Broader Problem

The JetBrains State of Developer Ecosystem survey for C++ found that around 39% of C++ developers still use no package manager at all. That number has been moving, but slowly. For teams doing AI and ML work in C++, the calculus is different. The CUDA compatibility matrix isn’t optional complexity. It’s there whether you encode it in a build system or manage it manually through tribal knowledge and documentation comments.

Conan 2.x with a properly extended settings.yml and CMake 3.18+ with FindCUDAToolkit and CMAKE_CUDA_ARCHITECTURES is a complete answer to the “same build everywhere” problem for CUDA projects. It requires upfront work in profile authoring and settings configuration that most tutorials skip. But the alternative, where individual developers maintain their own environment setups and CI jobs fail in environment-specific ways, compounds quietly until it becomes the dominant cost in the build system.

The pain point the ISO C++ survey keeps finding isn’t going away on its own. Encoding the compatibility matrix into the build system is how you stop paying it every sprint.
