CUDA Dependency Management Is a Three-Axis Problem, and Conan Finally Models All Three

The CUDA Dependency Problem Is Three Problems at Once

C++ dependency management has topped the ISO C++ annual survey as the number one developer frustration for five consecutive years. For general C++ projects, this is already a rough situation. Add CUDA to the mix and it becomes something qualitatively worse, because CUDA introduces not one but three interdependent version axes that all need to be satisfied simultaneously: the CUDA toolkit version, the GPU driver version, and the target GPU architecture (compute capability).

Most build systems pretend this is one problem. It is not.

Three Axes, One Package ID

To understand why Conan’s approach matters, you need to see what the compatibility matrix actually looks like.

CUDA toolkit 12.4 requires a minimum driver of 550.54.14 on Linux and 551.61 on Windows. It can generate code for GPU architectures from SM 5.0 (Maxwell) through SM 9.0 (Hopper, H100). A library compiled against CUDA 12.4 with only sm_86 device code may not work correctly on a machine running an older driver, and it will not run at all on a T4 (SM 7.5) or a Volta-era V100 (SM 7.0). SASS is tied to its major architecture, and embedded PTX can only be JIT-compiled for GPUs at least as new as the architecture it was generated for, so older cards need their own entry in the target list. These are not edge cases. They are the default situation in any team running a mix of development laptops, CI runners, and production GPU servers.
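The three-way check can be sketched in a few lines of Python. This is only an illustration: the table holds just the CUDA 12.4 driver minimums quoted above, the SM range is an assumption that should be verified against NVIDIA's release notes, and the names TOOLKITS and check_environment are hypothetical rather than part of any tool.

```python
# Hypothetical compatibility table; only the 12.4 row is filled in, using the
# driver minimums quoted above. The SM range is an assumption to verify against
# NVIDIA's release notes for the toolkit you actually ship.
TOOLKITS = {
    "12.4": {
        "min_driver": {"Linux": (550, 54, 14), "Windows": (551, 61)},
        "sm_range": ((5, 0), (9, 0)),
    },
}


def parse_version(text):
    """Turn '550.54.14' into (550, 54, 14) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_environment(toolkit, os_name, driver, device_sm):
    """Return a list of problems; an empty list means all three axes agree."""
    entry = TOOLKITS[toolkit]
    problems = []
    if parse_version(driver) < entry["min_driver"][os_name]:
        problems.append(f"driver {driver} is older than the minimum for CUDA {toolkit}")
    low, high = entry["sm_range"]
    if not (low <= parse_version(device_sm) <= high):
        problems.append(f"SM {device_sm} is outside the range CUDA {toolkit} can target")
    return problems


print(check_environment("12.4", "Linux", "550.54.14", "8.6"))   # no problems
print(check_environment("12.4", "Linux", "535.104.05", "8.6"))  # driver too old
```

Returning a list of problems instead of a boolean keeps the three axes separate in the error message, which is the point: a failure on one axis should not be reported as a generic "CUDA is broken".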

The traditional approach is to treat CUDA as a system dependency and document the requirement in a README. That fails the moment someone checks out the code on a machine with a different toolkit version. The CI system becomes the source of truth about what works, and it only tells you at run time, when an error such as cudaErrorInsufficientDriver or cudaErrorNoKernelImageForDevice surfaces.

How CMake Got This Right, Eventually

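The old FindCUDA CMake module was deprecated in CMake 3.10, and policy CMP0146 removes it for projects that target CMake 3.27 or later. It treated CUDA as a collection of libraries to find rather than a language to compile. The replacement is first-class CUDA language support (available since CMake 3.8) combined with find_package(CUDAToolkit) (added in 3.17) and CMAKE_CUDA_ARCHITECTURES (added in 3.18). CMake 3.18 is therefore the sensible minimum, and the new model is substantially different.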

Declaring CUDA as a language in the project() call means CMake handles compiler detection, flag propagation, and architecture targeting through the same mechanisms it uses for C and C++. The old cuda_add_executable() macro is gone; you use add_executable() and add_library() as normal. Libraries are imported targets with proper scoping: CUDA::cudart, CUDA::cublas, CUDA::cufft, rather than raw ${CUDA_LIBRARIES} variables.

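cmake_minimum_required(VERSION 3.18)
project(MLTrainer LANGUAGES CXX CUDA)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CUDA_STANDARD 17)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)

find_package(CUDAToolkit REQUIRED)

# Pick the target list from the toolkit that was actually found.
# sm_89 and sm_90 need CUDA 11.8 or newer; sm_86 needs CUDA 11.1 or newer.
if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.0")
    set(CMAKE_CUDA_ARCHITECTURES "80;86;89;90")
elseif(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.1")
    set(CMAKE_CUDA_ARCHITECTURES "70;75;80;86")
else()
    set(CMAKE_CUDA_ARCHITECTURES "60;70;75")
endif()

# Targets created after this point inherit CMAKE_CUDA_ARCHITECTURES.
add_executable(train src/train.cu src/model.cpp)
target_link_libraries(train PRIVATE CUDA::cudart CUDA::cublas)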

CMake 3.24 added CMAKE_CUDA_ARCHITECTURES=native, which detects the GPU on the current machine at configure time. CMake 3.23 added all-major to cover one representative SM per generation. These are useful for development but fragile for distribution. A library built with native on your RTX 4090 will not run on a CI runner with a T4.
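The reason the RTX 4090 build fails on a T4 comes down to two rules about device code, sketched here in Python. The function name can_run and the tuple representation are hypothetical; the sketch ignores architecture-specific targets such as sm_90a and assumes the usual rules: SASS runs only within its own major version on equal or newer minors, and PTX can be JIT-compiled for any equal or newer GPU.

```python
def can_run(device_sm, sass_targets, ptx_targets=()):
    """Check whether a binary carries usable device code for a GPU.

    device_sm and every target are (major, minor) compute capabilities.
    """
    # SASS is binary-compatible only within one major version, and only
    # upward across minors: sm_86 code runs on 8.6 and 8.9, not 8.0 or 7.5.
    for major, minor in sass_targets:
        if device_sm[0] == major and device_sm[1] >= minor:
            return True
    # PTX is JIT-compiled by the driver for any device at least as new as
    # the virtual architecture it was generated for.
    return any(device_sm >= ptx for ptx in ptx_targets)


T4, V100, RTX_4090 = (7, 5), (7, 0), (8, 9)

# Built with "native" on an RTX 4090: SASS for sm_89 only.
print(can_run(T4, sass_targets=[(8, 9)]))                        # False
# Same build, but with PTX for compute_75 embedded as a fallback.
print(can_run(T4, sass_targets=[(8, 9)], ptx_targets=[(7, 5)]))  # True
# PTX never helps an older card: compute_86 PTX cannot target a V100.
print(can_run(V100, sass_targets=[(8, 6)], ptx_targets=[(8, 6)]))  # False
```

For distribution, this is the argument for listing explicit architectures and embedding PTX for the oldest one you intend to support, rather than relying on whatever GPU the build machine happens to have.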

There is one critical ordering constraint: the Conan-generated toolchain file must be loaded before the first project() call, which in practice means passing it through CMAKE_TOOLCHAIN_FILE (on the command line or in a preset) rather than include()-ing it from CMakeLists.txt. CMake locks compiler configuration at project() time. Setting CMAKE_CUDA_COMPILER afterward has no effect, and this produces a confusing failure that looks like a CUDA installation problem rather than a CMake ordering problem.

What Conan Does That CMake Cannot

CMake can model which architectures to target and which toolkit to use. It cannot model which versions are binary-compatible with each other, or enforce that a package built for CUDA 12.0 is not silently mixed into a project expecting CUDA 12.4. That is Conan’s job.

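Conan 2.x does not ship with CUDA in its default settings.yml, which is the correct design. CUDA versions change faster than Conan releases, and project-level extension is the right place for toolchain-specific settings. In Conan 2 the extension goes in settings_user.yml, which Conan merges with the default settings.yml; keeping that file in the project repository and distributing it with conan config install keeps the whole team on the same definition:

# settings_user.yml (merged with the default settings.yml)
cuda_version:
  - null        # optional: packages that do not use CUDA leave it unset
  - "11.8"
  - "12.0"
  - "12.1"
  - "12.2"
  - "12.3"
  - "12.4"
  - "12.5"
  - "12.6"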

Once cuda_version is a Conan setting, and a recipe lists it in its settings attribute, it becomes part of that package's ID hash. A build against CUDA 12.0 and a build against CUDA 12.4 produce different package IDs, stored separately in the binary cache. Silent overwrite collisions, where a re-install with a different toolkit overwrites a compatible binary, become structurally impossible.
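The effect on package identity can be illustrated with a deliberately simplified hash. This is not Conan's actual algorithm, which also covers options and dependency information and uses its own serialization; sketch_package_id is a hypothetical name used only to show why adding one setting is enough to separate the binaries.

```python
import hashlib


def sketch_package_id(settings):
    """Hash a settings dict into an order-independent identifier.

    Illustration only: Conan's real package_id covers more inputs and
    serializes them differently.
    """
    canonical = "\n".join(f"{key}={settings[key]}" for key in sorted(settings))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


base = {"os": "Linux", "arch": "x86_64", "compiler": "gcc", "build_type": "Release"}
id_120 = sketch_package_id({**base, "cuda_version": "12.0"})
id_124 = sketch_package_id({**base, "cuda_version": "12.4"})

# The two builds differ in one setting, so they get different identifiers
# and can sit side by side in the cache.
print(id_120 != id_124)  # True
```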

The compatibility plugin takes this further by encoding forward compatibility explicitly in Python:

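# extensions/plugins/compatibility/compatibility.py
def compatibility(conanfile):
    cuda = conanfile.settings.get_safe("cuda_version")
    if cuda is None or str(cuda) == "None":
        return []
    major, minor = str(cuda).split(".")
    # The consumer asks for major.minor. Offer binaries built against OLDER
    # minors of the same major as fallbacks, nearest first.
    # Every value generated here must also be declared for cuda_version
    # in the settings file, or Conan will reject it.
    return [
        {"settings": [("cuda_version", f"{major}.{m}")]}
        for m in range(int(minor) - 1, -1, -1)
    ]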

This tells Conan that when a consumer requests CUDA 12.2 or 12.4, a binary built for CUDA 12.0 is an acceptable substitute, as long as both share the same major version. This mirrors NVIDIA’s minor-version compatibility guarantee within a major release, and it prevents unnecessary rebuilds when the toolkit is upgraded by a minor release.
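The fallback rule itself is easy to exercise outside Conan. The sketch below is a standalone helper, not part of the plugin API; fallback_cuda_versions and KNOWN are hypothetical names, and the version list is assumed to match whatever the settings file declares. Filtering against the declared values avoids proposing a version Conan does not know about.

```python
def fallback_cuda_versions(requested, known_versions):
    """List older minors of the same major, nearest first.

    requested:      the consumer's cuda_version, e.g. "12.4"
    known_versions: the values declared for cuda_version in the settings file
    """
    major, minor = (int(part) for part in requested.split("."))
    candidates = []
    for version in known_versions:
        if version in (None, "None"):
            continue  # the "unset" entry is never a fallback
        v_major, v_minor = (int(part) for part in version.split("."))
        if v_major == major and v_minor < minor:
            candidates.append((v_minor, version))
    # Highest remaining minor first, so the closest binary is tried first.
    return [version for _, version in sorted(candidates, reverse=True)]


KNOWN = [None, "11.8", "12.0", "12.1", "12.2", "12.4", "12.6"]
print(fallback_cuda_versions("12.4", KNOWN))  # ['12.2', '12.1', '12.0']
print(fallback_cuda_versions("12.0", KNOWN))  # []
```

Note that 11.8 never appears as a fallback for a 12.x request: the compatibility rule stops at the major version boundary.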

Validation in the recipe’s validate() method enforces toolchain constraints, here the host-compiler requirement, at install time rather than at build time:

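from conan.errors import ConanInvalidConfiguration

def validate(self):
    cuda = self.settings.get_safe("cuda_version")
    if not cuda or not str(cuda).startswith("12.") or self.settings.os != "Windows":
        return
    # NVCC on Windows only accepts MSVC (cl.exe) as its host compiler.
    if self.settings.compiler != "msvc":
        raise ConanInvalidConfiguration(
            "CUDA 12.x on Windows requires MSVC as the host compiler"
        )
    # Conan encodes MSVC toolset versions as 190, 191, 192, 193, ...
    if int(str(self.settings.compiler.version)) < 192:
        raise ConanInvalidConfiguration(
            "CUDA 12.x requires MSVC 2019 (192x) or newer"
        )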

On Windows, CUDA 12.x requires MSVC as the host compiler. Clang is not supported without significant patching. This is not a CMake limitation; it is an NVCC constraint. Catching it at conan install time rather than at compile time, or worse at link time, is worth the few lines of Python.

The Workflow

The talk at using std::cpp 2026 describes the target experience: one source checkout, one command, identical builds on every platform. With Conan profiles and CMakePresets.json wired together, this looks like:

conan install . --profile=profiles/cuda-12.4-linux --build=missing
cmake --preset conan-cuda-release
cmake --build --preset conan-cuda-release

The profile encodes the full target environment: OS, compiler, C++ standard, and CUDA version. The conan_toolchain.cmake that Conan generates wires this into CMake before project() runs. The preset references the toolchain. From this point, the developer does not touch toolkit paths manually.

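The configure preset in CMakePresets.json is minimal; cmake --build --preset additionally needs a buildPresets entry with the same name that refers to it: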

{
  "name": "conan-cuda-release",
  "toolchainFile": "build/Release/generators/conan_toolchain.cmake",
  "cacheVariables": {
    "CMAKE_BUILD_TYPE": "Release"
  }
}

CI machines use a different profile with a different CUDA version. The package ID hash handles the rest. The profiles directory becomes the single place where environment variation is declared, and that is a meaningful improvement over scattered environment variables and README instructions.

Where the Alternatives Fall Short

vcpkg uses triplets for platform configuration, but CUDA version is not a native triplet dimension. You can add overlay triplets with feature flags, but the binary collision risk remains because vcpkg’s binary caching does not hash CUDA version by default. CPM.cmake and FetchContent are source-level tools with no concept of binary compatibility at all. They recompile everything, which is fine for small projects and brutal for anything touching cuDNN or TensorRT, where compilation times run into tens of minutes.

Spack, the HPC package manager, has first-class CUDA variant support and can provision the CUDA toolkit itself. If your target is a supercomputer cluster, Spack is the right tool. For product development shipping to Linux workstations and Windows desktops, Conan’s binary distribution model fits better and requires less infrastructure knowledge to operate.

The Container Confusion

One failure mode deserves explicit mention. When running inside a Docker container with nvidia-container-toolkit, nvidia-smi reports the maximum toolkit version supported by the host driver, not what is installed inside the container. Teams frequently misread this and assume the container has a newer toolkit than it does. The CUDA toolkit version inside the container is set by the base image (nvidia/cuda:12.4.1-devel-ubuntu22.04, for instance), and that is the version Conan and CMake should be configured against.
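A small Python sketch makes the distinction concrete by reading the two numbers from their separate sources. The function names are hypothetical, the sample strings are illustrative lines in the shape the two tools print, and nvcc is assumed to be present, which is only the case in a -devel image.

```python
import re


def driver_max_cuda(nvidia_smi_output):
    """Extract the 'CUDA Version' field from the nvidia-smi banner.

    This is the newest CUDA release the host driver supports, not the
    toolkit installed in the container.
    """
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return match.group(1) if match else None


def installed_toolkit(nvcc_version_output):
    """Extract the release number that `nvcc --version` prints."""
    match = re.search(r"release (\d+\.\d+)", nvcc_version_output)
    return match.group(1) if match else None


# Illustrative sample lines, not captured from a real machine.
smi_banner = "| NVIDIA-SMI 560.35.03    Driver Version: 560.35.03    CUDA Version: 12.6  |"
nvcc_banner = "Cuda compilation tools, release 12.4, V12.4.131"

print(driver_max_cuda(smi_banner))     # 12.6
print(installed_toolkit(nvcc_banner))  # 12.4
```

In this example the Conan profile should say cuda_version=12.4, because that is what the compiler inside the container will actually use; 12.6 is only the ceiling the host driver allows.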

The compatibility matrix exists on two levels: the package manager level, which Conan handles, and the infrastructure level, which requires that your CI runner’s host driver is recent enough to support the toolkit version your containers target. Conan cannot enforce the second part, but validate() methods can at least document the minimum driver requirement in the recipe and fail loudly with a message that points to the right variable, rather than leaving the problem to surface at run time as cudaErrorInsufficientDriver.

Closing Thought

The ISO C++ survey result about dependency management is not surprising. Roughly 39% of C++ developers still use no package manager at all, according to the JetBrains Developer Ecosystem Survey. CUDA amplifies every friction point in that situation because it is not just a library; it is a version-constrained ecosystem with its own compiler, its own ABI guarantees, and a three-way compatibility relationship between toolkit, driver, and hardware.

Modeling that relationship in code, through Conan settings and compatibility plugins, is the part that matters. CMake 3.18+ provides the language-level primitives. Conan provides the identity and compatibility layer. Together they make the compatibility matrix a build system concern instead of a documentation problem, and that is a meaningful shift for teams building C++ AI infrastructure.