Modeling the CUDA Compatibility Matrix in Your Build System

Every year the ISO C++ developer survey comes back with the same finding: dependency management is the top pain point. Not language complexity, not the standard library, not toolchain fragmentation, but the unglamorous problem of getting your dependencies to build reliably across machines. For general C++ projects this is already a solved-ish problem if you adopt Conan or vcpkg. For projects that introduce CUDA, it gets harder in a specific and underappreciated way.

A talk at using std::cpp 2026 on cross-platform C++ AI development with Conan and CMake frames this correctly: CUDA does not just add a dependency, it adds a compatibility matrix, and that matrix needs to be modeled in your build tooling or it will bite you repeatedly.

The Three Axes of CUDA Compatibility

When you build a CUDA project, you are dealing with three orthogonal versioning concerns simultaneously.

First, the CUDA Toolkit version, such as 11.8, 12.3, or 12.6, which is the compiler and library set you use at build time. Second, the GPU driver version installed on the target machine at runtime, which must meet a minimum threshold for the toolkit you compiled against. CUDA 12.0 requires driver >= 525.60.13 on Linux. Thanks to minor-version compatibility that is also the floor for later 12.x releases, although features introduced in a later release need the driver that shipped with it (the 560 series for CUDA 12.6). Third, the GPU compute capability, the architecture generation of the physical GPU: sm_80 for data-center Ampere (A100), sm_86 for consumer Ampere (RTX 3090), sm_89 for Ada Lovelace (RTX 4090), sm_90 for Hopper (H100).

Machine code compiled for sm_80 runs on sm_80 and on later GPUs of the same major generation, such as sm_86 and sm_89. It does not run on sm_90 unless the binary also embeds PTX intermediate code that the driver can JIT-compile for the newer architecture. Code compiled for sm_90 never runs on an sm_80 GPU, because PTX only provides forward compatibility. The practical consequence: your developer workstation has an RTX 4090 (sm_89) and your CI runner has an A100 (sm_80). Your colleague uses CUDA 12.3; CI has 12.6. Your build works locally, then fails in CI, either at compile time or at the first kernel launch when the binary contains no code the CI GPU can execute.
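
The architecture rule can be made concrete in a few lines of Python. This is a minimal sketch, not NVIDIA's loader logic: can_run, sass_targets, and ptx_targets are hypothetical names, compute capabilities are encoded as integers such as 80 or 89, and the sketch assumes the usual rules that machine code for sm_XY runs only on devices with the same major version and an equal or higher minor version, while embedded PTX for compute_XY can be JIT-compiled for any device of capability XY or higher. Architecture-specific targets such as sm_90a are ignored.

```python
def can_run(device, sass_targets=(), ptx_targets=()):
    """Return True if a binary contains code usable on `device`.

    device, sass_targets and ptx_targets hold compute capabilities
    encoded as integers (80 for sm_80, 89 for sm_89, ...).
    """
    dev_major, dev_minor = divmod(device, 10)
    for target in sass_targets:
        major, minor = divmod(target, 10)
        # Machine code is reusable only within the same major generation,
        # on a device whose minor revision is equal or higher.
        if major == dev_major and minor <= dev_minor:
            return True
    # PTX can be JIT-compiled by the driver for any equal or newer device.
    return any(target <= device for target in ptx_targets)


# Built for the workstation's RTX 4090 only, then run on an A100.
print(can_run(80, sass_targets=[89]))                   # False
# Built for the full list used later in this article.
print(can_run(80, sass_targets=[80, 86, 89, 90]))       # True
# sm_80 machine code alone does not reach Hopper; embedded PTX does.
print(can_run(90, sass_targets=[80]))                   # False
print(can_run(90, sass_targets=[80], ptx_targets=[80])) # True
```

The first call is the workstation-versus-CI scenario described above: nothing in the binary matches the A100, so the failure only shows up at runtime.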

Naive approaches treat this as a documentation problem. Teams write a README that says “install CUDA 12.3 and target sm_80;86;89.” This breaks every time someone onboards or CI gets upgraded, and it scales to zero on heterogeneous teams.

CMake’s Modern CUDA Story

CMake’s CUDA support has improved substantially over the past several releases. The old FindCUDA module, which was the only option for years, has been deprecated since CMake 3.10, and policy CMP0146 removes it as of CMake 3.27. The modern path is first-class language support, enabled with enable_language(CUDA) or LANGUAGES CUDA in project() and available since CMake 3.8, combined with find_package(CUDAToolkit), added in CMake 3.17.

The critical variable for the architecture axis is CMAKE_CUDA_ARCHITECTURES, added in CMake 3.18. Before that, developers set CUDA_ARCH_LIST or equivalent and hoped. CMake 3.23 added the special values all and all-major, and CMake 3.24 added native, which detects the GPUs installed on the build machine and compiles only for those architectures.
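
To see what an architecture list means for the compiler, the following Python sketch expands a CMAKE_CUDA_ARCHITECTURES-style string into nvcc code-generation flags. It is an illustration under assumptions, not CMake's implementation: gencode_flags is a hypothetical helper, the exact flag spelling CMake emits may differ, and the special values native, all, and all-major are deliberately rejected. It follows the documented convention that a bare number requests both machine code and PTX, while the -real and -virtual suffixes request only one of them.

```python
def gencode_flags(architectures):
    """Expand e.g. "80;90-real" into nvcc --generate-code flags."""
    flags = []
    for entry in architectures.split(";"):
        number, _, kind = entry.partition("-")
        if not number.isdigit():
            raise ValueError(f"unsupported architecture entry: {entry!r}")
        if kind == "real":
            codes = f"sm_{number}"                         # machine code only
        elif kind == "virtual":
            codes = f"compute_{number}"                    # PTX only
        elif kind == "":
            codes = f"[compute_{number},sm_{number}]"      # both
        else:
            raise ValueError(f"unsupported suffix in entry: {entry!r}")
        flags.append(f"--generate-code=arch=compute_{number},code={codes}")
    return flags


for flag in gencode_flags("80;90-real"):
    print(flag)
# --generate-code=arch=compute_80,code=[compute_80,sm_80]
# --generate-code=arch=compute_90,code=sm_90
```

Embedding PTX for at least the newest listed architecture is what lets the binary run on GPUs released after it was built.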

A minimal modern CMakeLists.txt for a CUDA-enabled AI project looks like this:

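cmake_minimum_required(VERSION 3.18)

# Allow override from Conan toolchain or CI, fall back to broad coverage.
# This must run before project(): enabling the CUDA language initializes
# CMAKE_CUDA_ARCHITECTURES to the compiler default when it is unset, so a
# guard placed after project() would never apply the fallback.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES "80;86;89;90")
endif()

project(my_inference_engine LANGUAGES CXX CUDA)

find_package(CUDAToolkit REQUIRED)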

add_library(inference_kernels STATIC kernels.cu)
target_compile_features(inference_kernels PUBLIC cuda_std_17)
target_link_libraries(inference_kernels
  PUBLIC
    CUDA::cudart
    CUDA::cublas
    CUDA::cufft
)

The key discipline here is the if(NOT DEFINED) guard. It lets Conan’s generated toolchain inject the right value for a given profile without the CMakeLists.txt needing to know about it, while still providing a sensible default for developers who run CMake directly.

How Conan 2 Models the Matrix

Conan 2’s profile and settings system is where the architecture compatibility actually gets encoded. The base settings.yml file does not include a CUDA version by default, but you can extend it, for example by declaring cuda_version and its allowed values in a settings_user.yml file in the Conan home directory. Once cuda_version is a recognized setting, Conan will treat it as part of the package identity, meaning binaries built with CUDA 12.3 are distinct from those built with CUDA 12.6 and will not be mixed accidentally.

A Conan profile for a CUDA build on a Linux CI machine might look like:

[settings]
os=Linux
arch=x86_64
compiler=gcc
compiler.version=12
compiler.libcxx=libstdc++11
build_type=Release
cuda_version=12.3

[conf]
tools.cmake.cmaketoolchain:generator=Ninja

[buildenv]
CUDA_HOME=/usr/local/cuda-12.3
PATH=+(path)/usr/local/cuda-12.3/bin

The conanfile.py for the project then declares cuda_version in its settings, handles requirements that are CUDA-version-dependent, and injects the right architecture list into CMake:

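from conan import ConanFile
from conan.tools.cmake import CMake, CMakeToolchain, cmake_layout

class InferenceEngine(ConanFile):
    name = "inference_engine"
    version = "0.1"
    settings = "os", "compiler", "build_type", "arch", "cuda_version"
    # CMakeToolchain is instantiated in generate() below, so it must not
    # also be listed here; only CMakeDeps is declared as a generator.
    generators = "CMakeDeps"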

    def requirements(self):
        self.requires("cutlass/3.5.0")
        self.requires("nccl/2.20.5")

    def layout(self):
        cmake_layout(self)

    def generate(self):
        tc = CMakeToolchain(self)
        tc.variables["CMAKE_CUDA_ARCHITECTURES"] = self._cuda_archs()
        tc.generate()

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

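    def _cuda_archs(self):
        # Pick an architecture list from the toolkit's major version only.
        cuda = str(self.settings.get_safe("cuda_version", "12.0"))
        major = int(cuda.split(".")[0])
        if major >= 12:
            return "80;86;89;90"
        elif major == 11:
            # sm_86 needs CUDA 11.1 or later; sm_89 and sm_90 need 11.8,
            # so they are left out of the list shared by all 11.x releases.
            return "70;80;86"
        # CUDA 10.x cannot target sm_80, which arrived with CUDA 11.0.
        return "70;75"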

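The _cuda_archs helper is doing real work: it encodes the architectural timeline in code rather than in a comment. CUDA 11 gained sm_89 and sm_90 only in 11.8, so a list keyed on the major version alone has to leave them out for 11.x. At the other end, CUDA 12.0 removed Kepler (sm_35 and sm_37); Maxwell and Pascal targets still compile under 12.x, so omitting everything below sm_70 is a project choice rather than a toolkit limit. This logic living in the conanfile means it travels with the source and applies consistently on every machine.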
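
If keying on the major version alone turns out to be too coarse, the same idea can be driven by a small table. The sketch below is one possible refinement, not part of the conanfile above: supported_archs and FIRST_TOOLKIT are hypothetical names, and the table records only the first toolkit release able to target each architecture (it does not model architectures being removed from later toolkits).

```python
# First CUDA Toolkit release that can compile for each architecture.
FIRST_TOOLKIT = {
    70: (9, 0),    # Volta
    75: (10, 0),   # Turing
    80: (11, 0),   # Ampere (A100)
    86: (11, 1),   # Ampere (RTX 30 series)
    89: (11, 8),   # Ada Lovelace
    90: (11, 8),   # Hopper
}


def supported_archs(cuda_version, wanted=(70, 75, 80, 86, 89, 90)):
    """Filter `wanted` down to what the given toolkit can target.

    cuda_version is a string such as "11.4" or "12"; the result is a
    semicolon-separated list suitable for CMAKE_CUDA_ARCHITECTURES.
    """
    parts = cuda_version.split(".")
    version = (int(parts[0]), int(parts[1]) if len(parts) > 1 else 0)
    return ";".join(str(a) for a in wanted if FIRST_TOOLKIT[a] <= version)


print(supported_archs("10.2"))  # 70;75
print(supported_archs("11.4"))  # 70;75;80;86
print(supported_archs("11.8"))  # 70;75;80;86;89;90
```

Inside a recipe, the wanted tuple would be the project's own target list, and the return value could be assigned to the toolchain variable in generate().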

The Compatibility Method: Avoiding Binary Explosion

Without additional configuration, every CUDA minor version produces a distinct Conan binary. A team using CUDA 12.2, 12.3, and 12.6 would need three separate prebuilt binaries for every package. Conan 2’s compatibility() method lets you declare that binaries are interchangeable across minor versions:

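def compatibility(self):
    cuda = self.settings.get_safe("cuda_version")
    if not cuda:
        return []
    major = int(str(cuda).split(".")[0])
    # Offer every minor release of the same major version as a fallback.
    # The upper bound of 10 is an arbitrary ceiling on minor numbers.
    return [
        {"settings": [("cuda_version", f"{major}.{minor}")]}
        for minor in range(0, 10)
    ]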

This tells Conan: if you cannot find a binary for exactly CUDA 12.3, try any 12.x binary. NVIDIA maintains ABI stability within a major version for most of its libraries, so this usually holds in practice. A package built against 12.0 will work with the 12.6 toolkit. The reverse direction is the risky one: a package built against 12.6 works with a 12.3 toolkit only if it does not use APIs added after 12.3.
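
A more conservative variant only falls back to binaries built with an older minor release, which sidesteps the newer-API risk entirely. The sketch below shows the candidate generation as plain Python so it can be tried without Conan installed. fallback_cuda_versions and compatibility_entries are hypothetical helper names, and the returned structure mirrors the list of settings dictionaries used in the compatibility() method above.

```python
def fallback_cuda_versions(cuda_version):
    """List older minor releases of the same major, newest first."""
    parts = cuda_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return [f"{major}.{m}" for m in range(minor - 1, -1, -1)]


def compatibility_entries(cuda_version):
    """Build entries in the shape returned from compatibility()."""
    return [
        {"settings": [("cuda_version", version)]}
        for version in fallback_cuda_versions(cuda_version)
    ]


print(fallback_cuda_versions("12.3"))  # ['12.2', '12.1', '12.0']
print(fallback_cuda_versions("12.0"))  # []
print(compatibility_entries("12.1"))
# [{'settings': [('cuda_version', '12.0')]}]
```

Ordering the candidates newest first is meant to make the closest match the preferred one. The trade-off is more rebuilds: a consumer on 12.3 will never pick up a binary that was published from a 12.6 machine.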

One Command, Really

With profiles committed to the repository under profiles/, the complete workflow for a fresh checkout on any supported machine becomes:

conan install . --profile=profiles/linux-cuda12-release --build=missing
cmake --preset conan-release
cmake --build build/Release

The first command resolves and builds all dependencies for the exact CUDA version encoded in the profile, and generates the CMake toolchain file and presets. The second and third commands are pure CMake. No manual CUDA architecture flags, no per-machine environment scripts, no README-driven setup.

For CI, the profile is selected via an environment variable or a matrix entry in the CI configuration:

strategy:
  matrix:
    conan_profile: [linux-cuda12-release, linux-cuda11-release]

Where This Approach Has Limits

Conan does not manage the CUDA Toolkit installation itself. That still comes from the OS package manager or NVIDIA’s installer and must be present on the machine before Conan runs. Tools like Spack can manage the toolkit as a package, which is useful in HPC environments where multiple CUDA versions coexist, but Spack’s learning curve is steeper and its integration with CMake-based projects is less ergonomic.
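
Because the toolkit has to be present before Conan runs, a small pre-flight check can catch a mismatch between the installed toolkit and the profile early. The following is a sketch under assumptions: parse_nvcc_release, installed_toolkit_version, and toolkit_matches are hypothetical helpers, the check relies on the "release X.Y" text that nvcc --version prints, and the sample output string is illustrative rather than captured from a real machine.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract "X.Y" from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def installed_toolkit_version():
    """Return the toolkit version found on PATH, or None if nvcc is absent."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


def toolkit_matches(nvcc_output, profile_cuda_version):
    """Compare parsed nvcc output with the cuda_version from a profile."""
    return parse_nvcc_release(nvcc_output) == profile_cuda_version


# Illustrative sample of the relevant nvcc output line.
SAMPLE = "Cuda compilation tools, release 12.3, V12.3.107"
print(parse_nvcc_release(SAMPLE))       # 12.3
print(toolkit_matches(SAMPLE, "12.3"))  # True
print(toolkit_matches(SAMPLE, "12.6"))  # False
```

Run as the first CI step, a check like this turns a confusing compile failure deep in a dependency build into a one-line message about the wrong toolkit.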

The compatibility matrix also does not protect you from driver version mismatches at runtime. A binary compiled with CUDA 12.6 will fail to initialize on a machine whose driver only supports CUDA 11.x, and even on an older 12.x driver it can fail if it relies on features or PTX JIT compilation that need a newer driver, even though everything built successfully. This is fundamentally a deployment problem that build tooling alone cannot fix. NVIDIA’s CUDA Forward Compatibility Package addresses it on supported data-center GPUs, and it is worth encoding the minimum required driver version somewhere visible in the project.
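
One way to make the minimum driver version visible is to keep it in code and check it at startup or in a deployment script. This is a minimal sketch with hypothetical names (MIN_DRIVER_LINUX, parse_driver, driver_ok); the only threshold it contains is the CUDA 12 figure quoted earlier in this article, and how the installed version is obtained is left to the caller (for example from nvidia-smi --query-gpu=driver_version --format=csv,noheader).

```python
# Minimum Linux driver per CUDA major version. Only the CUDA 12 value
# stated earlier in this article is filled in; extend as needed.
MIN_DRIVER_LINUX = {12: "525.60.13"}


def parse_driver(version):
    """Turn "525.60.13" into (525, 60, 13) for numeric comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_ok(installed, cuda_major):
    """Return True if `installed` meets the minimum for `cuda_major`."""
    minimum = MIN_DRIVER_LINUX[cuda_major]
    return parse_driver(installed) >= parse_driver(minimum)


print(driver_ok("560.28.03", 12))  # True
print(driver_ok("525.60.13", 12))  # True
print(driver_ok("470.82.01", 12))  # False
```

Comparing integer tuples instead of strings matters here: a plain string comparison would order "525.9.1" after "525.60.13".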

Vcpkg, the other major C++ package manager, has improved its CUDA support but does not yet have a first-class concept equivalent to Conan’s compatibility() method for binary reuse across CUDA versions. For projects already deep in the Microsoft ecosystem it may be adequate, but for teams doing serious GPU work Conan’s model fits more naturally.

The core insight from the using std::cpp talk is that CUDA compatibility is not a configuration problem you document, it is a structured constraint you model. Conan’s settings system gives you the vocabulary to express that structure, and CMake’s modern CUDA language support gives you the mechanism to act on it. Together they make the “one checkout, one command” promise achievable rather than aspirational.