· 5 min read ·

After You Implement vector<T>, Go Read vector<bool>

Source: isocpp

Implementing a custom vector is one of those exercises that rewards exactly as much curiosity as you bring to it. Quasar Chunawala’s walkthrough on isocpp.org frames this well: you will not beat std::vector, but working through the design builds a mental model that makes the real implementation legible. Spend a day on it and you will understand why ::operator new instead of new T[n], why the growth factor is multiplicative, and why a move constructor without noexcept silently forces copies on every reallocation.

Then go look at std::vector<bool>.

The Partial Specialization

std::vector<bool> has been a special case since C++98. It is declared as a partial specialization of the primary template:

template <class Allocator>
class vector<bool, Allocator>;

The motivation was space efficiency. A vector<bool> backed by plain bool storage uses at least one byte per element on every mainstream platform. The specialization packs eight booleans into a single byte, giving a roughly eight-to-one compression ratio for boolean arrays. At the time, this seemed like a reasonable optimization, particularly for bitfield-heavy embedded and systems code where memory was constrained.

The problem is that achieving this packing requires storing individual bits, and individual bits cannot have addresses. The standard T& operator[] contract that works for every other vector<T> is impossible to fulfill for a packed bit vector. bool& implies a referenceable object at a specific memory address. A one-bit value inside a packed byte has no independent addressable existence.

The Proxy Reference

To resolve this, the specialization introduces vector<bool>::reference, a proxy type that simulates a reference to a single bit:

v[0] = true;   // calls reference::operator=(bool)
bool b = v[0]; // calls reference::operator bool()

For cases where you read and write elements through explicit assignments, this works transparently. The breakage appears in generic code.

template <typename Container>
void negate_first(Container& c) {
    auto& first = c[0];  // bool& for vector<int>, proxy for vector<bool>
    first = !first;
}

With vector<int>, first is int&. With vector<bool>, the non-const operator[] returns a proxy. auto& first becomes vector<bool>::reference, which compiles and runs here. But the illusion collapses in more explicit contexts:

bool& ref = v[0];  // does not compile for vector<bool>
bool* ptr = &v[0]; // taking the address of a proxy -- does not compile

std::addressof(v[0]) fails for the same reason. There is no address to take. The standard’s expectation that &v[0] returns a T* pointing into a contiguous array is not merely weakened; it is entirely absent. std::vector<bool> has no data() member function. There is no contiguous sequence of bool values to point into.

This matters whenever you need to pass a boolean vector’s contents to a C API, to a function expecting bool*, or to any algorithm that operates on a pointer range. All of those work for every other vector<T>. None of them work for vector<bool>.

Where Generic Code Silently Breaks

The proxy issue compounds in template contexts. Code that stores or operates on pointers to elements fails:

template <typename T>
std::vector<T*> get_pointers(std::vector<T>& v) {
    std::vector<T*> ptrs;
    for (auto& x : v)
        ptrs.push_back(&x);  // fails for vector<bool>
    return ptrs;
}

The compiler error is not immediately informative. You are told that some address-of expression on a proxy type does not produce bool*, and finding where that originates in deeply templated call stacks is unpleasant.

The standard itself acknowledges the problem. The cppreference documentation for std::vector<bool> notes: “std::vector<bool> does not model the requirements of Container, SequenceContainer, or AllocatorAwareContainer.” That is a significant admission for a type in the standard library. The class does not fully satisfy the contracts that every other std::vector specialization satisfies. Herb Sutter catalogued the consequences in detail in his “Effective STL” era writing, and the community’s assessment has been consistent for more than two decades: std::vector<bool> was a mistake.

What the Implementations Do With It

libstdc++, libc++, and MSVC’s STL all implement the specialization because the standard mandates it. In libstdc++, the storage is _M_start._M_p, a pointer to _Bit_type (typically unsigned long). Iterators carry both a pointer to the word and a bit mask; incrementing an iterator updates the mask and advances the word pointer when the mask wraps. libc++ is structurally similar, using __storage_type and a bit offset stored alongside the pointer.

The implementations are correct and well-tested as bit containers. The issue is not quality of implementation; it is that the interface they expose diverges from what generic C++ code expects from a vector.

A consequence that is easy to overlook: AddressSanitizer integrations that libc++ and libstdc++ maintain for the general template, marking the capacity gap between end_ and end_of_capacity_ as inaccessible, are not applicable to vector<bool> in the same way because the storage model is entirely different. Tooling that understands vectors may not understand the specialization.

What to Use Instead

For most boolean array use cases, the alternatives are more predictable.

std::vector<char> or std::vector<uint8_t> stores one byte per boolean, wastes seven bits per value, but provides all normal vector guarantees: data(), T& references, pointer arithmetic, and interoperability with C APIs. For anything that needs to pass a boolean array by pointer, this is the direct replacement.

std::bitset<N> provides compact bit storage for a fixed-size array known at compile time. It supports bitwise operations, has clean semantics, and does not suffer from the proxy-reference problems in generic contexts. The constraint is that the size is a compile-time template parameter.

boost::dynamic_bitset<> provides the runtime-sized equivalent. It is explicit about being a bit container rather than a sequence container, which is the right framing for what vector<bool> was trying to be. It supports all the bitwise operations and makes no pretense of satisfying the same interface as vector<int>.

If you are working with C++20 ranges, std::ranges::views::transform over a vector<uint8_t> with a conversion to bool is composable and does not introduce proxy types into your algorithm chains.

The Lesson Beyond the Exercise

Implementing vector<T> teaches you that a correct generic container is built on clear contracts: addressable elements, references that bind to actual objects, iterators that dereference to T&. std::vector<bool> was an attempt to add a space optimization as a transparent drop-in replacement. It succeeds in the most common cases, read an element, write an element, iterate in a range-based for loop, while breaking the contracts in edge cases that are not always immediately visible.

The proposal to deprecate vector<bool> has circulated in WG21 and has not landed. The specialization is too widely deployed to remove without disruption, and all replacement options require explicit code changes at every call site. So it persists, documented with warnings on cppreference, flagged in various linters, and occasionally surprising people who write generic code that assumes vector<T> behaves consistently across all T.

That tension is worth understanding separately from the implementation exercise. When you build vector<T> and get the contracts right, you have produced something with cleaner semantics than one of the standard library’s most heavily used specializations. The gap between a correct generic container and a container that works in practice for most users is where a lot of C++ library history lives.

Was this interesting?