· 7 min read ·

The Hidden Machinery of std::vector

Source: isocpp

There is a certain category of programming exercise that looks trivial from the outside and turns out to be a rabbit hole. Implementing your own std::vector is one of them. Quasar Chunawala’s walkthrough on isocpp.org frames this correctly: the standard library’s version is a work of art, and a homegrown replacement will not beat it. But the exercise is not about beating it. It is about understanding what the standard library is actually doing, and that understanding pays dividends every time you write code that uses vectors.

I want to go a level deeper than the typical “just maintain a pointer, size, and capacity” explanation and look at three things that trip people up: the growth factor decision and why it matters more than it seems, the exception safety guarantee that push_back is required to provide, and the interaction between noexcept move constructors and reallocation performance.

Three Pointers

A conforming std::vector<T> implementation typically maintains three pointers:

T* _begin;    // start of the allocated buffer
T* _end;      // one past the last constructed element
T* _end_cap;  // one past the last allocated byte

size() is _end - _begin. capacity() is _end_cap - _begin. The gap between _end and _end_cap is raw, uninitialized memory where future elements will live. The critical distinction from a plain C array is that this memory has been allocated but no objects have been constructed in it yet. Constructing an element there requires placement new:

new (_begin + _size) T(value);  // construct in pre-allocated memory

And destroying an element before deallocation requires an explicit destructor call:

(_begin + i)->~T();

Forgetting this and just calling delete[] on the buffer is undefined behavior because delete[] expects objects that were constructed with new[]. A correct implementation allocates raw memory via ::operator new or through an allocator, constructs objects explicitly, destroys them explicitly, and then deallocates the raw memory. The portable way to do this is through std::allocator_traits:

std::allocator_traits<Allocator>::construct(alloc, ptr, std::forward<Args>(args)...);
std::allocator_traits<Allocator>::destroy(alloc, ptr);

This matters for allocator-aware containers, which std::vector is required to be.

The Growth Factor Debate

When a vector runs out of capacity, it needs to allocate a larger buffer, move all existing elements, and free the old one. The question is: how much larger? The C++ standard only requires that push_back be amortized O(1); it says nothing about the growth factor. Different implementations make different choices:

  • GCC’s libstdc++ and Clang’s libc++ both use 2x
  • MSVC’s STL uses 1.5x
  • Facebook’s folly::fbvector uses 1.5x

The argument for 1.5x is subtle. With a 2x growth factor, each new allocation is always larger than the sum of all previously freed blocks. This means previously freed memory can never satisfy a future allocation, putting pressure on the allocator to find fresh pages. With a factor below the golden ratio (~1.618), freed blocks eventually accumulate enough total size to satisfy future requests, giving the allocator more opportunities to reuse memory. Facebook’s engineering team cited this as part of the motivation for fbvector’s growth strategy, along with friendlier behavior with realloc for trivially copyable types.

For a homegrown implementation, 2x is the simpler default and perfectly correct. But this is the kind of decision that experienced library authors revisit under production workloads.

Exception Safety: The Strong Guarantee

This is where most naive implementations fall apart. The C++ standard mandates that push_back provides the strong exception guarantee: if an exception is thrown, the vector is left unchanged. No elements are lost, no memory is leaked, and the vector remains in a fully valid state.

Here is what a correct implementation looks like:

void push_back(const T& value) {
    if (_size == _capacity) {
        size_t new_cap = _capacity == 0 ? 1 : _capacity * 2;
        T* new_buf = static_cast<T*>(::operator new(new_cap * sizeof(T)));
        // ^ this can throw std::bad_alloc

        size_t i = 0;
        try {
            for (; i < _size; ++i)
                new (new_buf + i) T(std::move_if_noexcept(_begin[i]));
            new (new_buf + _size) T(value);
        } catch (...) {
            // Clean up partially constructed elements in the new buffer
            for (size_t j = 0; j < i; ++j)
                (new_buf + j)->~T();
            ::operator delete(new_buf);
            throw;  // re-throw: old buffer is untouched
        }

        // Commit: destroy old elements, free old buffer, update pointers
        for (size_t i = 0; i < _size; ++i)
            (_begin + i)->~T();
        ::operator delete(_begin);
        _begin = new_buf;
        _capacity = new_cap;
    }
    new (_begin + _size) T(value);
    ++_size;
}

The pattern is: allocate and construct in the new buffer entirely before touching the old one. If anything throws, clean up the partially constructed new buffer, free it, and re-throw. The old buffer remains intact. Only after the new buffer is fully constructed do you tear down the old one and update the member pointers.

A simpler implementation might allocate, copy over, free the old buffer, then try to append the new element. If that final construction throws, you have destroyed your old data and cannot recover. That violates the strong guarantee.

There is another subtle case: v.push_back(v[0]). If reallocation occurs, v[0] is in the old buffer, which gets freed before the new element is constructed. The reference passed to push_back becomes dangling. A fully correct implementation must handle this aliasing case.

noexcept Move Constructors and Why They Matter Here

Notice the std::move_if_noexcept in the reallocation loop above. This is not cosmetic. std::move_if_noexcept returns an rvalue reference if the move constructor is noexcept, and a const lvalue reference otherwise. In other words: if moving might throw, the vector falls back to copying.

Why? Because the strong guarantee requires that if something goes wrong mid-reallocation, the original elements are still intact. If you move element 3 into the new buffer and then element 4’s constructor throws, elements 0 through 2 are already gone from the old buffer. You cannot recover. With copying, the originals remain untouched throughout, so a mid-loop exception leaves the old buffer consistent.

The consequence is that a type with a potentially throwing move constructor forces your vector to do O(n) copies on every reallocation instead of O(n) moves. For types that hold heap memory, this is a significant performance difference. The fix is always mark your move constructors noexcept when they truly cannot throw:

class MyResource {
public:
    MyResource(MyResource&&) noexcept = default;
    MyResource& operator=(MyResource&&) noexcept = default;
};

This single annotation is one of the highest-leverage things you can do for types that will be stored in vectors.

What the Standard Library Gets That You Probably Will Not

Beyond the above, a production std::vector handles several things that a learning exercise typically skips:

Allocator propagation. The standard defines three traits, propagate_on_container_copy_assignment, propagate_on_container_move_assignment, and propagate_on_container_swap, that determine whether the allocator should be copied along with the data. Getting this wrong means containers with custom allocators behave incorrectly in ways that are extremely hard to debug.

Trivially copyable fast paths. For types where std::is_trivially_copyable_v<T> is true, a vector can use memcpy and memmove instead of element-wise placement new and destructor calls. Standard library implementations exploit this. A naive implementation calls constructors and destructors even for plain integers.

Trivially relocatable types. This is an ongoing gap in the standard. Moving a vector element into a new buffer currently requires calling the move constructor and then the destructor of the source, even for types where a raw memcpy would be equivalent. Facebook’s folly, EASTL, and others have added non-standard type traits to enable memcpy-based relocation. P1144 proposes standardizing this concept, though it has not landed in C++26. Until it does, custom vectors can potentially beat std::vector for certain type categories by opting into this optimization manually.

How Rust Handles the Same Problem

For comparison, Rust’s Vec<T> uses a 2x growth factor, implemented in the internal RawVec. One interesting difference: a zero-capacity Vec does not allocate at all. Instead of a null pointer, it stores NonNull::dangling(), a valid non-null pointer that is not safe to dereference. This avoids a null check on the common empty-vector fast path.

Rust’s ownership model also sidesteps the move_if_noexcept problem entirely. In Rust, a move is always a bitwise copy of the value, and the source is considered uninitialized afterward. There is no concept of a throwing move. This simplifies the reallocation logic considerably, though it does mean Rust’s equivalent of SBO (small buffer optimization) requires more care since you cannot simply bitwise-move a type that contains a self-referential pointer.

Why This Exercise Is Worth Doing

The standard library version of std::vector is almost certainly better than what you will write. The point of implementing your own is that every decision you face, how much to grow, what to do when a constructor throws halfway through reallocation, whether to copy or move, forces you to confront a real constraint that the standard library already solved.

After going through this exercise once, the noexcept annotations in your own types start to feel less like boilerplate and more like a genuine contract. The distinction between object lifetime and buffer lifetime stops being abstract. And the next time you read a cppreference page on iterator invalidation, the rules make sense because you understand the mechanism behind them.

Was this interesting?