Rolling Your Own vector: The Design Decisions That Actually Matter

Source: isocpp

Implementing std::vector from scratch is a rite of passage in C++ education, and Quasar Chunawala’s walkthrough on isocpp.org captures why the exercise is worth doing. But a vector that compiles and passes basic tests sits far from what ships in production standard libraries. The interesting territory lies between those two points: the growth factor arithmetic, the exception safety machinery during reallocation, and the constraints that make obviously reasonable choices wrong.

The Three-Pointer Layout

cppreference describes std::vector<T> as a dynamically sized array, which is accurate but leaves out the specific internal structure that makes the standard library implementation tick. Both libc++ and libstdc++ represent a vector with three pointers:

  • _begin: pointer to the first element
  • _end: pointer to one-past-the-last live element
  • _end_cap: pointer to one-past-the-last allocated slot

Size is _end - _begin. Capacity is _end_cap - _begin. Storing end pointers instead of a pointer plus two integers means that the common path, checking whether capacity remains before push_back, is a single pointer comparison rather than loading two separate fields.

The allocator is stored alongside these pointers using the Empty Base Optimization. Since std::allocator<T> has no data members, the compiler can give it zero size when it appears as a base class. libstdc++ achieves this by having the internal _Vector_impl struct inherit from the allocator type; libc++ uses a __compressed_pair. The result is that a std::vector<int> using the default allocator is exactly 24 bytes on a 64-bit system: three pointers and nothing else.

This layout is frozen by the Itanium C++ ABI, which governs the binary interface on Linux and macOS. Changing it would break every compiled binary that links against the standard library. This is why proposals to add a small buffer optimization to std::vector face a structural impossibility: even if the committee wanted it, the major implementations could not adopt it without an ABI break.

The Growth Factor Decision

The standard requires push_back to be O(1) amortized, which forces exponential growth on reallocation. The specific multiplier is left to implementations, and the choices they have made diverge for principled reasons.

GCC’s libstdc++ and Clang’s libc++ both use 2x. MSVC’s STL uses 1.5x. Facebook’s folly::fbvector uses 1.5x and documents the reasoning explicitly.

The argument against 2x comes from geometric series arithmetic. Suppose a vector reallocates repeatedly, producing buffers of capacity 1, 2, 4, 8, …, n. Each old buffer is freed once its contents have been copied into the next. Even granting the allocator every earlier buffer at once, the freed total is 1 + 2 + 4 + … + n/2, which equals n - 1. The next allocation requires n bytes. Because n > n - 1, the allocator can never satisfy the next request by recycling previously freed memory. Every new allocation must come from fresh pages.

With a factor of 1.5x, the cumulative freed memory eventually exceeds the next request. After enough reallocations, the allocator can reuse earlier blocks. This is particularly effective with allocators like jemalloc that maintain size classes and recycle same-size blocks efficiently. For fbvector combined with jemalloc, 1.5x growth sometimes allows an in-place extension via realloc, moving no elements at all.

The theoretical optimum is the golden ratio phi ≈ 1.618, which satisfies phi^2 = phi + 1. After exactly two generations of freed blocks, the accumulated memory covers the next allocation precisely. Any growth factor at or below phi permits eventual reuse from freed blocks; any factor above phi does not. Both 1.5x and phi are viable; 2x fails this criterion.

For a custom implementation, 1.5x is the more defensible starting point. The 2x factor is common in tutorials because its amortization proof is standard textbook material, but it is friendlier to explanations than to allocators.

Exception Safety During Reallocation

When a vector outgrows its buffer, it must transfer all existing elements to new storage. The question of whether to move or copy during this transfer is where most tutorial implementations diverge from the standard library.

Moving is cheaper but unsafe if the move constructor can throw. Suppose ten elements are being moved to a new buffer and the fifth move constructor throws. Five old elements have been moved from and are in a valid but unspecified state. Five new elements are live in the new buffer. Neither buffer holds the original vector. The strong exception guarantee, which promises that a failed push_back leaves the vector unchanged, cannot be honored.

Copying avoids this: if a copy throws during reallocation, the old buffer is intact and the vector can deallocate the partially filled new buffer and re-throw. But copying is more expensive than moving.

The standard library’s answer is std::move_if_noexcept. For each element transferred during reallocation, it moves if the move constructor is noexcept, and copies otherwise. The type trait std::is_nothrow_move_constructible<T> is the decision point.

void reallocate(size_type new_cap) {
    T* new_buf = std::allocator_traits<Alloc>::allocate(alloc_, new_cap);
    size_type i = 0;
    try {
        for (; i < size_; ++i) {
            std::allocator_traits<Alloc>::construct(
                alloc_,
                new_buf + i,
                std::move_if_noexcept(begin_[i])
            );
        }
    } catch (...) {
        for (size_type j = 0; j < i; ++j)
            std::allocator_traits<Alloc>::destroy(alloc_, new_buf + j);
        std::allocator_traits<Alloc>::deallocate(alloc_, new_buf, new_cap);
        throw; // old buffer is untouched
    }
    // commit: destroy old elements, free old buffer
    for (size_type j = 0; j < size_; ++j)
        std::allocator_traits<Alloc>::destroy(alloc_, begin_ + j);
    std::allocator_traits<Alloc>::deallocate(alloc_, begin_, capacity_);
    begin_ = new_buf;
    capacity_ = new_cap;
}

libc++ implements this via __uninitialized_allocator_move_if_noexcept; libstdc++ uses __uninitialized_move_if_noexcept_a. Both work the same way.

The practical implication for users of std::vector is direct: if a type’s move constructor is not marked noexcept, every reallocation copies instead of moves. For expensive-to-copy types, this can be a large hidden cost. Marking move constructors noexcept when they genuinely cannot throw is worth treating as a default practice for types intended to live in containers.

Placement New and the Allocation/Construction Split

A homegrown vector must separate two operations that new T conflates: allocating raw memory and constructing objects into it. std::vector preallocates capacity without constructing elements; individual elements are constructed on demand as they are inserted.

The tool for constructing into pre-allocated memory is placement new:

T* raw = std::allocator_traits<Alloc>::allocate(alloc_, cap);
// later, to construct the i-th element:
::new (static_cast<void*>(raw + i)) T(args...);

Destruction reverses this: call the destructor explicitly on each live element, then deallocate the buffer when it is no longer needed:

(raw + i)->~T();
// after all live elements are destroyed:
std::allocator_traits<Alloc>::deallocate(alloc_, raw, cap);

Correct implementations use std::allocator_traits::construct and std::allocator_traits::destroy rather than calling placement new and destructors directly. This allows custom allocators to intercept construction and destruction, which matters for pool allocators and std::pmr::vector. allocator_traits also provides sensible defaults for methods an allocator does not implement, making the container compatible with the full range of C++11-and-later allocators.

The Trivially Copyable Shortcut

Both libc++ and libstdc++ apply an important optimization for types where std::is_trivially_copyable<T> holds: bulk transfer via std::memcpy rather than element-by-element construction. Integers, floats, and plain structs qualify. For large vectors of trivially copyable types, reallocation becomes a single memory copy rather than a loop of constructors.

if constexpr (std::is_trivially_copyable_v<T>) {
    std::memcpy(new_buf, begin_, size_ * sizeof(T));
} else {
    for (size_type i = 0; i < size_; ++i)
        std::allocator_traits<Alloc>::construct(
            alloc_, new_buf + i, std::move_if_noexcept(begin_[i]));
}

A pending proposal, P1144 (trivially relocatable), would extend this optimization to types like std::unique_ptr and std::string when they appear as vector elements. These types can be safely relocated by copying their bytes provided the destructor of the source is not called afterward. The proposal has not been standardized yet, but libc++ has begun annotating its types in anticipation.

Why There Is No Small Buffer Optimization

std::string uses Small String Optimization to store short strings without heap allocation. LLVM’s SmallVector<T, N> stores N elements in an inline buffer before spilling to the heap. std::vector does neither, and the reason is the move guarantee.

The standard requires that moving a std::vector is O(1). For a heap-allocated buffer, this is a pointer swap. For an inline buffer, there is no pointer to swap; each element must be moved individually, making the operation O(n) for small vectors. The O(1) move guarantee and SBO are incompatible.

Beyond the asymptotic cost, SBO would cause pointers to vector elements to change addresses when the vector is moved. The standard’s iterator invalidation rules permit this for the moved-from vector, but code that captures element addresses before a move would silently break.

The C++26 standard is adding std::inplace_vector<T, N>, which makes the fixed-capacity, stack-resident design a distinct type. The different move semantics and invalidation behavior are explicit in the type signature rather than hidden inside something that looks like std::vector.

Iterator Invalidation, Stated Precisely

The rules around iterator invalidation in std::vector are precise and worth knowing exactly, because violations are common and silent.

Any operation that causes reallocation, whether push_back, insert, emplace_back, or reserve with a larger capacity, invalidates every iterator, pointer, and reference into the vector. No exceptions.

Without reallocation, insert or emplace in the middle invalidates all iterators at or after the insertion point, since elements must shift to make room. erase invalidates all iterators at or after the erased position. push_back without reallocation invalidates only end().

swap is unusual: iterators remain valid but now refer to elements in the other vector. shrink_to_fit is allowed to reallocate and should be assumed to invalidate everything.

For a custom implementation, maintaining these guarantees requires care around any operation that moves element storage. The easiest violation is caching a pointer to a vector element and then calling push_back in a loop without reserve.

What the Exercise Is For

Reading libc++’s vector source after building your own is a useful calibration. The gap between a working implementation and the standard library does not come primarily from complexity for its own sake. Most of it traces back to specific constraints: the allocator model, the exception safety guarantees written into the standard, the ABI stability requirements that prevent certain designs.

Building a custom vector forces direct engagement with these constraints rather than benefiting from them implicitly. That is where the exercise’s value lies.