Three Pointers: What Implementing vector<T> Teaches You About Language Design

Implementing std::vector<T> from scratch is a useful exercise, and Quasar Chunawala’s walkthrough on isocpp.org covers the basic mechanics well. But the real payoff isn’t the finished implementation. It’s what each design decision forces you to confront about C++‘s model of memory, and how starkly those decisions differ from what Rust, Java, or Python face when building the equivalent container.

The Three-Pointer Invariant

Every major C++ standard library stores a vector as three raw pointers:

T* _begin;      // start of allocated buffer
T* _end;        // one past the last constructed element
T* _end_cap;    // one past the last allocated slot

size() is _end - _begin. capacity() is _end_cap - _begin. The region [_end, _end_cap) is raw allocated memory with no live objects in it; reading those bytes is undefined behavior. On a 64-bit platform with a stateless allocator, sizeof(std::vector<int>) is exactly 24 bytes.

Rust’s Vec<T> uses the same structure internally, by way of RawVec<T>, which stores a NonNull<T> pointer, a capacity, and a length separately. The layout converges on the same design across both languages because the requirements lead there: you need to distinguish where memory starts, where live objects end, and where allocated space ends. There is no meaningful alternative if elements must be contiguous and lifetimes must be manually managed.

This layout is also now frozen in C++ by the Itanium C++ ABI, used on Linux and macOS. It cannot change without breaking ABI compatibility across every compiled binary on those platforms. That constraint has real consequences for what the standard library can ever offer; small buffer optimization for vector, for instance, is not merely unimplemented but structurally impossible under the current ABI.

Separation of Allocation and Construction

The first thing a naive implementation gets wrong is T* buf = new T[n]. That default-constructs all n elements immediately. A vector needs raw memory it can populate element-by-element as elements are added, not a fully initialized array of default-constructed objects.

The correct C++ idiom separates allocation from construction using placement new:

// Allocate raw memory, no objects yet
T* buf = static_cast<T*>(::operator new(n * sizeof(T)));

// Construct at a specific address
::new(static_cast<void*>(buf + i)) T(value);

// Destroy without deallocating
buf[i].~T();

// Deallocate without destroying
::operator delete(buf);

Standard-conforming code routes this through std::allocator_traits, which provides the indirection layer needed to support custom allocators:

using Traits = std::allocator_traits<Alloc>;
T* data = Traits::allocate(alloc, n);           // raw memory, no objects
Traits::construct(alloc, data + i, value);      // placement new via allocator
Traits::destroy(alloc, data + i);               // explicit destructor call
Traits::deallocate(alloc, data, n);             // free raw memory

Rust’s equivalent uses std::alloc::alloc for raw memory and std::ptr::write or MaybeUninit<T> for constructing objects in place. The pattern is structurally identical. Both languages require explicit acknowledgment of the boundary between “memory exists” and “an object exists at this memory.”

Java’s ArrayList<E> stores an Object[] internally. Construction and allocation are inseparable because the JVM manages object lifetimes and needs to track every live reference for garbage collection. You cannot hold raw memory that the GC doesn’t know about. Python’s list is similar: it stores an array of PyObject* pointers, each pointer a separately heap-allocated object managed by reference counting. Neither language has the concept of constructing an object at an arbitrary address, because their memory models don’t allow it. The three-pointer layout with its split between allocation and construction is only necessary, and only meaningful, in languages where you control that boundary yourself.

Growth Factors and the Memory Reuse Argument

The C++ standard requires amortized O(1) push_back but leaves the growth factor unspecified. The major implementations diverge:

GCC libstdc++: 2x
Clang libc++: 2x
MSVC STL: 1.5x
Facebook’s folly::fbvector: 1.5x

Rust’s Vec<T> also doubles, rounding to allocation size classes in practice. Java’s ArrayList uses approximately 1.5x (newCapacity = oldCapacity + (oldCapacity >> 1)). Python’s list uses a flatter formula that adds a fixed increment for small sizes and scales sublinearly at larger ones.

The theoretical case for 1.5x (or more precisely, for any factor below the golden ratio φ ≈ 1.618) is rooted in allocator behavior. With 2x growth, the allocation sequence for a vector reaching capacity 32 is: 1, 2, 4, 8, 16, 32. The sum of all freed buffers before the final allocation is 1+2+4+8+16 = 31, which is always one less than the current request. The freed memory can never satisfy the next allocation; every new allocation requires fresh pages from the OS. With a factor below φ, freed blocks eventually accumulate enough to cover the next request, allowing a well-implemented allocator to recycle them. Andrei Alexandrescu’s work on folly::fbvector formalizes this analysis and shows that with jemalloc, 1.5x sometimes allows in-place extension via realloc at no copying cost.

In practice, if you’re doing growth-heavy work, calling reserve(n) before the loop makes the growth factor irrelevant. Two reallocations per order of magnitude versus three makes a measurable difference only at large scale with many allocations, not in typical application code.

The Strong Exception Guarantee and `move_if_noexcept`

This is where C++ vector implementation diverges most sharply from equivalent containers in other languages.

push_back is required to provide the strong exception guarantee: if it throws, the vector is left unchanged. During a reallocation, the implementation must transfer existing elements to a new buffer. If it moves elements and a move constructor throws midway through, the old buffer has been partially destroyed and the new one is partially populated. The vector cannot recover. Rolling back is impossible because some source objects no longer exist.

The solution is std::move_if_noexcept:

for (size_t i = 0; i < _size; ++i)
    new (new_buf + i) T(std::move_if_noexcept(_begin[i]));

If T’s move constructor is noexcept, this uses move semantics. If not, it falls back to copy construction, which cannot damage the original elements. The old buffer remains intact as a rollback point if anything throws.

The consequence is stark and produces no compiler warning: a class with a move constructor not marked noexcept will be copied, not moved, during every vector reallocation. Ten thousand elements means ten thousand deep copies where pointer steals would suffice. The fix is to mark move constructors noexcept when they genuinely cannot throw, as Scott Meyers covers in Effective Modern C++, Item 14:

class MyResource {
public:
    MyResource(MyResource&&) noexcept = default;
    MyResource& operator=(MyResource&&) noexcept = default;
};

Rust handles this problem differently. All moves in Rust are bitwise copies (memcpy-based), and the type system statically prevents using a moved-from value. Panics in Rust are not exceptions that propagate through arbitrary stack frames; they unwind or abort and do not interleave with object construction in the way C++ exceptions do. Vec::push treats the element transfer as a bulk memory operation. There is no equivalent of “move constructor throws partway through reallocation” because Rust moves are not functions that run user-defined code.

Java and Python have no equivalent mechanism at all. Array element assignment either succeeds or raises a reference exception that doesn’t corrupt existing elements; there’s no user-defined construction that could throw an arbitrary exception mid-copy.

The Aliasing Edge Case

One correctness requirement most tutorials omit: v.push_back(v[0]) must work correctly even when it triggers a reallocation. The argument is a reference into the vector’s own buffer. A naive implementation would allocate a new buffer, free the old one, then attempt to copy from v[0], which now points to freed memory.

libc++ handles this by detecting whether the argument’s address falls within [begin_, end_) before any allocation and making a local copy if so. It’s a runtime check, and it’s genuinely necessary.

Rust’s borrow checker eliminates this class of bug at compile time. You cannot hold a shared reference into a Vec<T> and simultaneously call a mutating method on it. v.push(v[0]) is a compile error unless T: Copy, in which case it works by value. The ownership model converts a runtime correctness requirement into a compile-time constraint. C++ library authors implement the check; Rust authors get the guarantee structurally.

This is not a theoretical difference. Real codebases that implement vector-like containers in C++ have shipped with this bug. It’s a case where the source language’s design directly determines whether library correctness is a matter of implementation discipline or type-system enforcement.

What Grows Out of the Constraints

A working vector is not hard to write. One that correctly handles exception safety, the move_if_noexcept rule, aliased self-references, stateful allocators with their propagation traits, and constexpr construction (added in C++20 via std::construct_at) requires substantially more work. Reading the actual libstdc++ source or libc++ source after writing your own version clarifies why each piece of apparent complexity is there. None of it is arbitrary.

The three-pointer layout is convergent across C++ and Rust because the requirements lead there. The separation of allocation from construction is unavoidable if you control object lifetimes. The growth factor is a policy decision with real but bounded impact. The move_if_noexcept mechanism exists because C++ has to reconcile performance, safety, and user-defined constructors that run arbitrary code at arbitrary points in a reallocation sequence.

Rust solved some of these problems through ownership semantics. Java and Python traded others away through garbage collection and uniform reference semantics. C++ kept the full surface area, and implementing vector<T> yourself is one of the cleaner ways to understand what you’re working with when you reach for it.