· 6 min read ·

Rolling Your Own vector: Growth Factors, Exception Safety, and the noexcept Move Rule

Source: isocpp

Quasar Chunawala’s walkthrough on isocpp.org covers the fundamentals of writing your own vector<T>: allocate a buffer, track size and capacity, grow when needed, clean up on destruction. That is the right place to start. But the genuinely interesting parts of std::vector’s design are not in the basic resize loop. They are in three decisions that the standard delegates to implementations, and one decision that has sweeping consequences for user-defined types.

The Layout

Every major standard library implementation stores a vector using three pointers internally:

T* _begin;     // start of allocated buffer
T* _end;       // one past the last constructed element
T* _end_cap;   // one past the end of allocated memory

size() is _end - _begin. capacity() is _end_cap - _begin. This layout means sizeof(std::vector<int>) is exactly 24 bytes on a 64-bit platform when using a stateless allocator. All three major implementations, libstdc++, libc++, and MSVC STL, use this structure, though they differ in how they store the allocator alongside it.

libstdc++ makes the internal _Vector_impl struct inherit from the rebound allocator type, using empty base optimization (EBO) to avoid paying extra space for a stateless allocator. libc++ uses a __compressed_pair for the capacity pointer and allocator. MSVC STL uses its own _Compressed_pair template. The effect is the same: when you use std::allocator<T>, the allocator takes up zero bytes in the vector object.

The allocator is not an afterthought. The standard requires that a vector use its allocator for every allocation and deallocation, which is why homegrown implementations that call new[] directly are subtly wrong when someone passes a custom allocator. For a learning implementation under your own namespace, new and delete are fine; for anything that has to work with memory pools or arenas, you need std::allocator_traits.

The Growth Factor Debate

When a vector runs out of capacity, it must allocate a larger buffer, move or copy existing elements into it, and release the old one. The multiplicative factor by which it grows is not specified by the standard. The standard only requires amortized O(1) push_back.

libstdc++ and libc++ both use a factor of 2. MSVC STL uses 1.5. Facebook’s folly::fbvector also uses 1.5, and the comment block in that header is worth reading in full.

The argument for 2x is simplicity. If old capacity is n, new capacity is 2n. The math for proving amortized O(1) is a straightforward geometric series: the total work across all reallocations is bounded by 1 + 2 + 4 + ... + n/2 < n.

The argument for 1.5x is more subtle and concerns allocator memory reuse. Consider what happens with 2x growth: you allocate blocks of size 1, then 2, then 4, then 8, then 16. When you free the 8-element block and allocate the 16-element block, the sum of all previously freed blocks (1 + 2 + 4 + 8 = 15) is just barely less than 16. The freed memory can never be reused for the next allocation, because the next allocation always exceeds the total freed so far.

With a growth factor below the golden ratio (approximately 1.618), this eventually reverses. Freed blocks accumulate to a size large enough that the allocator can satisfy the next request from previously returned memory. The memory pool stays warmer, fragmentation decreases, and in practice this matters on workloads that create and destroy many large vectors.

Andrei Alexandrescu’s 2015 C++Now talk, “Two improvements to std::vector,” formalizes this analysis and forms the basis for folly::fbvector. Whether 1.5x meaningfully outperforms 2x in your specific workload depends on your allocator and access patterns, but the theoretical grounding for 1.5x is stronger than it first appears. For a homegrown implementation, pick 2x; it is simpler and correct, and if you need to optimize for allocator reuse, measure before changing anything.

Exception Safety and the noexcept Move Rule

This is where vector implementation gets genuinely difficult, and where most learning implementations quietly cut corners.

The C++ standard requires that push_back offer the strong exception guarantee: if it throws, the vector is unchanged. Implementing this correctly requires careful ordering of operations during reallocation:

  1. Allocate the new buffer. If this throws, nothing has changed.
  2. Transfer existing elements to the new buffer. If this throws mid-way, the old buffer still holds all original elements intact.
  3. Construct the new element at the end of the transferred elements. If this throws, roll back.
  4. Swap internal pointers (noexcept, cannot fail).
  5. Destroy the old buffer.

Steps 1 through 3 are the hard part, and step 2 is where user-defined types complicate the picture. During reallocation, the vector must decide whether to move or copy existing elements into the new buffer. If it moves elements and a move constructor throws halfway through, the source elements have been partially destroyed and the destination is partially constructed. There is no clean recovery, and the strong guarantee is lost.

The solution in the standard is std::move_if_noexcept: the vector moves elements only if the element type’s move constructor is declared noexcept. If the move constructor might throw, the vector falls back to copying.

// Simplified illustration of what happens during reallocation
if constexpr (std::is_nothrow_move_constructible_v<T>) {
    std::uninitialized_move(old_begin, old_end, new_begin);
} else {
    std::uninitialized_copy(old_begin, old_end, new_begin);
}

The practical consequence is significant. A type like this:

struct Widget {
    std::vector<int> data;
    Widget(Widget&& other) { /* not noexcept */ }
};

will be copied, not moved, every time a vector<Widget> reallocates. On a vector of 10,000 Widgets, that is 10,000 deep copies where 10,000 pointer steals could have sufficed. Marking the move constructor noexcept is not just a style choice; it determines which code path the standard library takes on your behalf.

libstdc++ implements this in __uninitialized_move_if_noexcept_a. libc++ uses __construct_range_forward. Both check is_nothrow_move_constructible at compile time and dispatch accordingly. Your learning implementation should do the same, even if the template machinery feels like overkill initially.

What the Standard Does Not Mandate

There is no small buffer optimization (SBO) in standard vector. SBO would mean storing short sequences inline in the object itself, avoiding heap allocation for small sizes. The problem is that SBO complicates the allocator interface significantly, and the design of iterator stability makes it awkward to have the buffer location change between “small” and “large” states. If you want SBO, llvm::SmallVector<T, N> and boost::container::small_vector<T, N> are the established options.

The growth factor is unspecified. The iterator invalidation rules, however, are specified in detail: push_back invalidates all iterators if reallocation occurs, only the past-the-end iterator otherwise. insert at an arbitrary position invalidates everything from that position forward. erase invalidates everything from the erased position forward. reserve invalidates all iterators if it increases capacity. These rules exist because the contiguous-memory guarantee means any reallocation requires a new base address, and the compiler cannot track those pointer updates for you.

What You Learn by Implementing It

Writing a vector implementation, even a simplified one, forces you to confront the C++ object model directly. You have to think about the difference between allocated memory and constructed objects. operator new gives you raw bytes; placement new constructs an object in those bytes; explicit destructor calls destroy without releasing memory. The separation between allocation and construction is something that higher-level abstractions hide; writing vector makes it unavoidable.

The noexcept move rule, the allocator propagation traits, the three-pointer layout, the growth factor trade-offs: none of these come up when you use std::vector day to day. Writing your own version is how you build intuition for why std::vector behaves the way it does when you swap two vectors holding different allocators, or why marking a move constructor noexcept can have measurable performance consequences on vector-heavy code.

Chunawala’s article is a good foundation. The interesting work starts when you try to match the standard’s guarantees rather than just its interface.

Was this interesting?