· 7 min read ·

Push Back a Thousand Times, Reallocate Ten: What's Actually Happening Inside std::vector

Source: isocpp

There is a gap between what std::vector looks like at the call site and what it is doing internally. You call push_back() a thousand times. The vector reallocates perhaps ten. From the outside, this looks like magic. From the inside, it is four interlocking design decisions that trade off memory, performance, and exception safety against each other in ways that are not obvious until you hit them in production.

Gracjan Olbinski’s walkthrough on isocpp.org names these four mechanisms clearly. What I want to do here is go a layer deeper on each one, pull in cross-language comparisons, and connect them to the broader design philosophy that explains why C++ made the choices it did.

Exponential Growth and Amortized O(1)

The core idea is familiar: when a vector runs out of capacity, it does not grow by one slot. It multiplies its capacity by some factor. The mathematical payoff is that the total amount of copying work across all reallocations forms a geometric series.

If the growth factor is r and the final size is n, the total elements copied across all prior reallocations is:

n + n/r + n/r² + ... = n * (r / (r - 1))

This converges to a constant multiple of n regardless of how many insertions were made. Divide by n insertions and you get O(1) average cost per insertion. The proof does not depend on which growth factor you pick, as long as it is greater than 1. The factor affects the constant, not the complexity class.

This is why you should use reserve() when you know the eventual size up front. If you are loading ten thousand records from a database into a vector, calling reserve(10000) collapses all those reallocations to zero. The amortized guarantee is a fallback for when you cannot predict size, not an excuse to skip preallocating when you can.

The Growth Factor Wars

Every standard library implementation has to pick a growth factor, and they do not agree.

  • GCC’s libstdc++ uses 2x
  • Clang’s libc++ uses 2x
  • MSVC’s implementation uses 1.5x
  • Java’s ArrayList uses 1.5x (oldCapacity + (oldCapacity >> 1))
  • Rust’s Vec uses 2x for small sizes
  • Go’s slice uses 2x for capacity under 1024, then roughly 1.25x beyond that

The interesting case is Facebook’s folly::fbvector, which explicitly documents its choice of 1.5x with a mathematical argument about heap memory reuse.

The argument goes like this. When a vector reallocates, it frees its old buffer and allocates a new, larger one. If your growth factor is 2x, the new buffer is always larger than the sum of all previously freed buffers. The allocator can never reuse those freed chunks for the new allocation because none of them are big enough, individually or combined. You are perpetually forcing the allocator to find fresh memory.

With a growth factor below the golden ratio (approximately 1.618), the math changes. The cumulative size of all freed buffers eventually exceeds the size of the next allocation. In theory, a smart allocator can pack the new buffer into the space left by previous ones. At Facebook’s scale, where heap fragmentation has real cost, this matters.

In practice, whether this benefit materializes depends heavily on allocator behavior, memory alignment, and whether freed blocks are adjacent. The theoretical argument is sound; the real-world payoff is situational. But the 1.5x camp has a principled reason for its choice, not just conservatism about memory waste.

For most application code, the difference between 1.5x and 2x is invisible. For systems managing many vectors simultaneously under memory pressure, it is worth understanding.

Contiguity and Cache

The reason std::vector dominates std::list for nearly every workload is not algorithmic; it is physical. A vector stores its elements in a single contiguous allocation. A linked list stores each element in a separate heap allocation, with a pointer connecting them.

Modern CPUs do not fetch individual bytes. They fetch cache lines, typically 64 bytes. When you iterate over a vector of 32-bit integers, you pull 16 integers into cache with a single miss. When you iterate over a linked list, each node access likely triggers a separate cache miss because the nodes are scattered across the heap.

The empirical gap is large. Sequential iteration over a vector runs in roughly 0.5 to 1 nanosecond per element on modern hardware. The same iteration over a linked list can cost 10 to 50 nanoseconds per element once the list is large enough that nodes no longer fit in L1 or L2 cache. The algorithmic complexity of both is O(n), but the constant factor differs by an order of magnitude.

std::deque sits between them. It allocates fixed-size chunks and stores chunks as an array of pointers. Sequential access requires traversing chunk boundaries, which adds some cache pressure, but far less than a linked list. The tradeoff is that deque supports O(1) insertion at both ends without shifting, while vector’s push_front is O(n).

The practical rule that has held up across a lot of codebases: default to vector. Switch to deque when you genuinely need efficient front insertion. Reach for list only when you need iterator stability across insertions and your profiler confirms that the cache penalty is acceptable.

The noexcept Trap

This is the mechanism most likely to surprise you in production.

When a vector reallocates, it has to move all its existing elements to the new buffer. In C++11 and later, moving is usually much cheaper than copying; for types like std::string or std::vector<T>, a move is O(1) while a copy is O(n).

But std::vector will not move your elements during reallocation unless the move constructor is marked noexcept. If it is not, the vector falls back to copying.

The reason is exception safety. The vector guarantees that if push_back throws, the vector is left unchanged, the strong exception guarantee. During reallocation, the vector is partway through transferring elements to a new buffer. If a move constructor throws midway through this transfer, some elements have been moved out of the old buffer and some have not. The old buffer is in a partially destroyed state. There is no way to roll back. The strong guarantee is broken.

Copying does not have this problem. If a copy constructor throws, the old buffer is untouched. The vector can release the partial new buffer and remain valid.

So the implementation uses std::move_if_noexcept, which returns an rvalue reference if the move constructor is noexcept, and an lvalue reference otherwise:

// Simplified illustration of reallocation behavior
for (auto& element : old_buffer) {
    allocator_traits::construct(
        alloc,
        new_buffer + i,
        std::move_if_noexcept(element)  // move only if safe
    );
}

The practical trap: if you write a type with a move constructor and forget noexcept, every vector reallocation silently copies instead of moves. For types with cheap copies (small structs, trivially copyable types) this is harmless. For types managing large resources, it is a performance cliff that does not show up until the vector grows past its initial capacity.

struct Expensive {
    std::vector<std::string> data;  // contains thousands of strings

    // No noexcept -- vector will copy this instead of moving during reallocation
    Expensive(Expensive&& other) {
        data = std::move(other.data);
    }

    // Fix: mark it noexcept
    // Expensive(Expensive&& other) noexcept {
    //     data = std::move(other.data);
    // }
};

std::vector<Expensive> vec;
vec.push_back(Expensive{/* ... */});
// Each reallocation now deep-copies the data member

The compiler will not warn you. The code is correct; it just does unnecessary work. Adding noexcept to move constructors is one of the highest-leverage mechanical changes you can make to performance-sensitive code.

As a side note, compiler-generated move constructors are implicitly noexcept if all member move constructors are noexcept. If your type is composed entirely of standard types with well-behaved move semantics, the default-generated move constructor will already satisfy the requirement. The trap closes on types that define their own move constructors manually and forget the annotation.

The Sum of Four Parts

These four mechanisms are not independent. Exponential growth reduces reallocation frequency, which reduces how often the noexcept decision gets made. The growth factor determines how much memory gets freed and potentially reused. Contiguity provides cache performance regardless of how the growth happened.

What is worth taking away from all of this is not just the mechanism list, but the design philosophy underneath it. std::vector makes strong guarantees (amortized O(1) push_back, strong exception safety, contiguous layout) and implements them with deliberate, mathematically motivated tradeoffs. The noexcept requirement for moves is not an oversight; it is a conservative choice that preserves a correctness guarantee at some performance cost, and it transfers the responsibility to the type author.

For most code, understanding these mechanisms is enough to avoid the common pitfalls: skipping reserve() when size is known, using list out of habit, and writing move constructors without noexcept. For library authors or performance-critical systems work, the growth factor arithmetic and memory reuse arguments are worth understanding in detail before you reach for a custom container.

The interface is simple. The contract behind it is not.

Was this interesting?