
Rolling Your Own vector<T>: Where the Correctness Traps Hide

Source: isocpp

The basic skeleton comes together in an afternoon: three data members, a few constructor overloads, a destructor. Writing it is a useful exercise, which Quasar Chunawala’s article on isocpp.org explores well. What the exercise reveals, as the edge cases accumulate, is how much the standard library quietly handles for you.

The Memory Model

The first mistake in a naive implementation is allocating with new T[n]. That default-constructs n elements, which is wrong; you want uninitialised memory that you populate on demand. The correct approach uses raw allocation followed by placement new:

template <typename T>
class vector {
    T*          data_;
    std::size_t size_;
    std::size_t capacity_;
};
// Allocate raw memory, no construction
data_ = static_cast<T*>(::operator new(capacity_ * sizeof(T)));

// Construct in place when needed
new (data_ + size_) T(value);
++size_;

If you want to support custom allocators, which the standard requires, you go through std::allocator_traits instead:

using Alloc  = std::allocator<T>;
using Traits = std::allocator_traits<Alloc>;

Alloc alloc;
T* data = Traits::allocate(alloc, n);          // raw memory
Traits::construct(alloc, data + i, value);     // construct in place
Traits::destroy(alloc, data + i);              // explicit destructor call
Traits::deallocate(alloc, data, n);            // free raw memory

The allocator_traits layer exists so containers can work with stateful allocators, pool allocators, and arena allocators without changing their core logic. Skipping it means your vector only works with the default allocator, which is fine for a learning exercise but not for production code.

Growth Strategy

When push_back exhausts capacity, the vector must grow. The strategy must be multiplicative to achieve amortized O(1) insertions. Additive growth, growing by a fixed amount each time, turns a sequence of n push_back calls into O(n²) total work.

The growth factor is where mainstream implementations diverge. GCC’s libstdc++ and LLVM’s libc++ both double capacity on reallocation (2x). MSVC’s STL uses 1.5x. Facebook’s folly::fbvector documents the 1.5x choice explicitly: with 2x growth, each new allocation is larger than the sum of all previous allocations combined, so freed memory can never satisfy the next reallocation request. With a factor below the golden ratio (φ ≈ 1.618), freed blocks eventually accumulate enough total size to cover the next allocation. The 1.5x factor is a practical approximation of this property and reduces fragmentation in long-lived vectors with many reallocations.

For practical purposes: a 2x factor on a vector growing from capacity 1 to 1024 causes 10 reallocations. A 1.5x factor causes around 17, depending on how the implementation rounds. The 2x choice reallocates less often; the 1.5x choice is friendlier to the allocator over longer lifetimes. For most application code, the right answer is reserve(n) before a growth loop, which reduces the reallocation count to zero regardless of factor.

Exception Safety and move_if_noexcept

Reallocation contains the most subtle correctness requirement in the entire implementation. A naive version looks like this:

void reallocate(std::size_t new_cap) {
    T* new_data = static_cast<T*>(::operator new(new_cap * sizeof(T)));
    for (std::size_t i = 0; i < size_; ++i) {
        new (new_data + i) T(std::move(data_[i]));  // move into new buffer
        data_[i].~T();                               // destroy old
    }
    ::operator delete(data_);
    data_     = new_data;
    capacity_ = new_cap;
}

The problem surfaces when T’s move constructor can throw. If it throws on element 5 of 10, elements 0 through 4 have already been destroyed in the old buffer and now exist only in the new one, which is then abandoned; the vector still claims ten live elements in the old buffer. The container is in an unrecoverable state. The strong exception guarantee, the promise that a failed operation leaves the container unchanged, is broken.

The standard’s solution is std::move_if_noexcept. If T’s move constructor is noexcept (or T is not copy-constructible, in which case moving is the only option), it moves; otherwise it copies:

for (std::size_t i = 0; i < size_; ++i) {
    new (new_data + i) T(std::move_if_noexcept(data_[i]));
}

With copying, a throw leaves the original buffer intact and the strong guarantee holds. With a noexcept move, there is nothing to handle. The dispatch happens at compile time through std::is_nothrow_move_constructible<T>. GCC’s implementation uses this in _M_realloc_insert via __and_ metafunction chains. The practical implication for type authors: marking move constructors noexcept is not just an optimisation hint. It enables containers to use moves during reallocation while preserving the strong guarantee. Types that omit noexcept on their move constructors force containers to copy, which can be significantly more expensive for types with deep ownership.

The Rule of Five

A vector owns heap memory, so the compiler-generated copy constructor and copy assignment are wrong by default. Both would shallow-copy the pointer and produce a double-free on destruction. All five special member functions need explicit definitions.

The move constructor is straightforward: steal the resource and null out the source.

vector(vector&& other) noexcept
    : data_(other.data_), size_(other.size_), capacity_(other.capacity_)
{
    other.data_     = nullptr;
    other.size_     = 0;
    other.capacity_ = 0;
}

Copy assignment has two considerations: self-assignment must be handled explicitly, and existing capacity should be reused when sufficient.

vector& operator=(const vector& other) {
    if (this == &other) return *this;

    for (std::size_t i = 0; i < size_; ++i)
        data_[i].~T();

    if (capacity_ < other.size_) {
        ::operator delete(data_);
        capacity_ = other.size_;
        data_ = static_cast<T*>(::operator new(capacity_ * sizeof(T)));
    }

    size_ = other.size_;
    for (std::size_t i = 0; i < size_; ++i)
        new (data_ + i) T(other.data_[i]);

    return *this;
}

The copy-and-swap idiom simplifies this and restores the strong exception guarantee, since the copy completes before the destination is touched, but at the cost of always allocating, even when the destination has sufficient capacity. For code that reassigns vectors in tight loops, the direct approach above is worth the extra lines.

Proper Destruction

Destruction is easy to get wrong in a specific way: you cannot use delete[] because you never used new[]. The correct sequence destroys each element explicitly, then releases the raw memory:

~vector() {
    for (std::size_t i = 0; i < size_; ++i)
        data_[i].~T();
    ::operator delete(data_);
}

For trivially destructible types like int or plain structs, the compiler eliminates the destruction loop entirely. You still write it because the template must handle the general case.

What the Standard Adds

Chunawala’s article provides a solid foundation for a dev-namespace vector. A standard-conforming implementation adds several requirements on top.

Allocator propagation. std::allocator_traits defines propagate_on_container_copy_assignment, propagate_on_container_move_assignment, and propagate_on_container_swap. With stateful allocators, getting these wrong causes undefined behaviour. The standard specifies precisely when an allocator must be copied or moved alongside the container’s data.

Exact iterator invalidation rules. push_back invalidates all iterators, pointers, and references if reallocation occurs; without reallocation, only the past-the-end iterator is invalidated. insert invalidates iterators from the insertion point to the end. A conforming implementation must guarantee these semantics, not approximate them.

constexpr support (C++20). std::vector became usable in constant expressions in C++20, requiring constexpr allocate and deallocate, plus std::construct_at for placement new in constant evaluation contexts. The machinery is non-trivial and touches the constant evaluation rules directly.

Small Buffer Optimisation

The standard does not require small buffer optimisation for vector, and mainstream implementations avoid it. SBO would store a small number of elements inline in the vector object itself, but this breaks address stability: after a move, elements would live at a different address than before. std::string accepts this trade-off; std::vector does not.

The pattern exists under other names. llvm::SmallVector<T, N> stores the first N elements in inline storage and spills to the heap beyond that. absl::InlinedVector<T, N> from Abseil works similarly. Both document the address-stability caveat in their interfaces.

The Value of the Exercise

Writing a vector from scratch is not about replacing std::vector. It is a controlled environment for encountering the real constraints that shape generic container design: how exception safety interacts with move semantics, why allocator_traits exists as an abstraction layer, and what iterator invalidation guarantees actually cost to implement correctly. The standard library’s std::vector is what it is because each of those constraints has been resolved in the most general and portable way available. Understanding how they were resolved is worth considerably more than the exercise itself.
