One Problem, Five Answers: What Dynamic Arrays Look Like Across Languages

Implementing std::vector from scratch, as Quasar Chunawala’s walkthrough on isocpp.org explores, clarifies how much complexity the standard library absorbs quietly. But the exercise also raises a question: is that complexity necessary, or is it a C++-specific choice? Every mainstream language has a growable array, and comparing the implementations reveals what trade-offs each language made to get there.

The Universal Skeleton

Every dynamic array implementation stores the same three things: a pointer to an allocated buffer, how many elements have been constructed, and how many slots the buffer holds. C++ names these begin_, end_, and end_of_capacity_ and stores them as raw pointers. Go uses a struct literal with a pointer, length, and capacity as integers. Rust stores a NonNull<T> plus two usize fields. Python and Java tuck these into runtime-managed object headers.

The growth factor — how much extra capacity to claim on reallocation — varies considerably, and not arbitrarily.

Python: Minimize Wasted Space

CPython’s list stores an array of PyObject* pointers. Everything goes into that array as a pointer: integers, strings, custom objects. There is no value storage; the list itself is always an array of references.

The growth formula in Objects/listobject.c:

new_allocated = newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);

For large lists, newsize >> 3 is 12.5% of the current size, so the effective growth factor settles near 1.125x. For small lists, the fixed addend (3 or 6 slots) dominates. The result is a growth factor that starts high for tiny lists and asymptotically approaches 1.125x for large ones.

The small factor minimizes wasted capacity, which matters when every list element already points to a separately heap-allocated Python object. The memory overhead is already substantial. NumPy’s ndarray side-steps this by storing typed values contiguously in a C array, bypassing the Python object system entirely. The performance difference for numerical work comes almost entirely from that distinction.

Java: Boxing, and the Long Road Away From It

java.util.ArrayList<E> has the same structural problem: it stores an Object[] array, so primitive types require boxing. Storing int in an ArrayList<Integer> wraps each integer in a heap-allocated object with a header and reference count. The ArrayList holds references to those objects.

The growth formula is 1.5x:

int newCapacity = oldCapacity + (oldCapacity >> 1);

Bulk element transfer during reallocation uses System.arraycopy, which the JIT compiles to a native memory copy on hot paths. For reference types, a reallocation moves pointers rather than values, so the copy is cheap regardless of what the objects themselves contain.

Project Valhalla, still in active preview as of early 2025, targets this boxing overhead. Value classes, when finalized, would allow an ArrayList<int> to store raw integers in the backing array. Most of the performance gap between Java collections and C++ containers on numeric workloads traces to boxing; eliminating it would close that gap substantially.

Go: Slices as Values

Go’s slice is a three-word value:

type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

This mirrors C++ vector’s three-pointer representation almost exactly, with integer counts instead of pointer offsets. The key difference is in how growth is triggered. append is a language built-in that returns a new slice header. If the existing backing array has remaining capacity, append returns a slice header pointing to the same array with len incremented. If not, append allocates a new array, copies all elements, and returns a header to the new backing store.

The caller must use the return value:

s = append(s, value)  // correct: new header captures any reallocation
append(s, value)      // wrong: new header discarded, growth is invisible

Go 1.18 changed the growth formula to smooth over the old sharp threshold at 1024 elements. The previous formula doubled capacity below 1024 and grew by 25% above it, producing a jarring transition. The new formula blends between the two, producing a growth rate that decreases gradually from around 2x for small slices toward 1.25x for large ones.

The slice-as-value design sidesteps iterator invalidation in a meaningful way. There are no iterators in Go; you index into the slice header you hold. If append reallocates, your local slice header is not updated — but that header is a value you own, not a pointer into shared state. Whether this is better or worse than C++‘s approach depends on whether you need shared mutable access to a growing sequence; Go’s model makes that pattern explicit and manual.

Go stores values directly in the backing array for value types (structs, integers, slices of structs). There is no boxing for non-interface types.

Rust: Moves Are Always Safe

Rust’s Vec<T> stores a NonNull<T> pointer rather than a raw *mut T. The NonNull wrapper communicates to the compiler that the pointer is never null, enabling downstream optimizations in types like Option<Vec<T>>, which can use the null bit pattern for None rather than a separate discriminant.

A zero-capacity Vec stores NonNull::dangling(), a valid but non-dereferenceable pointer. This avoids a null-check branch on the empty-vector fast path.

The growth factor is 2x, matching GCC’s libstdc++ and Clang’s libc++.

The structural difference from C++ is in what move means. In Rust, every move is a bitwise copy of the source value, after which the source is considered uninitialized and its destructor is not called. Moving a String into a new buffer is always equivalent to memcpy followed by forgetting the source. No special notation is required; this is the only kind of move Rust has.

This eliminates the std::move_if_noexcept problem entirely. C++ vectors must decide at compile time whether to move or copy each element during reallocation, depending on whether the move constructor is declared noexcept. A type without noexcept on its move constructor causes the vector to copy all elements on every reallocation, silently, with no diagnostic. In Rust, the equivalent path always moves via memcpy.

The borrow checker enforces iterator invalidation statically. You cannot hold a reference into a Vec and also call push on it; the borrow checker rejects the combination at compile time. In C++, v.push_back(v[0]) when reallocation occurs is undefined behavior that the compiler does not diagnose.

Rust’s custom allocator support for Vec<T, A> remains a nightly feature as of 2025, making it less mature than C++‘s allocator model on this dimension.

C++: The Full-Featured Version

C++ vector accumulates more machinery than any of the above because it accommodates requirements the others do not have: custom allocators wired through std::allocator_traits, the move_if_noexcept protocol, the strong exception guarantee, trivially copyable fast paths using memcpy, and the full set of allocator propagation traits for copy, move, and swap operations.

None of this complexity is accidental. It reflects a language where you can have:

Types whose move constructors can throw
Custom memory resources (pool allocators, arena allocators, std::pmr::polymorphic_allocator)
Types that are neither trivially copyable nor trivially movable but have efficient, side-effect-free moves
Stateful allocators where two allocators of the same type may not be interchangeable

Go and Rust avoid the move_if_noexcept problem by design: Go moves are pointer copies (reference semantics), Rust moves are always bitwise. Python and Java avoid it via the garbage collector. C++ is the only language in this comparison where a value type can have a throwing move constructor, and std::vector must handle that case correctly while still being efficient when the move constructor does not throw.

The allocator model is the other distinctive element. std::allocator_traits lets a vector work unchanged with a global new/delete, a per-thread pool, a fixed-size arena, or anything else that satisfies the allocator concept. No other language in this comparison offers that abstraction at this level of generality. Java’s garbage-collected heap and Go’s runtime allocator are not user-extensible. Rust’s Allocator trait for Vec is still nightly-only.

What the Comparison Shows

The three-word representation (pointer, length, capacity) is universal because it is the minimum state required for O(1) amortized appends without scanning. The growth factor ranges from 1.125x (Python) to 2x (Rust, GCC/Clang C++), reflecting different priorities: Python minimizes wasted space because every element already carries object overhead; Rust and C++ minimize reallocation frequency.

The deepest differences come from the ownership model. Go makes iterator invalidation a non-issue by treating slice headers as values. Rust prevents it at compile time. C++ specifies it precisely and trusts the programmer to follow the rules. Python and Java sidestep it via garbage collection and reference semantics.

That C++ vector is the most complex of the five is not a criticism. The exercise Chunawala walks through is useful precisely because C++ exposes the machinery that other languages abstract away or prohibit. Understanding why the noexcept annotation matters, why allocation and construction are separate operations, and why the growth factor has mathematical consequences teaches something about memory management that holds across all five languages, even in the ones where you never see it directly.