Why C++ Has Fifteen Ways to Filter a Container

Bartlomiej Filipek catalogued fifteen distinct approaches to filtering a container in modern C++. Fifteen. In most languages, filtering is a one-liner with a single idiomatic path. The Python programmer writes a list comprehension. The Rust programmer calls .retain() for in-place mutation or .filter().collect() for a new collection. The Java programmer uses .stream().filter().collect(). These languages have one or two canonical forms, and everyone uses them.

C++ has fifteen because every generation of the standard added a new solution without removing the old ones. That history is worth understanding, because the fifteen approaches are not equally valid today. They are a stratigraphic record, and reading that record tells you both how to write better modern C++ and why the language is the way it is.

The Original Problem: Algorithms Don’t Know About Containers

The STL was designed around a strict separation of concerns. Algorithms operate on iterator pairs. Containers own data and provide iterators. These two worlds are deliberately kept apart, and that separation is what makes STL algorithms composable across container types.

The problem is that removing an element from a container is inherently a container-level operation. An algorithm that receives two iterators has no way to call erase() on the container, because it doesn’t have a reference to the container. So std::remove_if does not actually remove anything. It compacts the range, shifting the surviving elements to the front in their original order and leaving the elements at the back in a valid-but-unspecified state, then returns an iterator to the new logical end. (This is not a partition in the std::partition sense: the “removed” values are not guaranteed to survive in the tail.)

To actually shrink the container, you need a second call:

v.erase(
    std::remove_if(v.begin(), v.end(), [](int x){ return x % 2 == 0; }),
    v.end()
);

Scott Meyers documented this in Effective STL in 2001. It became the canonical pattern for in-place filtering in C++98 and C++11. It works, it’s efficient (a single pass through the data, then an O(1) size adjustment), and it is deeply ugly. The two-call dance is non-obvious to anyone who hasn’t been told why it exists, and the name remove_if is actively misleading.
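The difference between the two steps is easy to check. The following is a minimal sketch; the helper names compact_only and erase_remove are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs only std::remove_if and reports how many
// elements survived. The vector's size() is NOT changed by this call.
std::size_t compact_only(std::vector<int>& v) {
    auto new_end = std::remove_if(v.begin(), v.end(),
                                  [](int x) { return x % 2 == 0; });
    return static_cast<std::size_t>(new_end - v.begin());
}

// Hypothetical helper: the full idiom. remove_if compacts the survivors,
// then erase() destroys the leftover tail and shrinks the vector.
void erase_remove(std::vector<int>& v) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int x) { return x % 2 == 0; }),
            v.end());
}
```

After compact_only, the vector still reports its old size; only the first few elements (up to the returned count) are meaningful. After erase_remove, the size matches the number of survivors.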

C++11: Better Syntax, Same Patterns

C++11 brought lambdas, which made the predicate syntax tolerable, and std::copy_if for building a new container from a filtered range:

std::vector<int> dst;
std::copy_if(src.begin(), src.end(), std::back_inserter(dst),
             [](int x){ return x % 2 == 0; });

This is the eager copy approach. It creates a new container containing only the passing elements. It’s straightforward, readable, and correct. If you want to preserve the original and produce a filtered result, this is the right tool for C++11 code.
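Wrapped in a function, the same idea might look like the sketch below. The name copy_evens is hypothetical, and the reserve() call is an optional addition that trades some memory for a single allocation.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns a new vector holding the even values of src.
// src is taken by const reference, so the original is left untouched.
std::vector<int> copy_evens(const std::vector<int>& src) {
    std::vector<int> dst;
    dst.reserve(src.size());  // optional: one allocation, sized for the worst case
    std::copy_if(src.begin(), src.end(), std::back_inserter(dst),
                 [](int x) { return x % 2 == 0; });
    return dst;
}
```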

The erase-remove idiom and copy_if cover the two fundamental operations: mutate in place, or produce a new collection. Between these two, you have everything you need. For most C++11 codebases, these are the only two approaches worth reaching for.

C++20: Two Important Additions

std::erase_if finally wraps the erase-remove idiom into a single, correctly named function:

std::erase_if(v, [](int x){ return x % 2 == 0; });
// returns the count of erased elements

For associative containers (std::map, std::set), erase_if iterates and calls erase per element, because std::remove_if cannot work on them: it overwrites elements by move-assignment, and the keys of an associative container are const and positioned by the container itself. For std::vector, it internally performs the erase-remove. The interface is uniform. There is no reason to write the two-call idiom in new code targeting C++20 or later.

std::views::filter is the more conceptually significant addition. It creates a lazy view over the range rather than copying or mutating anything:

auto even_view = v | std::views::filter([](int x){ return x % 2 == 0; });

for (int n : even_view)
    std::cout << n << ' ';  // predicate evaluated here, on demand

No elements are copied. No new container is allocated. The predicate is evaluated only when an element is accessed. This is the same model as Rust’s iterator adaptors, Python 3’s filter(), Java’s intermediate stream operations, and Haskell’s filter function.

The composability is the point:

auto result = people
    | std::views::filter([](const Person& p){ return p.age >= 18; })
    | std::views::transform([](const Person& p){ return p.name; })
    | std::views::take(10);

This constructs a pipeline. No intermediate containers are created. The entire chain processes one element at a time when iterated. For long pipelines with multiple transformations, this can be substantially more memory-efficient than chaining eager operations.

The ranges versions of existing algorithms (std::ranges::copy_if, std::ranges::remove_if, std::ranges::partition) also add projections, which let you filter on a member or computed value without writing a wrapping lambda:

std::vector<Person> adults;
std::ranges::copy_if(people, std::back_inserter(adults),
                     [](int age){ return age >= 18; },
                     &Person::age);  // projection: extract age before predicate

C++23: Closing the Materialization Gap

std::ranges::to<> addresses the one remaining awkwardness with lazy views: getting a concrete container out of a pipeline.

// C++23
auto evens = v | std::views::filter([](int x){ return x % 2 == 0; })
               | std::ranges::to<std::vector<int>>();

Before C++23, materializing a filter_view into a vector required a detour through std::vector’s iterator-pair constructor or an explicit std::ranges::copy_if. (The range-aware std::from_range constructors are themselves a C++23 addition.) With ranges::to, the pipeline syntax is complete: compose lazily, materialize explicitly.

Lazy vs Eager: The Real Design Decision

Beneath the syntactic variety, there are only two fundamental questions. Do you want to mutate in place, or produce a new collection? Do you want eager evaluation, or lazy evaluation?

Eager approaches (copy_if, erase_if, manual loops) execute immediately and produce a concrete result. They are appropriate when:

You will iterate the result multiple times (lazy views re-evaluate the predicate on each traversal)
You are constructing data to pass to code that expects a container, not a range
The predicate is expensive and you want to pay for it exactly once

Lazy approaches (views::filter and its composition) are appropriate when:

You are consuming the result once, possibly only partially
You are building a pipeline where the filter is one step among several transformations
Memory allocation for an intermediate container is a concern

Performance differences between the approaches are real but often small for typical workloads. For a std::vector<int> with one million elements and a ~50% pass rate, the in-place methods (erase_if, erase-remove) tend to outperform copy-based methods slightly due to better cache behavior, since no second buffer is allocated. Eager copy approaches are roughly equivalent to a manual loop with reserve(). Lazy views introduce some iterator abstraction overhead that compilers with optimization enabled usually eliminate through inlining and constant propagation.

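The parallel execution policies (for example std::execution::par_unseq) can be applied to std::copy_if, but not together with std::back_inserter: the parallel overloads require forward iterators for the output, and a back_insert_iterator is only an output iterator (concurrent push_back calls would not be safe in any case). In practice, parallel filtering means writing into a pre-allocated output buffer and trimming it to the returned iterator, or filtering in place with a two-pass approach: parallel partition, then resize. std::stable_partition with a parallel policy is the cleaner option when the order of the survivors matters, since std::partition does not preserve it.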
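The pre-allocated buffer pattern can be sketched as follows. The execution policy is deliberately left commented out so the example builds without a parallel backend; enabling it is an assumption about your toolchain (it needs the <execution> header and, on some implementations, an extra threading library at link time). The name filter_into_buffer is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: filters into a pre-sized buffer instead of a back_inserter,
// so the output iterator is a forward iterator as the parallel overloads require.
std::vector<int> filter_into_buffer(const std::vector<int>& src) {
    std::vector<int> dst(src.size());  // worst case: every element passes
    auto out_end = std::copy_if(/* std::execution::par, */
                                src.begin(), src.end(), dst.begin(),
                                [](int x) { return x % 2 == 0; });
    dst.erase(out_end, dst.end());  // trim the unused tail
    return dst;
}
```

The trade-off is memory: the buffer is sized for the full input even when few elements pass.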

What Rust Does Differently

Rust makes the in-place vs copy distinction explicit in the type system and the naming. Vec::retain mutates in place; its signature makes clear that the predicate controls what to keep (the opposite of remove_if’s convention):

v.retain(|&x| x % 2 == 0);  // keep even numbers

For a new collection, you compose iterators and collect:

let evens: Vec<i32> = v.iter().filter(|&&x| x % 2 == 0).cloned().collect();

extract_if (formerly the unstable drain_filter) handles the case where you want to remove elements and do something with them at the same time. There is no equivalent to C++’s fifteen-way proliferation because Rust’s ownership model forces you to be explicit about whether you are consuming, borrowing, or mutating data. That explicitness constrains the design space in a way that C++’s implicit copying and loose ownership do not.

Which Approach to Use

For code targeting C++20 or later:

Mutate in place: std::erase_if. Full stop. Do not write the erase-remove idiom in new code.
Produce a new container: std::ranges::copy_if with std::back_inserter, or for C++23, | std::views::filter(...) | std::ranges::to<std::vector>().
Compose a pipeline consumed once: std::views::filter with the pipe syntax.

For C++11/14/17 code:

Mutate in place: erase-remove idiom.
Produce a new container: std::copy_if.
Consider Range-v3 if you need lazy composition before C++20.

The fifteen approaches in Filipek’s article are real, and each corresponds to a legitimate historical context or performance niche. But most of them are vestigial in new code. The version of the story worth internalizing is not the full taxonomy, but the trajectory: from a necessary two-call workaround, to a single function, to a composable lazy pipeline. C++ got there eventually. It just kept everything it built along the way.