Optimizing std::unique into warp speed

std::unique – linearly removes consecutive duplicates in a range (does not mean range must be sorted). Some time ago I had an issue it was too slow. And recently I had an idea how to solve it.

100% of the time I have encountered usage of std::unique – or similar implementation in different namespace – the input range was sorted.
This opens the possibility to replace linear search in favor of binary search. And so I did, I have replaced linear search with binary search.

template <typename TIterator>
TIterator faster_unique( TIterator begin, TIterator end )
{
    TIterator value = begin;
    TIterator next = value;

    while ( next != end ) {
        std::advance( next, 1 );
        TIterator found = std::upper_bound( next, end, *value );
        std::advance( value, 1 );
        if ( found == end ) return value;
        std::swap( *value, *found );
        next = found;
    }
    return next;
}

The result is as follows:

Conclusion: if you expect to have about 33% or more duplicates in a range, the faster_unique will be faster than standard implementation.

Caveat: this is a different algorithm than std::unique and it is not a direct replacement. it requires the input range to be sorted to work properly, while std::unique does not.

Testing and benchmarking has been done under Linux, with gcc 14.2.1, google benchmark 1.9.2

Github repo with the code + testing + benchmark is available under the following link:
https://github.com/xmaciek/faster_unique

DreamTalon

Optimizing std::unique into warp speed

Optimizing std::unique into warp speed

Decentralized cooking

On using VRS as an antialiasing optimization