{"id":61,"date":"2025-04-05T17:26:15","date_gmt":"2025-04-05T15:26:15","guid":{"rendered":"https:\/\/dreamtalon.net\/?p=61"},"modified":"2025-04-05T17:30:13","modified_gmt":"2025-04-05T15:30:13","slug":"optimizing-stdunique-into-warp-speed","status":"publish","type":"post","link":"https:\/\/dreamtalon.net\/?p=61","title":{"rendered":"Optimizing std::unique into warp speed"},"content":{"rendered":"\n<p><code>std::unique<\/code> &#8211; linearly removes consecutive duplicates in a range (does not mean range must be sorted). Some time ago I had an issue it was too slow. And recently I had an idea how to solve it.<\/p>\n\n\n\n<p>100% of the time I have encountered usage of <code>std::unique<\/code> &#8211; or similar implementation in different namespace &#8211; the input range was sorted.<br>This opens the possibility to replace linear search in favor of binary search. And so I did, I have replaced linear search with binary search.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>template &lt;typename TIterator&gt;\nTIterator faster_unique( TIterator begin, TIterator end )\n{\n    TIterator value = begin;\n    TIterator next = value;\n\n    while ( next != end ) {\n        std::advance( next, 1 );\n        TIterator found = std::upper_bound( next, end, *value );\n        std::advance( value, 1 );\n        if ( found == end ) return value;\n        std::swap( *value, *found );\n        next = found;\n    }\n    return next;\n}<\/code><\/pre>\n\n\n\n<p>The result is as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"599\" height=\"336\" src=\"https:\/\/dreamtalon.net\/wp-content\/uploads\/2025\/04\/chart.png\" alt=\"\" class=\"wp-image-62\" srcset=\"https:\/\/dreamtalon.net\/wp-content\/uploads\/2025\/04\/chart.png 599w, https:\/\/dreamtalon.net\/wp-content\/uploads\/2025\/04\/chart-300x168.png 300w\" sizes=\"auto, (max-width: 599px) 100vw, 599px\" \/><\/figure>\n\n\n\n<p>Conclusion: if you expect to have about 33% or more duplicates in a range, the <code>faster_unique<\/code> will be faster than standard implementation.<\/p>\n\n\n\n<p>Caveat: this is a different algorithm than <code>std::unique<\/code> and it is not a direct replacement. it requires the input range to be <strong>sorted<\/strong> to work properly, while <code>std::unique<\/code> does not.<\/p>\n\n\n\n<p>Testing and benchmarking has been done under Linux, with gcc 14.2.1, google benchmark 1.9.2<\/p>\n\n\n\n<p>Github repo with the code + testing + benchmark is available under the following link:<br><a href=\"https:\/\/github.com\/xmaciek\/faster_unique\">https:\/\/github.com\/xmaciek\/faster_unique<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>std::unique &#8211; linearly removes consecutive duplicates in a range (does not mean range must be sorted). Some time ago I had an issue it was too slow. And recently I had an idea how to solve it. 100% of the time I have encountered usage of std::unique &#8211; or similar implementation in different namespace &#8211; [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-61","post","type-post","status-publish","format-standard","hentry","category-bez-kategorii"],"_links":{"self":[{"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/posts\/61","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=61"}],"version-history":[{"count":3,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/posts\/61\/revisions"}],"predecessor-version":[{"id":65,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=\/wp\/v2\/posts\/61\/revisions\/65"}],"wp:attachment":[{"href":"https:\/\/dreamtalon.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=61"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=61"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dreamtalon.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=61"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}