Check if a vector contain duplicate numbers

Pages: 12
Yes, always measure. As N goes up, one would expect the set ( average O(N) ) to be significantly faster, even if we do not copy the large vector for the sort. (Till memory becomes the bottleneck).

For instance, with 106 random numbers, without a copy of the vector, on the same libstdc++ implementation:

g++ -std=c++20 -O3 -Wall -Wextra -pedantic-errors -Wno-unused-variable main.cpp && ./a.out
echo =====================================
clang++ -std=c++2a -O3 -Wall -Wextra -pedantic-errors -Wno-unused-variable main.cpp && ./a.out
sort: 81.0783 millisecs
 set: 21.8664 millisecs
=====================================
sort: 82.6397 millisecs
 set: 14.9261 millisecs

http://coliru.stacked-crooked.com/a/580c93c2a020d2c4
That is more like the result I expected - with std::unordered_set showing the better performance over sort.

For me, I get:


sort: 80.05 millisecs
 set: 12.07 millisecs

PS. For the set method, the time taken is pretty much dependent upon the position of the first duplicate. If the first 2 elements are the duplicate, then the sort method will still sort - incurring that time cost - but the set method would return almost immediately.

For me, for 10^6 ascending numbers with only the last 2 duplicate, the set method gives 238 ms - as opposed to 0ms when the first 2 elements are duplicate!
Last edited on
Topic archived. No new replies allowed.
Pages: 12