I met a problem when I try to do this: removing the duplicated elements from a vector. I know there is a generic algorithm unique() working with sort() can remove consecutive duplicates in range from a vector. But the unique will keep the first element in each group of consecutive equal elements.
The thing I want is that if an elements showing up in vector more than once, it should be discarded.
For example, original vector = {1, 2, 3, 3, 4, 5, 4, 6, 7, 9, 9}, I want get the result = {1, 2, 5, 6, 7}.
I want to know if there is any efficient way to do that.
I try to use the iterator with the erase(), but I failed, system always report errors.(WinXP + Visual Studio)
One quick way I could think off is, use a temp hash table. While iterating through vector, just check whether the current element of vector is present in the hash table. If it is not, then add it into the hash table with key as the current vector element and value as the count of that element. But if the value is present then just increment the value of that key in hash map. Once you are done with this,
just loop through the list of vectors again and check for which key the value is more than 1 in hash table. And delete that element from the vector.
Running time of this algorithm will still be O(n).
Hi, jsmith, thanks for reply, but that is not want I want.
For example, the vector in my origian post is {1, 2, 3, 3, 4, 5, 4, 6, 7, 9, 9}, corresponding set will be {1, 2, 3, 4, 5, 6, 7, 9}, the result I want is {1, 2, 5, 6, 7}.
Meaning that I only want the elements which are only showing once.
It is also worth nothing the reason that this results in undefined behavior. After the first call to erase, the iterator is invalidated. vector::erase returns an iterator. You need to assign the return value to it if you want to end up with a valid iterator after the first erase call.
Moreover, in your example the duplicates aren't contiguous therefore the for loop concept wouldn't work anyhow.
Looks like kevinchkin is the only one that suggested a valid solution for what the OP wants. My thoughts were along similar lines. I'd use the foreach algorithm and create a functor that builds up the table. When finished you can use the table to remove all of the elements. You could even use the remove_if algorithm to operate on the same functor used by foreach.
Iter looks like taking the form of std::vector<int>, so the "first", "last" and "not_this_one" seem to be Iter::iterator (i.e., std::vector<int>::iterator), this does not compile but illustrating the idea:
1 2 3 4 5 6 7 8 9 10 11
template< typename Iter, typename Comp >
Iter find_if_except( std::Iter::iterator first, std::Iter::iterator last,
std::Iter::iterator not_this_one, Comp compare )
{
Iter res = std::find_if( first, last, compare );
if( res == not_this_one ) {
std::advance( res, 1 );
res = std::find_if( res, last, compare );
}
return res;
}
With jsmith's code above, I got the following compiling error:
error C2782: 'Iter find_if_except(Iter,Iter,Iter,Comp)' : template parameter 'Iter' is ambiguous
(0) Get a std:vector<int> result; // store the result
(1) Get a std:vector<int> vec; // populated with N integers to be processed
(2) Get a std:vector<bool> duplicated(vec.size(), false); // initially every int is not duplicated.
(3) Assuming i and j be in the range of [0,vec.size() ),
for each vec.at(i) and vec.at(j) in the range of [0,vec.size() )
if vec.at(i) == vec.at(j),
duplicated.at(i)=duplicated.at(j) = true; // assignment
(4) For i in the range of [0,vec.size)
if (!duplicated.at(i) )
result.push_back(vec.at(i));
EDIT: For a vector of N completely unique integers, the two for statements make O(0)times of assignments on line 7) rather than O(N*N)in theory; for a vector of N completely duplicated integers, the two for statements make O(N-1) times of assignments (on line 7) rather than O(N*N) in theory. So it is an O(N) algorithm.
(0) Get a std:vector<int> result; // store the result
(1) Get a std:vector<int> vec; // populated with N integers to be processed
(2) Get a std:vector<bool> duplicated(vec.size(), false); // initially every int is not duplicated.
(3) Assuming i and j be in the range of [0,vec.size() ),
for each vec.at(i) and vec.at(j) in the range of [0,vec.size() )
if vec.at(i) == vec.at(j),
duplicated.at(i)=duplicated.at(j) = true; // assignment
(4) For i in the range of [0,vec.size)
if (!duplicated.at(i) )
result.push_back(vec.at(i));
EDIT: For a vector of N completely unique integers, the two for statements make O(0)times of assignments on line 7) rather than O(N*N)in theory; for a vector of N completely duplicated integers, the two for statements make O(N-1) times of assignments (on line 7) rather than O(N*N) in theory. So it is an O(N) algorithm.
Probably my line 16 should change to be an iterator instead of const_iterator, because the
compiler is probably calling the non-const version of begin() and end(). Else explicitly
construct a const_iterator from the return of begin() and end().