Why is SET slower than VECTOR in this case?

Hello.

The code below runs faster with a Vector rather than with a Set

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
vector<string>files{ "vocabolario1.txt", "vocabolario2.txt", "vocabolario3.txt" };

// HERE is the data structure
// Change vector with set and you'll notice the difference!
vector<string> fusion; 

auto _begin = std::chrono::high_resolution_clock::now();

for (string filename : files)
{
	ifstream read(filename);
	string word = "";

	cout << "Scanning " << filename << " ...\n";

	int contaParole = 0; 

	while (getline(read, word))
	{

		contaParole++;
		if (contaParole % 1000 == 0)			
				cout << "Reading " << word << " ...\n";		
	
		transform(word.begin(), word.end(), word.begin(), ::tolower);

		auto found = lower_bound(fusion.begin(), fusion.end(), word);

		if (found == fusion.end() || word != *found)
			fusion.insert(found, word);
	}

	read.close(); 
	read.clear();
}

ofstream write("italiano.txt");

for (string s : fusion)	
	write << s << endl;

auto _end = chrono::high_resolution_clock::now();

cout << "Passed " << chrono::duration_cast<std::chrono::seconds>(_end - _begin).count();


It takes me ~33 seconds with vector.

When I change fusion's data structure from vector to set... the program runs a lot slowly!

Here you can download the three text files I used in this program: http://www.filedropper.com/files_23

--------------------------------

Notice, though, that if I delete the lines 27 and 29 from the code above (leaving Insert alone) the set will be faster than the Vector!

(yes, I know std::set checks if there are duplicates, so we don't need to do the search manually...)

What is happening?
Last edited on
If i had to guess, there are far fewer cache misses with a vector. In a vector, everything is stored next to each other in memory, and searching through the vector runs through it sequentially.

To examine an item in a vector, as the lower_bound function does, requires fetching that item from memory. Fetching one item in a vector will, if the items are small enough, also fetch into the processor many other items next to it (because memory is fetched in big lumps, and since in a vector the items are all next to each other, when the processor fetches one item in the vector, that big lump of memory will contain the items right next to it). If the next item wanted has already been loading into the processor (into the processor's memory cache) there is no need for another expensive fetch all the way out to main memory.

Having to go all the way to main memory because the item needed is not already cached on the processor is known as a "cache miss"; the item needed is not already in local cache, and the processor has to fetch it all the way from main memory, which takes a long time.

To examine an item in a set, as the lower_bound function does, requires fetching that item from memory. A typical set will not store items next to each other in memory, so as you iterate through the set, every item looked at has to be fetched from main memory, which is very expensive (i.e. takes a long time).

What you've stumbled onto here is that to get performance, it's not enough to understand C++; you also need to understand the hardware it's running on.

You could test this hypothesis by iterating through the vector and outputting the memory locations of each element, and repeat with the set; see if they are contiguous in the vector and not contiguous in the set.
Last edited on
Thank you for the explanation.

The strange thing here is that, removed lines 27 and 29, the set behaves in the opposite way!

It's very very fast compared to the vector!

Why do those lines slow down the set?
Obviously I wasn't as clear as I hoped.

Line 27 looks through the set, one item at a time, in order. Looking at each item in a set takes a long long time. Very slow. This is a very slow thing.

Looking at each item in a vector, one item at a time, in order, is very fast. Very fast. This is a very fast thing

If you remove line 27, then you are no longer doing the very slow thing for a set! So it's no longer very slow!

Inserting; inserting into a vector means that everything after the place that you insert it has to be moved in memory. Inserting into a vector means a lot of items have to be moved in memory, so this takes time. This is a slow thing.

Inserting into a set does not require moving anything else. So this is a fast thing.


You need to learn more about the containers and how they are made; what they look like in memory. This will help you understand their performance.
Last edited on
The discussion in this thread may be of interest: http://www.cplusplus.com/forum/beginner/206320/

Though the comparison here is between list and vector http://www.cplusplus.com/forum/beginner/206320/#msg976016
the same considerations would apply for insert performance comparison between set and vector.

This is a common fallacy that many programmers suffer from:
The right choice of vector, list, deque or any of the other containers depends largely on what operations you plan to do on it. If you have a lot of items in the container and you do a lot of insertions/deletions from the middle, then vector will make your i7 processor run like a Z80
no doubt exacerbated by memories of what career teachers had taught them years ago.
Thank you everyone.

Line 27 looks through the set, one item at a time, in order


That's because I'm using std::lower_bound

I should use std::set::lower_bound instead, right?
Topic archived. No new replies allowed.