Run Time with Lambda Function

May 29, 2011 at 12:44am
[Edit] I apparently changed something (maybe a compiler setting?) other than what is shown here and that is what was responsible for the slowdown. I still cannot figure out what that something is, but at least this code was not responsible for it.

I recently watched a lecture on uses for lambda functions with STL algorithms and I thought I would try to apply what I had learned to speed up some code that I had been given.

I replaced:

1
2
3
4
5
6
7
8
9
10
11
12
13
template<typename Type>
Type get_resource_value(Resource& resource) { return get_resource_value<Type>(resource.name); }

template<typename Type>
Type get_resource_value(string resource_name) {
	Type value;
	list<GetResource *>::iterator index = resources_action_gets.begin() ;

	while ((index != resources_action_gets.end()) && (**index).resource.name != resource_name) 
		index++;
	
	//more code
}


with

1
2
3
4
5
6
7
8
9
10
11
12
13
template<typename Type>
Type get_resource_value(Resource& resource) {
	Type value;
	list<GetResource *>::iterator index;

        index = std::find_if(resources_action_gets.begin(),
                             resources_action_gets.end(),
                             [&] (GetResource* resourceGet) -> bool {
                                  return &(resourceGet->resource) == &resource;}
                            );

        //more code
}


This function is called repeatedly throughout the simulation.

After making this change, the run time almost doubled. I figured that doing a pointer comparison would be much faster than a string comparison and that using the STL find_if would be as fast as if not faster than the while loop.

I am using MS Visual Studio 2010 and both runs are being done with the code compiled in Release mode. Both versions yield the same results.

What am I missing here? Am I being an idiot somewhere? Thanks.
Last edited on May 31, 2011 at 12:30am
May 29, 2011 at 4:00am
I've seen an std::map<T *,T2> (or it might have been a set) take 5-10 times longer to find an element that an otherwise identical container ordered by integers.
Have you tried modifying your first snippet to use pointers instead of strings and comparing the results?
May 30, 2011 at 12:31am
Thanks for the response. I did go back and modify the first bit of code to compare pointers instead of strings.

1
2
while (index != resources_action_sets.end() && &((*index)->resource) != &resource)
    index++;

It took just as long as the second piece of code.

So now I guess my question is, why would a pointer comparison be slower than a string comparison? Or is it the reference operator? Am I correct in my understanding that a pointer comparison checks that the two pointers point to the same address in memory, not that all members of the object pointed to are identical?

Thanks for the help.
May 30, 2011 at 6:22am
I don't think string comparison could be faster than pointer comparison. My guess is there is some hidden functionality here, i.e. operator '&' is overloaded for 'Resource'.
May 30, 2011 at 8:30am
First check for odd operator overloading. If you don't find anything, you're seeing the same thing I saw. Your compiler is generating unusually bad code for pointer comparisons. I can't say why this is, as comparing two pointers shouldn't take longer than comparing two registers, but I can say that string comparisons are strictly integer-only, and with some luck, many string comparisons can stop at the first character.
Try casting both operands to an integer type and measure again.
May 30, 2011 at 9:14am
First, I recommend that you replace Line 9 in your second block of code with the string comparison that you did in your first block of code.

In my experience, the best way to track down the problem is to morph block one slowly into block two and see when you take the performance hit...

...if there are too many differences (especially hidden differences) between the two blocks of code, the true cause for the slowdown may not be immediately obvious.
May 30, 2011 at 4:27pm
I suppose the problem could be in the "more code" section that you left out, because the second variant is 45-55% faster than the first one for me (tested with gcc 4.5, a list of 10,000 GetResource elements having their index as their name, with the get_resource_value functions simply returning the iterator they found and 10,000 consecutive calls for each function, passing the 7777th list element each time).

It would be interesting to see the assembler code that VC++ generates for each of the functions, as it might give a hint to where the problem is.

For comparison, GCC generates the following code for the function with find_if+lambda:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
	.cfi_startproc
	movq	resources_action_gets(%rip), %rax
	cmpq	$resources_action_gets, %rax
	je	.L21
	cmpq	16(%rax), %rdi
	je	.L27
	.p2align 5,,24
	.p2align 3
.L25:
	movq	(%rax), %rax
	cmpq	$resources_action_gets, %rax
	je	.L21
	cmpq	16(%rax), %rdi
	jne	.L25
.L21:
	rep
	ret
	.p2align 5,,7
	.p2align 3
.L27:
	.p2align 4,,7
	rep
	ret
	.cfi_endproc
Last edited on May 30, 2011 at 4:48pm
May 31, 2011 at 12:27am
Thanks for the responses. Sorry for the long delay. This is a side project that I have been working on.

After trying all of the suggestions listed here and staring at assembly code for awhile (the assembly appeared fine based on my limited exposure), I decided to do one last sanity check. I changed the code back to original and ran again. This time I got times with the original code very similar to the code with pointer comparisons.

I had copied my baseline folder to a separate folder to work on this code, so that I could compare outputs a little easier. Somewhere, somehow, something must have changed between the two versions other than the code changes. When I copied the pointer comparison code into what had been my "baseline" folder, it ran as fast as the baseline code.

After diffing the folders and looking through all of the Visual Studio settings I know of, I still cannot find the difference between the folders. There has to be something, but I have no idea what it is.

Anyway, sorry for wasting everyone's time. Thanks for the help.
Topic archived. No new replies allowed.