Resetting vector/array.

Jun 13, 2012 at 12:08pm
Hello,

In a piece of code I'm working on, I end up resetting a "large-ish" vector. I'm wondering what the best way to do this would be. Looping over index? Over iterators? Or copying a new vector?

One of the vectors is a vector of bools, which is a specialized template. Does it change the answer?

I tried testing it, but I'm having trouble finding a proper test that isn't skewed by some behind-the-scenes optimizations.

[edit]

Probably taking it a bit too far, but in the case of the bools: if I reset to "true", would it be faster to ' = true', or to OR with 1?
Last edited on Jun 13, 2012 at 12:12pm
Jun 13, 2012 at 12:13pm
Since C++03, the elements of vector are guaranteed to be contiguous, so some kind of memset would do the job.

That, of course, only opens the next can of worms; what's the fastest memset, and can I beat it if I optimise for my hardware?
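
A minimal sketch of the memset idea for a vector<int>, assuming the goal is to reset every element to zero. The helper name zero_with_memset is made up for illustration. Two caveats: memset writes one byte value into every byte, so it suits zero but not an arbitrary reset value, and it does not apply to vector<bool>, whose packed storage offers no bool array to point at.

```cpp
#include <cassert>
#include <cstring>   // std::memset
#include <vector>

// Hypothetical helper, not from the thread: zero a vector<int> in place.
void zero_with_memset(std::vector<int>& v)
{
    if (v.empty())
        return;  // v[0] must not be touched on an empty vector

    // memset writes the same byte value into every byte, so it can produce
    // 0 (all bytes 0x00) but not, say, 1 in every int.
    std::memset(&v[0], 0, v.size() * sizeof(int));

    assert(v.front() == 0 && v.back() == 0);  // cheap sanity check
}
```

Calling zero_with_memset(a) on a non-empty vector<int> a leaves its size unchanged and every element equal to 0.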
Jun 13, 2012 at 12:19pm
Ah, just found out there's a built-in vector function 'assign'. I'm guessing this'll be optimized enough to make sure I don't have to worry about it?
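
For reference, a small sketch of resetting through the standard interface, with assign and with std::fill from <algorithm>. The helper names are hypothetical. Both work for any element type, including the vector<bool> specialization; whether either is "optimized enough" depends on the library and compiler in use.

```cpp
#include <algorithm>  // std::fill
#include <cassert>
#include <vector>

// Hypothetical helper: overwrite every element, keeping the current size.
template <typename T>
void reset_with_assign(std::vector<T>& v, const T& value)
{
    const typename std::vector<T>::size_type n = v.size();
    v.assign(n, value);      // replaces the contents with n copies of value
    assert(v.size() == n);   // the size is unchanged
}

// Hypothetical helper: same effect, written with the generic algorithm.
template <typename T>
void reset_with_fill(std::vector<T>& v, const T& value)
{
    std::fill(v.begin(), v.end(), value);
}
```

Unlike memset, both helpers accept any reset value, for example reset_with_fill(flags, true) on a vector<bool>.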
Jun 13, 2012 at 12:19pm
Probably taking it a bit too far, but in the case of the bools: if I reset to "true", would it be faster to ' = true', or to OR with 1?

Assigning true just does, like, a memcpy,
whereas OR will FIRST do an OR operation and then memcpy the result.
Or am I wrong?
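
To make the two alternatives concrete, here is a sketch for vector<bool>; the function names are invented for illustration. In this specialization v[i] is a proxy object rather than a real bool&, so the OR version has to read the bit, OR it, and write it back, while plain assignment only writes. Whether that difference survives optimization is something only a measurement on the target compiler can tell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: set every flag by plain assignment.
void set_all_by_assignment(std::vector<bool>& flags)
{
    for (std::size_t i = 0; i < flags.size(); ++i)
        flags[i] = true;              // writes through the proxy reference
}

// Hypothetical illustration: set every flag by OR-ing with true.
void set_all_by_or(std::vector<bool>& flags)
{
    for (std::size_t i = 0; i < flags.size(); ++i)
        flags[i] = flags[i] | true;   // reads the bit, ORs it, writes it back
}

// Both loops should leave the vector in the same state.
void check_equivalence(std::size_t n)
{
    std::vector<bool> a(n, false), b(n, false);
    set_all_by_assignment(a);
    set_all_by_or(b);
    assert(a == b);
}
```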
Jun 13, 2012 at 12:29pm
What types are the vectors? Are they plain data that can be happily zeroed, or are they objects of some class?
Jun 13, 2012 at 12:32pm
All of them will be of primitive types, generally bools or (unsigned) integers.
Jun 13, 2012 at 12:50pm
I always like playing with this sort of thing, so here's some code. Memset comes out looking pretty good :)

Obviously the usual caveats apply with this sort of shonky timing code: should run thousands of trials, use proper profiling tools, etc.

#include <iostream>
#include <vector>
#include <ctime>

using namespace std;

int main()
{

  const int SIZE_OF_VECTOR = 20000000;
  
  vector<int> a;
  a.resize(SIZE_OF_VECTOR);
  

  clock_t start, end;
  double cpu_time_used;
  

  start = clock();
  memset(&a[0], 0, SIZE_OF_VECTOR * sizeof(int));
  end = clock();
  cpu_time_used = ((double) (end-start));
  cout << "CPU time used by memset = " <<cpu_time_used << endl;
  

  start = clock();
  a.assign(SIZE_OF_VECTOR, 0);
  end = clock();
  cpu_time_used = ((double) (end-start));
  cout << "CPU time used by assign = " <<cpu_time_used << endl;

  start = clock();
  for (int eger=0; eger < SIZE_OF_VECTOR; ++eger)
  {
    a[eger] = 0;
  }
  end = clock();
  cpu_time_used = ((double) (end-start));
  cout << "CPU time used by for loop = " <<cpu_time_used << endl;

  start = clock();
  vector<int> b(SIZE_OF_VECTOR, 0);
  end = clock();
  cpu_time_used = ((double) (end-start));
  cout << "CPU time used by just making another vector = " <<cpu_time_used << endl;     
}



Last edited on Jun 13, 2012 at 12:51pm
Jun 13, 2012 at 12:59pm
Error: memset is not defined in this scope. You need to add <cstring> to the includes.
Also, memset looks the fastest on C::B / XP, taking about half the time the other methods take.
Last edited on Jun 13, 2012 at 1:00pm
Jun 13, 2012 at 1:10pm
Actually half, or a lot less than half?
Jun 13, 2012 at 2:32pm
Ran this on an XP Pro (x86) workstation with C::B with these results:
CPU time used by memset = 15
CPU time used by assign = 63
CPU time used by for loop = 93
CPU time used by just making another vector = 63

Process returned 0 (0x0)   execution time : 0.390 s
Press any key to continue.
Jun 13, 2012 at 5:31pm
@Moschops: here's some variety for you

I had to #include <cstring>, divide by CLOCKS_PER_SEC and run each test 100 times to get readable results.

Output in seconds, cumulative over 100 repeats (columns in the order of the tests: memset, assign, for loop, new vector):

xlc on IBM: 0.69 0.70 0.68 0.85
alc on HP: 1.01 2.68 2.78 1.07
gcc on Linux: 3.69 2.43 2.46 4.93
clang++ on Linux: 2.03 1.99 2.07 4.16

I don't think memset is worth the hassle, in general.
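
The three adjustments described above (including <cstring>, dividing by CLOCKS_PER_SEC, and repeating each test) could look roughly like the sketch below. The helper names and the function-pointer style are assumptions for illustration, not the code that produced the numbers above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>   // std::memset
#include <ctime>     // std::clock, CLOCKS_PER_SEC
#include <vector>

// Hypothetical helper: run reset(v) `repeats` times and return the
// cumulative CPU time in seconds, which is comparable across platforms.
template <typename Reset>
double time_reset(std::vector<int>& v, int repeats, Reset reset)
{
    assert(repeats > 0);
    const std::clock_t start = std::clock();
    for (int r = 0; r < repeats; ++r)
        reset(v);
    const std::clock_t end = std::clock();
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}

// The three in-place reset strategies from the test program.
void reset_memset(std::vector<int>& v)
{
    if (!v.empty())
        std::memset(&v[0], 0, v.size() * sizeof(int));
}

void reset_assign(std::vector<int>& v)
{
    v.assign(v.size(), 0);
}

void reset_loop(std::vector<int>& v)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = 0;
}
```

A call such as time_reset(a, 100, reset_memset) then gives one of the cumulative figures; an optimizing compiler may still simplify repeated identical resets, so the numbers need the same caution as before.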
Last edited on Jun 13, 2012 at 5:36pm
Jun 13, 2012 at 5:36pm
@Moschops: Actually half, but I didn't do an in-depth, close-all-programs, overclock-the-PC test.
@Cubbi: Maybe you divided by CLOCKS_PER_SEC and stored the result in an integer? Dunno, I get readable results.
Last edited on Jun 13, 2012 at 5:37pm
Jun 13, 2012 at 5:36pm
@Cubbi

do you deal with IBM mainframe?
Last edited on Jun 13, 2012 at 5:39pm
Jun 13, 2012 at 5:45pm
@vlad I deal with many different platforms. Do you consider P7-795's "mainframes"? Anyway, my dev boxes are just P5-595s though.

@EssGeEich by readable I mean comparable. clock_t is different on different platforms.
Last edited on Jun 13, 2012 at 5:46pm
Jun 13, 2012 at 6:01pm
Cubbi, I meant z/OS and z/Series
Jun 13, 2012 at 6:08pm
@vlad Nope.
+1
CC on Sun: 4.36 7.79 30.9 8.29, but I already knew Sun C++ wasn't all that good.
Last edited on Jun 13, 2012 at 6:12pm
Jun 14, 2012 at 9:57am
For completeness, gcc 4.0.2 on a Linux VM (and who knows what the hell's going on in there :p )

10
30
30
160
Jun 15, 2012 at 10:18am
Once again, I'm getting completely different results than the rest of the world. :/

Single run tests revealed nothing (16ms precision wasn't sufficient to show a difference), so I just put a loop around it. 100 runs each:

memset: 1712
assign: 1702
for loop: 816
new vector: 3707

Somewhat consistent with a similar topic I started months ago, on the question of "cheapest way to copy a large array".

Could be some optimization taking place for the loop, but not for the rest?

(VC++2010 Express, Win7 64bit.)
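
On the 16 ms granularity and the suspicion about the optimizer: a rough sketch, assuming a C++11 standard library with <chrono>, that times with std::chrono::steady_clock (usually finer-grained than clock(), though its resolution is not guaranteed) and reads an element back after every reset so the stores are harder for the compiler to discard. The Timing struct and time_reset_us are hypothetical names, and reading one element back is a mitigation, not a guarantee.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: elapsed wall time plus a value derived from the data.
struct Timing {
    long long microseconds;
    long long checksum;
};

// Hypothetical helper: time `repeats` calls of reset(v) in microseconds.
template <typename Reset>
Timing time_reset_us(std::vector<int>& v, int repeats, Reset reset)
{
    assert(!v.empty() && repeats > 0);
    long long checksum = 0;
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        const std::size_t idx = static_cast<std::size_t>(r) % v.size();
        v[idx] = r + 1;      // dirty one element so each reset has work to do
        reset(v);
        checksum += v[idx];  // read back so the result of the reset is used
    }
    const std::chrono::steady_clock::time_point end =
        std::chrono::steady_clock::now();

    Timing t;
    t.microseconds =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    t.checksum = checksum;
    return t;
}
```

If the reset really zeroes the vector, the returned checksum is 0; printing it alongside the time keeps the measured work observable.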
Jun 15, 2012 at 1:40pm
@Gaminic I'm going to guess that your C++ library implements vector.assign as a call to memset() for suitable value_type (which is a very common implementation technique), and that your library's memset() is too generic to make use of platform-specific optimizations, while your C++ compiler is set to optimize. (Linux has the same problem: memset() and other general-purpose precompiled C library functions are slower than loops optimized for the target architecture.)
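
The dispatch technique mentioned above can be sketched with a type trait: take the memset path only for element types where all-zero bytes mean the value zero, and fall back to an element-wise fill otherwise. This is an illustration under assumed names (zero_elements, zero_impl), not the source of any particular standard library. bool is excluded on purpose, because vector<bool> has no plain array to hand to memset.

```cpp
#include <algorithm>    // std::fill
#include <cassert>
#include <cstring>      // std::memset
#include <type_traits>  // std::is_integral, std::is_same
#include <vector>

// Integral element types (other than bool): zero bytes are the value 0.
template <typename T>
void zero_impl(std::vector<T>& v, std::true_type)
{
    if (!v.empty())
        std::memset(&v[0], 0, v.size() * sizeof(T));
}

// Everything else, including vector<bool>: element-wise fill.
template <typename T>
void zero_impl(std::vector<T>& v, std::false_type)
{
    std::fill(v.begin(), v.end(), T());
}

// Hypothetical entry point: picks an implementation at compile time.
template <typename T>
void zero_elements(std::vector<T>& v)
{
    typedef std::integral_constant<
        bool,
        std::is_integral<T>::value && !std::is_same<T, bool>::value>
        use_memset;
    zero_impl(v, use_memset());
    assert(v.empty() || v.front() == T());  // cheap sanity check
}
```

With this shape the caller always writes zero_elements(v); which path runs is decided per element type, which is why a slow generic memset can make assign slower than a hand-written loop.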
Last edited on Jun 15, 2012 at 1:48pm