fastest way to XOR 2 arrays of boolean?

Forum

Forum
General C++ Programming
fastest way to XOR 2 arrays of boolean?

fastest way to XOR 2 arrays of boolean?

Jan 29, 2009 at 7:36pm

Hi,

I have two arrays of booleans, both have the same known size.
I would like to perform a XOR on the arrays and store the result in a third boolean array c.
I could iterate on the arrays and do a XOR on each value, but I was wondering if there was a faster way to do it, since it is a very "low level" operation?

Bottom line : whats the fastest way to XOR two boolean arrays?

Thanks,
Cyp

#what i am doing for now:
bool a[100];
bool b[100];
bool c[100];
for(int i=0; i<100; i++){
c[i]=a[i]^b[i];
}

Jan 29, 2009 at 7:39pm

Zaita (2770)

I imagine the compiler would optimise it for you. Just compile with optimisations turn on :) That should also enable look-unrolling which would make it even faster.

Jan 29, 2009 at 7:48pm

helios (17607)

I think the fastest way would be to make the size of the array a multiple of sizeof(long):

//Assume sizeof(long)==4
bool a[100];
bool b[100];
bool c[100];
long *a2=(long *)a,
	*b2=(long *)b,
	*c2=(long *)c;
for (int i=0;i<100/sizeof(long);i++)
	c2[i]=a2[i]^b2[i];

Jan 29, 2009 at 7:54pm

Zaita (2770)

@Helios: Not ideal to use sizeof(type), as this can limit you to various compilers and architectures. What if sizeof(long) == 8? (100%8) != 0
http://developers.sun.com/solaris/articles/ILP32toLP64Issues.html

He could be trying to compile this on some strange hardware (SPARC, MIPS, RISC).

As I said. The compiler will likely optimise. If you wanted to make it easier for the compiler, unroll you loops to 5->10 instructions so it can execute concurrently.

Last edited on Jan 29, 2009 at 7:57pm

Jan 29, 2009 at 8:20pm

helios (17607)

Well, yeah. That's why my comment states that the code assumes sizeof(long)==4. It was just an example. I took advantage of the fact that !(100%4) and that sizeof(long)==4 is very common.
An interesting fact: the loop will never produce a buffer overflow, so if (n%sizeof(long)), it's always possible to complete the last steps with the bool pointer.

Another solution would be to use int32_t, although that's not quite standard, sadly.

Jan 29, 2009 at 8:25pm

Zaita (2770)

Assumption is the mother of all f*ck ups

Jan 29, 2009 at 8:47pm

helios (17607)

Out of curiosity, I did a little benchmark with MinGW:

(unoptimized)
n=1024*1024*100
Initialization (bool elements): 3390 ms
XOR (uint32_t elements): 234 ms (best), 250 ms (worst)
XOR (bool elements): 656 ms (best)

(-O3 -fexpensive-optimizations)
n=1024*1024*100
Initialization (bool elements): 3296 ms
XOR (uint32_t elements): 188 ms (best), 235 ms (worst)
XOR (bool elements): 234 ms (best), 250 ms (worst)

I wonder why XOR with uint32_t elements varies so much.

Jan 29, 2009 at 8:50pm

Zaita (2770)

Un-Roll the loops and try it unoptimized. Un-roll to 10 each

Jan 29, 2009 at 9:05pm

jsmith (5804)

I believe that C defines "int" to be the word size of the architecture, so perhaps helios' solution with "int" instead of "long" accommodates Zaita's point.

For that last ounce of speed, I'd also try to write the loop such that the terminating condition was i == 0, since comparison to zero can be slightly faster than comparing to nonzero, and I'd also use pre-increment/pre-decrement instead of post-.

Jan 29, 2009 at 9:17pm

helios (17607)

Unrolled to 10, 100 MB:
Unoptimized:
uint32_t: 200-220 ms
bool: 500 ms

Optimized:
uint32_t: 200-220 ms
bool: 180-200 ms (at one point it measured 281 ms. Probably the OS having a brain fart)

Last edited on Jan 29, 2009 at 9:17pm

Jan 29, 2009 at 9:18pm

Zaita (2770)

Unrolling wins :D

Jan 30, 2009 at 2:35am

Mythios (1058)

Assumption is the mother of all f*ck ups

Haha that made me laugh and brought back some old memories. During my first year of programming my Tutor told this to the classroom everyday haha.

Last edited on Jan 30, 2009 at 2:36am

Topic archived. No new replies allowed.

C++

Forum

fastest way to XOR 2 arrays of boolean?