valarray performance?

Jun 2, 2016 at 8:16pm
Hi all.

Am I doing something wrong that causes valarray implicit vector operations to take 4 times longer than an explicit loop doing the same thing?

Thanks Buk.

/*
E:\femm42src\fkn\saved>cl /W3 /O2 /Ot /fp:strict valarrayTest.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

valarrayTest.cpp
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:valarrayTest.exe
valarrayTest.obj

E:\femm42src\fkn\saved>valarrayTest.exe
Explicit took 0.010435961 seconds
Implicit took 0.039183428 seconds

E:\femm42src\fkn\saved>valarrayTest.exe
Explicit took 0.010549275 seconds
Implicit took 0.038959721 seconds

E:\femm42src\fkn\saved>valarrayTest.exe
Explicit took 0.010881806 seconds
Implicit took 0.039002306 seconds

E:\femm42src\fkn\saved>valarrayTest.exe
Explicit took 0.010385426 seconds
Implicit took 0.044248826 seconds
*/

#pragma warning(disable: 4530 )  // exception handling used without /EHsc
#include <windows.h>
#include <intrin.h>   // __rdtsc
#include <cstdio>     // printf
#include <cstdlib>    // rand
#include <valarray>

using namespace std;

const int VSIZE = 2000000;

int main() {
    valarray<double> X( VSIZE ), Y( VSIZE ), Z( VSIZE );
    double const c = 3.141592653;

    for( int i=0; i < VSIZE; ++i ) X[ i ] = rand();

    // Cycle counts are converted to seconds assuming a 2.4 GHz clock.
    unsigned __int64 start = __rdtsc();
    for( int i=0; i < VSIZE; ++i ) Y[ i ] = X[ i ] * c;   // explicit loop
    printf( "Explicit took %.9f seconds\n", (double)(__rdtsc() - start ) / 2.4e9 );

    start = __rdtsc();
    Z = X * c;                                            // valarray expression
    printf( "Implicit took %.9f seconds\n", (double)(__rdtsc() - start ) / 2.4e9 );

    // Sanity check: both versions should produce the same values.
    if( Y.sum() != Z.sum() ) printf( "Results differ\n" );
    return 0;
}


Jun 2, 2016 at 9:48pm
This was actually discussed 5 years ago on StackOverflow: http://stackoverflow.com/questions/6850807/why-is-valarray-so-slow -- and apparently it hasn't changed: Microsoft didn't implement valarrays the way everyone else did.

I'm getting close to the same numbers for both versions (after increasing the repetitions and adding warmup) on the gcc, clang, and Intel compilers: sometimes valarray is slightly faster, sometimes the loop. (Intel was using the GNU library, because using Intel's parallelized valarrays would be cheating.) With MSVC (2013), valarray is consistently 2.2 times slower for me.
Jun 3, 2016 at 2:29am
Huh! Looks like I'll have to learn to use Mingw64 :(

Thanks for the confirmation Cubbi.

Last edited on Jun 3, 2016 at 7:44am