Various Questions About Code Efficiency (array length, funct eval, etc)

Hi All,

I've been coding in MATLAB for a while now, and I recently decided to switch over to C++ to join the big boys. I attempted my first C++ program for an assignment of mine and I had some general questions for you guys. I would also love to hear your comments on how to make this code more efficient — I'm not sure whether there are functions in C++ that do what I've done in my code much faster. I'm using the Code::Blocks IDE (with the GCC compiler), if that helps at all.

1) Is there any way to get the length of a vector/array? Currently I'm just defining the length as N = sizeof(h)/sizeof(h[0]); however, I read online somewhere that there is an issue with this definition and that it can lead to problems later on. Can anyone comment on this? Is there a better way to define the vector length?

2) In MATLAB, you could define a function and evaluate it at a point x by feval(@func,x(i)); is there something similar in C++? Right now I've manually entered the function cos(x)/x^2 everywhere in my code (the fp1/fp2/fp3 lines below), but it's really messy and I would love to clean it up a bit.

3) Finally, I'm outputting my results into a text file, but I'm forced to use a for loop which is probably not the most efficient method. Is there a better way to handle this?

I'd really appreciate any help you guys can give. Also, any comments on how to make the code more efficient are welcome. I'm extremely new to C++, so please bear with me if my questions are a bit simplistic.

- Ali

#include <iostream>
#include <fstream>
#include <cmath>

using namespace std;

double h [4] = { 0.001, 0.01, 0.1, 1 };
double x = 4;
int N = sizeof(h)/sizeof(h[0]);

int main()
{
    double fpa;
    double fp1 [4];
    double fp2 [4];
    double fp3 [4];
    double ERR1 [4];
    double ERR2 [4];
    double ERR3 [4];

    fpa = -sin(x)/pow(x,2) - 2*cos(x)/pow(x,3);

    for (int i = 0; i<N; i++) {

        fp1[i] = (cos(x+h[i])/pow(x+h[i],2) - cos(x)/pow(x,2))/h[i];
        fp2[i] = (cos(x+h[i])/pow(x+h[i],2) - cos(x-h[i])/pow(x-h[i],2))/(2*h[i]);
        fp3[i] = (cos(x-2*h[i])/pow(x-2*h[i],2) - 8*cos(x-h[i])/pow(x-h[i],2) + 8*cos(x+h[i])/pow(x+h[i],2) - cos(x+2*h[i])/pow(x+2*h[i],2))/(12*h[i]);
        ERR1[i] = abs(fpa - fp1[i]);
        ERR2[i] = abs(fpa - fp2[i]);
        ERR3[i] = abs(fpa - fp3[i]);

        }


    ofstream outputdata("output.txt");
    for (int j = 0; j<N; j++)
    {
        outputdata << fp1[j] << "    " << fp2[j] << "    " << fp3[j] << "    " << ERR1[j] << "    " << ERR2[j] << "    " << ERR3[j] << endl;
    }
    outputdata.close();

    return 0;

}


In MATLAB, you could define a function and evaluate it at a point x by feval(@func,x(i)); is there something similar in C++?

Yes, the equivalent would be func(x(i)) and the function definition would be:
double func(double x)
{
  return cos(x)/(x*x);   
}


1) Is there any way to get the length of a vector/array?

Well, there are two different things in C++: regular arrays, whose length is fixed and must be known at compile time, and vectors, which can have a variable size and can be resized at run time. A vector returns its size when you call size() on it.

Example:
#include <vector>
[...]
vector<double> h = { 0.001, 0.01, 0.1, 1 }; //requires C++11
//h.size() would return 4 


Currently I'm just defining the length as N = sizeof(h)/sizeof(h[0]); however, I read online somewhere that there is an issue with this definition and that it can lead to problems later on.

It does not work when you only have a pointer to the beginning of the array (which is always the case for dynamically allocated arrays).

Is there a better way to define the vector length?

Well, for defining things in general you should use constants:
const int N=4;
[...]
double fp1 [N];
double fp2 [N];
//etc. 


3) Finally, I'm outputting my results into a text file, but I'm forced to use a for loop which is probably not the most efficient method. Is there a better way to handle this?

Normally you'd put all those values (fp1, fp2, etc.) into a class, which would reduce the six parallel arrays to one array of objects.
You can then give the class an operator<< for stream output, which at least reduces the file-output loop to:
for (auto& v : data) outputdata << v;


If you care about performance, you should make sure to run the Release target and change the project settings for the release target to use the optimization level -O3 (instead of the default -O2) and to add the additional compiler switch -ffast-math. -ffast-math gives up standard conformance for improved performance. This allows the compiler to perform more optimizations when floating point numbers are involved. As an example, it allows optimizing expressions like (x*1000.0)/500.0 into x*2.0 (or x+x), or x/3.0 into x*0.333... (multiplication is much faster than division), none of which would normally be allowed.

If you are compiling for 32-bit, then you should tune for an architecture that at least supports SSE2 (like Pentium 4) or for a generic SSE2 CPU with the switch -msse2 and additionally -mfpmath=sse which forces regular floating point computations to use SSE instead of the old x87 FPU stack.
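For reference, the equivalent command lines outside the IDE might look like this (file and program names are made up; Code::Blocks passes the same switches from the project's build options):

```shell
# 32-bit build: enable SSE2 and route floating-point math through SSE
g++ -m32 -O3 -ffast-math -msse2 -mfpmath=sse main.cpp -o derivatives
# 64-bit build: SSE2 is part of the x86_64 baseline, so no extra switches needed
g++ -O3 -ffast-math main.cpp -o derivatives
```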

But of course, all of that only matters for programs that perform more than just a few dozen computations on 4 sets.
Make constant data const, like so:
const double h [4] = { 0.001, 0.01, 0.1, 1 };
const double x = 4;
const int N = sizeof(h)/sizeof(h[0]);


Making everything const that is actually const helps the compiler generate better code.

You can declare a variable where it's first used, which gives a better overview, and initialize it right away, e.g.:
const double fpa = -sin(x)/pow(x,2) - 2*cos(x)/pow(x,3);
This also obeys the RAII paradigm: http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization

If you use a stream like your ofstream for output.txt, you should always check whether it opened successfully:
if(outputdata.is_open())
{
...
}



1) Is there any way to get the length of a vector/array? Currently I'm just defining the length as N = sizeof(h)/sizeof(h[0]); however, I read online somewhere that there is an issue with this definition and that it can lead to problems later on. Can anyone comment on this? Is there a better way to define the vector length?
You can do this with a template, but the result isn't a compile-time numeric constant (so it can't be used in array declarations) unless you make it constexpr in C++11.

2) In MATLAB, you could define a function and evaluate it at a point x by feval(@func,x(i)); is there something similar in C++? Right now I've manually entered the function cos(x)/x^2 everywhere in my code (the fp1/fp2/fp3 lines), but it's really messy and I would love to clean it up a bit.
If there's repeating code like cos(x+h[i])/pow(x+h[i],2), you can put that in a function.

3) Finally, I'm outputting my results into a text file, but I'm forced to use a for loop which is probably not the most efficient method. Is there a better way to handle this?
Nothing wrong with a for loop.
Yes, the equivalent would be func(x(i)) and the function definition would be:


I implemented this in my code and it worked perfectly. I did notice a slight increase in time when I did this though. I know that it's hard to tell for such small number of iterations, but is it generally less efficient to use this function call than to just place the function inside my for loop as I had done in the first place?

If you care about performance, you should make sure to run the Release target and change the project settings for the release target to use the optimization level -O3 (instead of the default -O2) and to add the additional compiler switch -ffast-math. -ffast-math gives up standard conformance for improved performance. This allows the compiler to perform more optimizations when floating point numbers are involved. As an example, it allows optimizing expressions like (x*1000.0)/500.0 into x*2.0 (or x+x), or x/3.0 into x*0.333... (multiplication is much faster than division), none of which would normally be allowed.


I managed to find -O3 optimization within Code::Blocks; however, I can't seem to find how to enable -ffast-math, and Google isn't yielding any results as to where to find this. Do you have any suggestions? Also, what's the main difference between running the Release target instead of the Debug target? Does it just do less debugging prior to compiling the code?

If you are compiling for 32-bit, then you should tune for an architecture that at least supports SSE2 (like Pentium 4) or for a generic SSE2 CPU with the switch -msse2 and additionally -mfpmath=sse which forces regular floating point computations to use SSE instead of the old x87 FPU stack.


I'm using a 64-bit operating system (AMD Phenom II). Is there an architecture that would be most beneficial to me?

Thank you for both of your help!!

Here is the updated code, just in case. Unfortunately I haven't been able to figure out how to put my arrays into a class yet, but hopefully I'll get it working soon.

#include <iostream>
#include <fstream>
#include <cmath>

using namespace std;

const int N = 4;
const double h [N] = { 0.001, 0.01, 0.1, 1 };
double x = 4;
//int N = sizeof(h)/sizeof(h[0]);

double func(double x)
{
    return cos(x)/(x*x);
}

int main()
{
    double fpa = -sin(x)/pow(x,2) - 2*cos(x)/pow(x,3);
    double fp1 [N];
    double fp2 [N];
    double fp3 [N];
    double ERR1 [N];
    double ERR2 [N];
    double ERR3 [N];

    for (int i = 0; i<N; i++) {

        fp1[i] = (func(x + h[i]) - func(x))/h[i];
        fp2[i] = 0.5*(func(x + h[i]) - func(x - h[i]))/h[i];
        fp3[i] = (func(x - 2*h[i]) - 8*func(x - h[i]) + 8*func(x + h[i]) - func(x + 2*h[i]))/(12*h[i]); 
        ERR1[i] = abs(fpa - fp1[i]);
        ERR2[i] = abs(fpa - fp2[i]);
        ERR3[i] = abs(fpa - fp3[i]);

        }


    ofstream outputdata("output.txt");
    for (int j = 0; j<N; j++)
    {
        outputdata << fp1[j] << "    " << fp2[j] << "    " << fp3[j] << "    " << ERR1[j] << "    " << ERR2[j] << "    " << ERR3[j] << "    " << h[j] << endl;
    }
    outputdata.close();

    return 0;

}






Also, what's the main difference between running the Release target instead of the debug target?

Optimizations are disabled by default in the Debug target. For most code that means it will run very, very slowly.
Your executable will also contain debug information (-g), making it several times larger than the release build. However, debuggers can use that information to show you detailed information about a crash and let you inspect values of variables.

however, I can't seem to find how to enable -ffast-math

Code::Blocks has no checkbox for this, so you need to add it under "Other options".

I'm using a 64-bit operating system (AMD Phenom II). Is there an architecture that would be most beneficial to me?

The switch -march=native always selects the best available tuning settings for your CPU. However, note that your application might no longer run on other CPUs when you tune for a specific architecture. So if you have a 64-bit OS, it's best to just compile for 64-bit instead, as SSE and SSE2 are already part of x86_64.


I implemented this in my code and it worked perfectly. I did notice a slight increase in time when I did this though. I know that it's hard to tell for such small number of iterations, but is it generally less efficient to use this function call than to just place the function inside my for loop as I had done in the first place?

Function calls can be less efficient, especially if the function is short. However, that's only true if the function call is not inlined. func here is so short that it will be inlined even at -O2 (larger functions are only inlined at -O3), so there should be no performance hit at all (unless you were testing the Debug target, but that's irrelevant).

But yeah, trying to measure performance here is kind of pointless, we're talking about only a few microseconds.
And if the OS scheduler decides to interrupt your program in the middle for a few milliseconds, imagine what that does to your timing results...
Awesome. You've been a tremendous amount of help. Thank you very much!

- Ali