Optimizing function

Dear experts,

Can I optimize the follow code, perhaps by pointers.
Thank you in advance,



double F(double x0, double x1,double x2, double x3, double x4, double x5, double g0,double g1,double g2,double g3, double g4, double g5){
return (pow(x0, g0)*pow(x1,g1)*pow(x2,g2)*pow(x3,g3)*pow(x4,g4)*pow(x5,g5));
}
Everything is already pretty much optimised. Any further "improvements" will lead to the loss of generality.

Although I will probably ditch this function altogether and just use an std::inner_product
Optimize for what? Speed, size, memory usage, etc?

What do you know about input data? Some ranges have more options than others.

What is the acceptable level of error?

pow() has a lot discussion on the net. For example: http://stackoverflow.com/questions/6475373/optimizations-for-pow-with-const-non-integer-exponent

Six pows could benefit from vectorization. That in turn requires aligned memory.


Ultimately, is this function so significant part of the computation that the optimization effort will pay for itself?
Thank you very much for your fast response. Because the the actual function has about 2 * 30 inputs( ie g0, g1, ...g29 and x0, x1, ..., x29) I thought that it could be possible to rewrite the function by array. Any idea?
Yes, the std::inner_product takes ranges (std::vector, std::array, etc).
http://www.cplusplus.com/reference/numeric/inner_product/
closed account (j3Rz8vqX)
Alternative:

If you want to decrease the work load on yourself, and increase the workload on the computer, an option would be: using arrays of "x"s and "g"s.

If so, you could do something like this:
1
2
3
4
5
6
7
8
9
double F(double *x, double *g, int j)
{
    double total=1;//Multiplication by zero would result with 0.
    for(int i=0;i<j;i++)
    {
        total*=pow(x[i],g[i]);
    }
    return total;
}

This will give you a possible pattern and dynamic control of arguments.

Full example code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include <iostream>

//Header for use of pow();
#include <cmath>

//Headers: for an example with rand():
//Including srand() and time();
#include <cstdlib>
#include <ctime>

//For trailing decimal places of 2, Precision();
#include <iomanip>

using namespace std;

double F(double *x, double *g, int j)
{
    double result=1;//Multiplication by zero would result with 0.
    for(int i=0;i<j;i++)
    {
        result*=pow(x[i],g[i]);
    }
    return result;
}
int main()
{
    srand(time(0));

    //Declaring an integer j, which counts array size.
    int j=rand()%10+1;

    //Declaring arrays of already known size j; j must be known.
    double x[j], g[j];

    //Initializing dummy data:
    for(int i=0;i<j;i++)
    {
        x[i] =rand()%10+1;
        g[i] =rand()%10+1;
        cout<<"(x"<<i<<",g"<<i<<"): "<<"("<<x[i]<<","<<g[i]<<")"<<endl;
    }

    //Running function and displaying the result:
    double result = F(x,g,j);
    cout<<"The resulting value from F function: "<<result<<endl;
    cout<<"OR: "<<fixed<<setprecision(2)<<result<<endl;

    //Supporting command line operation, without IDE:
    cout<<"Custom Exit: Press 'enter': ";
    cin.get();

    return 0;
}
(x0,g0): (1,8)
(x1,g1): (8,4)
(x2,g2): (3,1)
(x3,g3): (6,1)
(x4,g4): (1,3)
(x5,g5): (9,6)
The resulting value from F function: 3.91821e+010
OR: 39182082048.00
Custom Exit: Press 'enter':

optimize
Hard coding will, almost always, outperform roundabout methods.
However, these methods "should" provide better dynamic flexibility.
In this example, it would be allowing any number of arguments; minus limitations.
Last edited on
@Dput or, you know, remove any workload from yourself and use what standard library creators have already done.
1
2
3
double result = std::inner_product(std::begin(x), std::end(x),
                                   std::begin(g), 1.0
                                   std::multiplies<double>(), std::pow);
Last edited on
closed account (j3Rz8vqX)
My previous post was an alternative to std::inner_product, which was posted by keskiverto; prior to my post.
@MiiNiPaa if someone wants to optimize a function like the one from the OP, it's probably about speed.

Using the C++ stdlib as in your example will create a lot of overhead that will cause the exact opposite of "performance optimization".
In this case Dput's low level approach is probably the fastest you can get.
C++ stdlib as in your example will create a lot of overhead
It will not create any overhead and actually will create same code (possibly avoiding index recalculation on every step, however it will not matter).

Both variants have created same assembly code on -O2.
Edit: after I tested Dput code exactly as it written without my changes to interface, it produced less optimal code than library function.

In this case Dput's low level approach is probably the fastest you can get.
Main perperator of that code is pow function. By throwing out most safety checks and with a little help of assembly inserts you can create way more effective and more dangerous function.
Last edited on
Using the C++ stdlib will create a lot of overhead

Does it, really?


No matter how we do the actual operation, it seems quite clear that we really want to have the data -- 30-something input values -- in an array of some sort, rather than as separate variables.
closed account (j3Rz8vqX)
It was simply an alternative.

So far, we've produced two implementations.
So far, we've produced two implementations.
Yes. Which internally are not so different (and would be the same if we get rid of subscript operator in your version).

Main problem is pow function. Which is a hornet nest. You will need advanced math knowledge to properly implement it even without special cases check.
Not to mention that standard library pow function is already optimized as much as possible and is often written in assembly directly.
So, this code is better to left alone (aside from swithching to arrays)
Thank you all indeed. Dput /MiNiPaa/ Keskiverto / Plexus. It was my first topic with complete answers.
You're looking for Ans = x0g0 * x1g1 * ... * xngn

ln(Ans) = g0*ln(x0) + g1*ln(x1) + ... + gn*ln(xn)

It might be faster to calculate ln(Ans) this way, and then use exp() to get the actual result. This method may also give a more accurate result because the intermediate results are less likely to overflow. Then again, it might give a less accurate result too.

Dave
dhayden: Extremely useful tip. Thanks a lot.
Topic archived. No new replies allowed.