Truncating values

When I call these two functions to calculate the mean and standard deviation, my standard deviation is off by a little bit. I think it has to do with my values being truncated. Does anyone know where the bug is?

//This functions takes in an array as a parameter, as well as the
//total number of elements. It then calculates and returns the average.
double mean(double a[], int numelements)
{
double sum = 0.0;
double average = 0.0;
for (int i = 0; i < numelements; i++)
{
sum += a[i]; //add up sum of all elements
}
average = sum / numelements;
return average;
}

//This function takes in an array as a parameter, as well as the
//total number of elements. It then calculates and returns the standard deviation.
double standarddeviation(double a[], int numelements)
{
double avg = mean(a, numelements);
double sum = 0.0;
for (int i = 0; i < numelements; i++)
{
sum += pow((a[i] - avg), 2);
}
double variance = sum / numelements;
double stddev = sqrt(variance);
return stddev;
}
Replacing sum += pow((a[i] - avg), 2); with sum += (a[i] - avg) * (a[i] - avg) ; may be all that is required.

If not, perform compensated summations for computing both mean and sigma.
http://en.wikipedia.org/wiki/Kahan_summation_algorithm

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <iostream>
#include <cmath>
#include <iomanip>

double raw_mean( const double a[], std::size_t n )
{
    double sum = 0 ;
    for( std::size_t i = 0; i < n; ++i ) sum += a[i] ;
    return n > 0 ? sum/n : std::nan("NAN") ;
}

double compensated_mean( const double a[], std::size_t n )
{
    // http://en.wikipedia.org/wiki/Kahan_summation_algorithm
    double sum  = 0 ;
    double compensation = 0 ;
    for( std::size_t i = 0; i < n; ++i )
    {
        const double term = a[i] - compensation ;
        const double temp = sum + term ;
        compensation = ( temp - sum ) - term ;
        sum = temp ;
    }

    return n > 0 ? sum/n : std::nan("NAN") ;
}

int main()
{
    const std::size_t N = 1'000'000 ;
    static double a[N] {} ;
    const double big_val =  1.e+16 ;

    for( std::size_t i = 0; i < N/2; ++i ) a[i] = big_val ;
    const double expected_mean = big_val / 2 ;

    std::cout << std::fixed << std::setprecision(1)
              << "   expected mean: " << expected_mean << '\n'
              << "        raw mean: " << raw_mean( a, N ) << '\n'
              << "compensated mean: " << compensated_mean( a, N ) << '\n' ;
}

   expected mean: 5000000000000000.0
        raw mean: 4999999999971101.0
compensated mean: 5000000000000000.0

http://coliru.stacked-crooked.com/a/6855c5610c0c08ac
my standard deviation is off by a little bit

Can you give an example? Is "a little bit" 1%, or a hundred billionth? A tiny difference may be rounding errors as JLBorges indicates.
Topic archived. No new replies allowed.