A double can represent every integer exactly up to 2^53. Storing that many doubles would take 2^56 bytes, about 72 petabytes.
From 2^53 to 2^54 - 1 it can only represent even integers. From 2^54 to 2^55 - 1 it can only represent multiples of 4, and so on up to its maximum value of roughly 2^1024.
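A quick standalone check of that cutoff (this snippet is mine, not from the thread):

#include <iostream>

int main()
{
    const double two53 = 9007199254740992.0 ;        // 2^53
    std::cout << std::boolalpha
              << ( two53 + 1.0 == two53 ) << '\n'    // true:  2^53 + 1 rounds back down to 2^53
              << ( two53 + 2.0 == two53 ) << '\n' ;  // false: 2^53 + 2 is even, so it is exactly representable
}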
Okay ... The thing is that I want to calculate an average with double-sized variables, using only two of them: sum and count.
The problem is that sum grows far too large over time.
I somehow need to make sum and count smaller in such a way that the return value doesn't change, and the next call to the calculate function returns the same result it would have returned without lowering sum/count.
Okay, I should be more specific about what I want to do.
I need to calculate the average error of a neural network each time I do backpropagation.
I calculate the learning speed/momentum based on the error average.
Also based on the error average, I know whether to add a new neuron or remove one.
It's a bit more complicated than that, but everything relies on the error average.
The thing is that the backpropagation function could be called enough times that the sum variable gets too high.
Any ideas on how I could lower the sum and count variables in such a way that
everything keeps working as if I had never lowered them?
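(For reference, the rescaling described above might be sketched like this. The running_average struct, the add function and the 1.0e15 threshold are illustrative, not from the thread. Dividing sum and count by the same factor preserves the current average exactly, but values added afterwards carry proportionally more weight, so later results will not be bit-identical to the un-rescaled version.)

#include <iostream>

// hypothetical sketch of the "lower sum and count together" idea
struct running_average
{
    double sum   = 0.0 ;
    double count = 0.0 ;

    void add( double value )
    {
        sum   += value ;
        count += 1.0 ;

        // once sum gets large, scale both down by the same factor:
        // sum/count (the current average) is unchanged, but from here on
        // the accumulated history weighs half as much as it did before
        if( sum > 1.0e15 )
        {
            sum   /= 2.0 ;
            count /= 2.0 ;
        }
    }

    double average() const { return count > 0.0 ? sum / count : 0.0 ; }
};

int main()
{
    running_average ra ;
    for( int i = 0 ; i < 1000 ; ++i ) ra.add( 3000.0 + i % 7 ) ;
    std::cout << ra.average() << '\n' ;
}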
The only way to prevent an overflow when computing an average is by computing a1/n + a2/n + ... + an/n instead of (a1 + a2 + ... + an)/n, but this requires knowing in advance how many values you have to average.
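As a concrete illustration of that divide-first form when n is known up front (the function name is made up for this example):

#include <iostream>
#include <vector>

// divide each term by n before summing, so the running total stays close to
// the magnitude of the final average instead of growing with the number of values
double average_divide_first( const std::vector<double>& values )
{
    if( values.empty() ) return 0.0 ;

    const double n = static_cast<double>( values.size() ) ;
    double result = 0.0 ;
    for( double v : values ) result += v / n ;
    return result ;
}

int main()
{
    std::cout << average_divide_first( { 1.0, 2.0, 3.0, 4.0 } ) << '\n' ; // 2.5
}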
Doubles are really hard to overflow, but adding a double to a garbage value will yield a garbage value. Are you absolutely sure that you don't have uninitialized data or some kind of memory bug? Approximately how many values do you expect there to be, and in what range are those values?
Hi Gyiove, I understand your concern. Potential overflow when calculating the average of a large set of integers is quite a common source of bugs.
When using floating point numbers (as in your example) you don't get overflow; instead there is a loss of significance. With a very large set of numbers the accumulated error can be substantial.
If you know your worst case you might be able to choose an appropriate type for the sum. For example, many compilers provide a 64-bit integer type (perhaps called long long or __int64).
However, if you have an indefinite count of numbers to deal with (i.e. unknown beforehand) you have a problem. I think this is what you were getting at when you mention "infinite times".
In that case I think you need an infinite precision math library. An astonishingly good one for C/C++ is GMP (see http://www.gmplib.org/ ). I recently incorporated it into the calculator embedded in my hex editor (see http://www.hexedit.com ) and was astounded that it could calculate 1000000! (factorial of 1 million) in less than a minute. (BTW 1000000! is an integer with more than a million digits!)
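If exactness matters more than speed, a minimal sketch with GMP's C++ bindings (gmpxx) might look like the following; the names are illustrative and you would link with -lgmpxx -lgmp:

#include <gmpxx.h>
#include <iostream>

int main()
{
    // mpq_class is an exact rational: neither the sum nor the final division loses precision
    mpq_class sum = 0 ;
    unsigned long count = 0 ;

    for( int i = 0 ; i < 1000000 ; ++i )
    {
        sum += mpq_class( 1, 3 ) ;   // add 1/3 exactly, a million times
        ++count ;
    }

    mpq_class average = sum / count ;
    std::cout << "exact average: " << average << '\n' ;   // prints 1/3
}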
#include <iostream>
#include <string>
#include <random>

int main()
{
    std::mt19937 rng( std::random_device{}() ) ;

    const double minv = 1000.0 ;
    const double maxv = 5000.0 ;
    std::uniform_real_distribution<double> distrib( minv, maxv ) ;

    const long long nvalues = 100'000'000 ;
    const auto nprint = nvalues / 5 ;

    double average = 0 ;

    for( long long n = 0 ; n < nvalues ; ++n )
    {
        const double value = distrib(rng) ;

        // see: https://en.wikipedia.org/wiki/Moving_average#Cumulative_moving_average
        average += ( value - average ) / (n+1) ;

        if( n % nprint == (nprint-1) )
            std::cout << std::fixed << "after " << n+1 << " values, moving average is: " << average << '\n' ;
    }

    std::cout << "\nexpected value: " << (minv+maxv)/2 << " computed average: " << average << '\n' ;
}
after 20000000 values, moving average is: 2999.985586
after 40000000 values, moving average is: 2999.888527
after 60000000 values, moving average is: 3000.001183
after 80000000 values, moving average is: 3000.069204
after 100000000 values, moving average is: 3000.063402
expected value: 3000.000000 computed average: 3000.063402