I got 131242625471.8555 as output, which is within a relative error of about 2^-32 of 131242625454.7232, the value I got for the definite integral of log2(x) between 1 and 2^32, which I had estimated to be equal to, or at least very close to, the exact value of the sum.
Considering one computation took billions more floating-point operations than the other, should I trust the value of the integral to be the exact value, or did the operations I performed in the program preserve enough precision for its result to be trusted?
The program was compiled with g++ test.cpp -o test -O3 -s and run on a Core 2 Duo.
That sum is not equal to the integral. Consider both methods on the range (0; 1].
In fact I was just about to complain that your method is wrong, since the integral is smaller than the sum... wait, is "to be just about to" the correct expression?
I know, but it's definitely possible that the integral is closer to the actual value of the sum than what the program returned. The error being such a special number is certainly suspicious.
The question is more about the precision of floating-point operations, particularly in the FPU-to-memory transition, than about the math.
Consider both methods for range (0; 1].
Is it possible to integrate over open intervals? EDIT: Ah, I see. Improper integrals are limits. In any case, the point is moot because there are no natural numbers in (0;1), so the sum doesn't make sense, either.
wait, is the correct expression "to be just about to" ?..
Not that it matters as the thread is about precision (which I know nothing about), but I have improved my bound slightly.
Consider triangles with vertices (x, log(x)), (x+1, log(x)), and (x+1, log(x) + 1/(x+1)), so the hypotenuse has gradient 1/(x+1) — the slope of log at x+1 — and passes through the point (x, log(x)). Because log(x) is concave, the hypotenuse can't cross log(x) on [x, x+1], so if we add up their area we still stay below the integral. Their total area is (H_n - 3/2)/2, where H_n is the nth harmonic number.
Substituting this into the inequality gives

s_n - log2(n) + (H_n - 3/2)/2 < i_n
Because H_n appears with a positive coefficient on the smaller side of the inequality, we can replace it with a lower bound. It just so happens that ln(n) is a lower bound for the harmonic numbers, so substituting and tidying up gives:
s_n < i_n + log2(n)(1-ln(2)/2) + 3/4
So the difference between your sum and the integral can be at most 21.66.
You can do exactly the same trick above log(x) and derive that the difference must be at least 10.59.
If that's true, then it proves that, whatever the actual value of the sum, the directly computed result will always be closer to it than the integral. Which, after thinking about it for a while, is to be expected. There's a whole chunk of area the integral misses at every step, and it looks a bit like this: [image]
I really wanted a series expansion where I could bound the error term, and hence calculate the result to a desired accuracy, but the integral for the Chebyshev coefficients looked horrible, and Maple confirmed it.
I do think it's quite neat that with relatively simple maths, and a little picture, you can derive
10.59 < s - i < 21.66
A little bit of fun, yeah. But anything is fun when you've still got 50 exams to mark on the independent samples t test. I think I just miss real maths. I don't do anything like this anymore.
Would you happen to know why when I give sum(i=1,2^32,log2(i)) to a CAS, it gives back log2(gamma(2^32+1)), when I arrived at the summation from log2((2^32)!)? It just doesn't make sense.
The gamma function is a generalization of the factorial: for natural n, Gamma(n) = (n-1)!, so Gamma(2^32 + 1) = (2^32)!. Your CAS has basically undone the manipulation that you did.
I am supposed to be finishing my PhD, but I tend to rock climb and run most of my time. I also work part time for a consultancy that looks at the scheduling of home care nurses/staff.
Actually, some of the parts that most interest me in CS overlap with mathematics (e.g. language theory and anything to do with symbolic manipulations; computer graphics, which applies many tools from linear algebra and calculus; data compression), so I guess I'll end up moving to a similar region as you from a different direction. Hopefully I won't then complain about a lack of proper programming.
Approx theory and numerical analysis are pretty much just applied analysis. Metaheuristic optimization, at least what I do, is simulation and graph theory.