A question about the input arguments of main().
My program receives a string representing a number when it is called, e.g. "1.6".
At the start of the code, this string is converted to a double with atof(argv[i]).
Yet, if I print out this double, it is not 1.60000..., but 1.6000000000000000888178...
Why does this happen, and what should I do to ensure that the program takes the string "1.6" as the exact double "1.6"?
Edit: I should note that I have used cout << fixed << setprecision(30);
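For reference, a minimal sketch of what I am doing (simplified; here I assume the number arrives as argv[1]):

#include <cstdlib>   // atof
#include <iostream>
#include <iomanip>

int main(int argc, char* argv[])
{
    double d = std::atof(argv[1]);   // e.g. called with "1.6"
    std::cout << std::fixed << std::setprecision(30) << d << '\n';
    // prints 1.600000000000000088817841970013 rather than exactly 1.6
}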
You are trying to print more digits than the type has, and it is making up the least significant digits; that's the short version of a longer answer.
A double has, if I remember right, 15 decimal digits of precision for sure, and after that you are at risk.
Printed format aside, doubles cannot be trusted to store exact values. There are infinitely many numbers between 0 and 1 (take 1/x for every x greater than 1 to see this). You can't represent all of them exactly with a finite number of bits. That idea leads to the fact that there are values a double cannot store exactly, even some that you would think it surely should...
You can use the extended FPU doubles for a little more precision (10 bytes on older machines; some newer machines have other formats like 128-bit, 12-byte, etc., it varies a little).
It's a pain to use those if your compiler does not have an extension for it. After you run out of those, you have to use a slow emulation class that can represent anything you can fit in memory, but those are very, very slow for iteration-bound work.
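A minimal sketch to check what your own machine gives you (the long double result varies by platform and compiler):

#include <iostream>
#include <limits>

int main()
{
    // digits10 = decimal digits the type can hold without change
    std::cout << "double:      "
              << std::numeric_limits<double>::digits10 << " digits\n";       // 15
    std::cout << "long double: "
              << std::numeric_limits<long double>::digits10 << " digits\n";  // often 18
}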
I am a bit shocked, to say the least; I thought double precision could easily handle 30 digits.
There is another part of the code:
I add the "strange" representation of my 1.6 to a double initialized in the code, 3.0.
I print out the result and it gives "4.59999999999999964...".
Even though my printed-out version of 1.6 is actually > 1.6, the result of this addition prints as < 4.6.
Is this the same phenomenon? And does it occur because I force a printout of more digits than a double can handle?
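A minimal sketch of this second observation:

#include <iostream>
#include <iomanip>

int main()
{
    double a = 1.6;   // stored as slightly more than 1.6
    double b = 3.0;   // stored exactly
    std::cout << std::fixed << std::setprecision(30) << a + b << '\n';
    // prints 4.599999999999999964472863211995, slightly less than 4.6
}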
You are accessing more decimal (base 10) digits than a double can store without change; past that point the digits are subject to rounding.
https://en.cppreference.com/w/cpp/types/numeric_limits/digits10
what should I do to ensure that the program takes the string "1.6" as the exact double "1.6"?
Use std::setprecision(DBL_DIG) (you might need to include the <cfloat> header).
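A minimal sketch of that suggestion:

#include <cfloat>    // DBL_DIG
#include <iostream>
#include <iomanip>

int main()
{
    double d = 1.6;
    // DBL_DIG (usually 15) is the most decimal digits a double can
    // reliably hold, so asking for no more than that hides the noise
    std::cout << std::setprecision(DBL_DIG) << d << '\n';   // prints 1.6
}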
It's both :)
You are still printing too far out, but here you also have an inexact result.
I mean, 4.59999999999999999 is really 4.6 ... and 3.0 + 1.6 does equal 4.6,
so it's slightly off, due to representation problems.
This is why you will be told two or three useful things about working with doubles... maybe you have not seen these yet:
If working with just a few digits, use integers instead: 1.6 + 3.0 becomes 16 + 30 = 46 ... you can float the point yourself. Banking/money applications do a lot of this (see the sketch after these tips).
Do not use == on doubles, apart from checking against zero (and even that only if you are SURE it will be EXACTLY zero, and not 0.000000000000000001 etc.).
And most important: if you need that level of precision, re-evaluate what you are doing and what your real needs are. Most people can live with 4.599999999999999 meaning 4.6 ... if you can't, then you have to do something else.
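Here is a minimal sketch of the first two tips (tenths are chosen just for this example):

#include <cstdint>
#include <iostream>

int main()
{
    // tip 1: float the point yourself -- store tenths as integers,
    // so 1.6 becomes 16 and 3.0 becomes 30; integer math is exact
    std::int64_t a = 16, b = 30;
    std::int64_t sum = a + b;                           // exactly 46 tenths
    std::cout << sum / 10 << '.' << sum % 10 << '\n';   // prints 4.6

    // tip 2: compare doubles against a tolerance, never with ==
    double x = 3.0 + 1.6;
    bool close = x > 4.6 - 1e-9 && x < 4.6 + 1e-9;      // true
    std::cout << std::boolalpha << close << '\n';
}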
Maybe a word about my situation: I deal with a physical particle simulation and I want to test time reversibility: if my particle starts at a certain point X and moves to a final point Y during the simulation, it should be possible to reverse the simulation, i.e. if the particle starts at Y, it should reach point X in the reverse simulation.
At the beginning of the backward motion, it strictly follows its original trajectory. However, there are slowly growing deviations from the original trajectory, first of order 10^-12, then 10^-11, 10^-10 and so on, and they grow exponentially.
I thought this had something to do with the fact that, for the backward motion, I manually put the particle at position Y, the final position of the forward motion. That position Y, however, was printed out with cout with just a few decimal digits after the forward motion. So if I used this finite representation of Y as an input parameter of the backward motion, it should lead to small deviations.
So I increased the precision of the cout output with setprecision() to get a more accurate representation of the Y position, and that is when I observed these strange effects.
If I understood you correctly, the Y output of the forward motion is only valid up to 15 digits. Thus, if I start my backward motion from that position value, I will make an error of around 10^-16?
I need to figure out whether this could lead to dramatic effects. That should depend on how chaotic my system is...
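One idea I will try, if I understand the documentation right: printing Y with max_digits10 digits instead of a handful should make the text-to-double round trip exact. A minimal sketch of that idea:

#include <cstdlib>
#include <iostream>
#include <iomanip>
#include <limits>
#include <sstream>

int main()
{
    double y = 0.1 * 3.0;   // stand-in for the final position Y

    // max_digits10 (17 for double) digits are enough to recover
    // the identical stored value when the text is read back in
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << y;
    double y2 = std::atof(out.str().c_str());

    std::cout << std::boolalpha << (y == y2) << '\n';   // true
}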
what should I do to ensure that the program takes the string "1.6" as the exact double "1.6"?
In C++ there is no such thing as an "exact double"; all floating point numbers are approximations. You can get "errors" in any range of values, since there are decimal numbers that can't be accurately represented in binary.
Use large integers as I said, and float the point yourself. It will still be an approximation, but it will be a reversible one.
What you describe sounds similar to CFD, which is fairly challenging work. I don't know if nearly 80 years of CFD tips and tricks from other engineers would apply to your efforts or not... but people did an awful lot of good work with a lot less power and precision than you have available today. Also, at its heart, physics has a LOT of approximations anyway. Maybe you are trying too hard to be exact in a field where that is not necessary? I don't know, just asking you to ponder a little if you have not already.
To understand the issues with floating-point math, give Goldberg's famous paper a look: https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf
One fundamental idea from that paper is that the magnitude of rounding error increases, in general, with the magnitude of the value being represented.
There is a whole field called numerical analysis which is partly about designing numerically stable algorithms that give reasonable results in the presence of errors. Unfortunately, I know nothing about this sort of thing. Maybe @lastchance can help you?
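A minimal sketch of that magnitude effect, using std::nextafter to measure the gap between adjacent doubles:

#include <cmath>
#include <initializer_list>
#include <iostream>

int main()
{
    // the gap to the next representable double grows with the
    // magnitude of the value, and so does the worst-case rounding error
    for (double x : {1.0, 1000.0, 1e6, 1e9, 1e12})
        std::cout << x << ": gap " << std::nextafter(x, 2 * x) - x << '\n';
}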
0.6 (or 3/5) in binary is 0.100110011001100...
So it can't be stored exactly in a finite number of bits.
If you need perfect decimal fractions then you pretty much have to do the math in decimal.
(Doesn't bc do its math in decimal?)
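A minimal sketch showing what actually gets stored for 0.6 (the repeating pattern, rounded off at 53 bits):

#include <iostream>
#include <iomanip>

int main()
{
    double d = 0.6;
    std::cout << std::hexfloat << d << '\n';
    // 0x1.3333333333333p-1 -- the repeating 0011 pattern, cut off at 53 bits
    std::cout << std::fixed << std::setprecision(20) << d << '\n';
    // 0.59999999999999997780 -- the decimal value of that rounded bit pattern
}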
@Jonnin
It is not CFD, but it is still a many-particle simulation.
I am also of the opinion that the limited accuracy of double should not be the problem here (even though the precision is weaker than I thought), but I still have to consider the possibility. If I don't find the problem somewhere else, I will look into using large integers. Thanks!
@mbozzi
Thanks for the paper, I will take a look! Indeed, whether the rounding error increases exponentially depends on the chaoticity of my system, i.e. on the numerical equations. However, it is not that chaotic, so my problem most likely lies somewhere else.
It's just good to know now that double will 'only' allow precision to 10^-15, which I did not know.
I know it's not CFD, but there may be some overlap in approaches to handling the problem. I do not know; it was just a suggestion, since that problem has been studied to death.
I have another question on the double precision topic.
Now, I have the following issue:
I am using fixed together with setprecision(14), i.e. I will have 14 digits after the decimal point.
Try to follow this calculation:
Suppose I manually specify four doubles, bx=1500, ex=4500, r=4499, R=0.1.
Now, my simulation calculates some (correct) double value s=0.9997 (printed out, it is 0.9997 0000 0000 00).
I calculate diff = ex - bx. It is given as the correct 3000.0000 0000 0000 00.
Now, I print out m = diff * s. Result: 2999.0999 9999 9999 91.
At this point, I am aware that only 12 digits after the decimal point are significant, so the result is correct.
First question: shouldn't the computer have rounded the result to 2999.100...?
Since it can't 'understand' the last 9s, I thought it would round up.
Now, I calculate c = bx + m, so 1500 + 2999.0999 9999 9999 91.
It is printed as 4499.1000 0000 0000 36.
At this point, it looks like the computer really did round earlier, at least internally. Did it only show the 2999.09999... because I forced it to with setprecision(14)?
Likewise, the 4499.1000 0000 0000 36 should simply be 4499.1000... to the computer.
Now the final question: I calculate c - r, i.e. 4499.1000 0000 0000 36 - 4499.
Result: 0.1000 0000 0000 36.
Now this is a problem! Suddenly, the 36 from before are significant digits!
I don't know how this could happen. Shouldn't the computer have rounded the earlier results? I want my result to be 0.10000 =(
Take the number 2999.0999 9999 9999 91.
A double has 15-16 significant digits. I have printed out 14 digits after the decimal point.
4 digits are in front of the decimal point, which leaves 11 significant digits after it.
Since the 12th digit after the decimal point is a 9, I assumed that the computer would round this.
I mean, on the one hand it can only have 15 significant digits; on the other hand, there are obviously more digits there. What is it doing internally with these excess digits? I thought it would just round them away and then work with 15 significant digits.
At least that's what a human would do.
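For reference, a minimal sketch that reproduces the whole chain (same names as above; the comments show what it prints on my machine):

#include <iostream>
#include <iomanip>

int main()
{
    double bx = 1500, ex = 4500, r = 4499, s = 0.9997;

    double diff = ex - bx;    // 3000, exact
    double m    = diff * s;   // nearest double lies just below 2999.1
    double c    = bx + m;     // rounding lands just above 4499.1
    double res  = c - r;      // this subtraction is exact, so the
                              // leftover error becomes fully visible

    std::cout << std::fixed << std::setprecision(14)
              << m   << '\n'    // 2999.09999999999991
              << c   << '\n'    // 4499.10000000000036
              << res << '\n';   // 0.10000000000036
}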
It depends on where those digits really live.
If you are just printing them to the screen, again, it's making them up. You can ask printf (and probably cout) to print a double with %1.50f and it will attempt to oblige. It will happily print junk.
#include <cstdio>

int main()
{
    double d = 1.234;
    printf("%1.50f\n", d);   // ask for far more digits than a double holds
}
Result:
1.23399999999999998578914528479799628257751500000000201
I know it's crap, you know it's crap, but printf is eager to please. I don't know if it is doing some idiotic math inside to generate the extra digits or if it is reading past where it should be looking, but the back end of that is just randomly crafted junk, whatever its source.
And then there is the FPU.
The FPU uses more bits than you can see. If you think you have 64 bits in a double, odds are that down in the hardware it is using 80 bits. It does this to minimize some roundoff/error problems, sure. But it is not going to round for you or anything else; it is just trying to minimize the damage via extra precision in intermediate steps. I don't think it does a lot of corrections, maybe in specific embedded circuits, but generally no.
The computer is not a human. It's a bunch of circuits. It cannot think. GIGO. TOM. Live and breathe those 2 concepts until it sinks in :)
@jonnin
My problem is that these final 'garbage digits' suddenly become significant.
If I calculate 4499.1000 0000 0000 36 - 4499 I get 0.1000 0000 0000 36.
Now these 36 at the end, formerly insignificant garbage that was just generated somehow to please my setprecision() output, are significant.
To the computer, this new number is > 0.1, even though it shouldn't be...
My problem is that these final 'garbage digits' suddenly become significant.
Those "garbage digits" are always significant.
If I calculate 4499.1000 0000 0000 36 - 4499 I get 0.1000 0000 0000 36.
Yes, you calculated a value of 4499.10000000000036, which is different from 4499, so when you do your subtraction you will be left with the fractional part, 0.10000000000036.
Now these 36 at the end, formerly insignificant garbage that was just generated somehow to please my setprecision() output, are significant.
The setprecision() manipulator doesn't affect the actual number stored in memory; it just alters the display of that number. The value held in memory is not "magically" altered by setprecision(): the value in memory really is 4499.1000 0000 0000 36, or thereabouts. Floating point math is inherently imprecise, with much more than just round-off problems.
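A minimal sketch of that last point: the stream precision changes only the printed text, never the stored bits:

#include <iostream>
#include <iomanip>

int main()
{
    double c = 4499.10000000000036;   // the value from your calculation
    double r = 4499;

    std::cout << std::fixed;
    std::cout << std::setprecision(4)  << c << '\n';      // 4499.1000
    std::cout << std::setprecision(14) << c << '\n';      // 4499.10000000000036
    // the low display precision above did not "clean" the value in memory:
    std::cout << std::setprecision(14) << c - r << '\n';  // 0.10000000000036
}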