Help with Float point refrencing


https://youtu.be/KnCNfBb2ODQ

Near the end of the video, the presenter mentions that to zoom in indefinitely you need a different floating-point representation. I've done a fair amount of research, but I still have no idea how you would get floats that small. Any suggestions? I'm using Visual Studio C++ and have a bit of experience.

Ganado (6896)

For others: Timestamp around 50:00 is when he starts to talk about how it breaks down at a certain zoom level.

But yeah, floats (32-bit) and doubles (64-bit) still have finite precision, so once you zoom in enough, you will eventually run out of precision. If you want to be able to zoom in farther, you have to have higher precision.
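As a rough illustration of where the precision runs out, the sketch below keeps halving the width of the view until the coordinate of the neighbouring pixel rounds to the same value as the current one. The function name, the view centre and the pixel count are made up for the example; they do not come from the video.

```cpp
// Halves the view width until the coordinate of the next pixel rounds to
// the same value as the current one, and returns the number of halvings.
// T is the floating-point type under test (float or double).
template <typename T>
int usable_zoom_steps(T center, T width, int pixels) {
    int steps = 0;
    for (;;) {
        // Coordinate of the neighbouring pixel, rounded to type T.
        T next = center + width / pixels;
        if (next == center) return steps;  // pixels can no longer be told apart
        width /= 2;                        // zoom in by a factor of two
        ++steps;
    }
}
```

Calling `usable_zoom_steps<float>(-0.75f, 3.0f, 1000)` and `usable_zoom_steps<double>(-0.75, 3.0, 1000)` should show the double lasting for considerably more halvings than the float, but both stop eventually.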

Some compilers offer 128-bit floating-point types as extensions; GCC is one:
https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html

The hardware itself may not support 128-bit (or higher) operations, so you will likely need an arbitrary-precision arithmetic library (or your own software-implemented float).
I have never used it, but MPFR is one such library: https://www.mpfr.org/


keskiverto (10425)

The floating point values supported by language/hardware are essentially {sign, exponent, significand} triplets:
https://en.wikipedia.org/wiki/Single-precision_floating-point_format
https://en.wikipedia.org/wiki/Double-precision_floating-point_format

If you do need unsupported triplets, then those have to be implemented somehow with use of the supported types. Those arbitrary-precision arithmetic libraries are such implementations.
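To make the triplet concrete, here is a small sketch that splits a double into its three fields. It assumes the platform uses the IEEE 754 double-precision layout described in the second link (1 sign bit, 11 exponent bits, 52 fraction bits); the struct and function names are invented for the example.

```cpp
#include <cstdint>
#include <cstring>

// The three fields of an IEEE 754 double, as raw bit patterns.
struct DoubleParts {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 11 bits, stored with a bias of 1023
    std::uint64_t fraction;  // 52 bits, the part of the significand after the leading 1
};

DoubleParts decompose(double v) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // copy the bit pattern without aliasing issues
    DoubleParts parts;
    parts.sign = static_cast<unsigned>(bits >> 63);
    parts.exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    parts.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return parts;
}
```

For 1.0 this gives sign 0, stored exponent 1023 (that is, 2^0 after removing the bias) and fraction 0. A wider triplet needs more bits than the hardware type has, which is what the libraries provide.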

Kingfrankbob (4)

"or make your own software-implemented float": I was working on this. My idea was to do all the multiplication on a number, round it, and then multiply it by 10 to whatever power I need. I have also looked at MPFR and its documentation. Do I set the precision to a low number so it can go deeper? I apologize if this is a bit confusing.

I tried the precision float format, and I also tried scientific notation, but I didn't know specifically how to implement it. I'm still working on it, though!

dhayden (5799)

I coded this in 1988 using 64-bit fixed point numbers. It's a lot faster than floating point. If you go to 128-bit on a modern CPU it would probably really fly.

I really must port that code. It generated stunning, multi-color views, rendered very quickly (with some shortcuts in the algorithm), let you pan and zoom, and did the Julia set, which is related.

Kingfrankbob (4)

How would I make a fixed point float? To be honest, I'm kind of confused, because I can't even use a function to count the digits of a double/float: it goes from 9999 all the way to 10^324.

jonnin (11497)

Floating point and fixed point are opposites, so you can't make a fixed-point float.
You can, however, store a real number as an integer. For example, pi could be stored in a 64-bit integer with MORE digits than a double holds, if you know that it has been multiplied by 10 to the 17th or whatever power. It is just 31415926... as an integer, and your code is written and scaled KNOWING where the decimal point is (that is fixed point).
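A minimal sketch of that idea, assuming a scale of 10^18 (the largest power of ten for which pi still fits in a signed 64-bit integer). The constant and the function name are invented for the example; the only "fixed point" part is that the code knows where to put the decimal point when printing.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// pi multiplied by 10^18 and truncated: 19 significant digits,
// more than the 15 to 17 decimal digits a double can hold.
constexpr std::int64_t kPiTimes1e18 = 3141592653589793238;

// Formats a scaled integer by inserting the decimal point `decimals`
// digits from the right. Assumes v is not the most negative int64 value.
std::string format_fixed(std::int64_t v, int decimals) {
    const bool negative = v < 0;
    std::string digits = std::to_string(negative ? -v : v);
    // Pad with leading zeros so at least one digit precedes the point.
    if (digits.size() <= static_cast<std::size_t>(decimals)) {
        digits.insert(0, decimals + 1 - digits.size(), '0');
    }
    digits.insert(digits.size() - decimals, ".");
    return (negative ? "-" : "") + digits;
}
```

Addition and subtraction work directly on the stored integers, so `format_fixed(kPiTimes1e18 + kPiTimes1e18, 18)` formats 2*pi with the same number of decimals. Multiplication needs the result divided by the scale again, and with a scale this large the intermediate product overflows 64 bits.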

But your range is too large for a simple fixed point. You need either a better floating point or a large integer library, or you need to write your own code. The language does not support both high precision and gigantic exponent representation, and neither does the hardware: that is why you have to roll your own.

log10 can be used to count the digits of a number, with a little hand waving. But, here again, it can't do something like count all the ones and such -- the data isn't there.


dhayden (5799)

How would I make a fixed point float

Let's clear up some terminology. "Floating point" is a general term for a way to represent real numbers. In floating point, the location of the decimal point is not fixed. Instead, it "floats" around and its exact location is usually given by an exponent.

"Fixed point" is a way of representing real numbers where the decimal point is fixed in the same location every time.

Also, since we aren't talking about decimal numbers, let's use "radix point" instead of "decimal point" to denote the place where bits represent values smaller than 1.

Okay, for Mandelbrot, you can make a 64-bit fixed-point value by using 64-bit ints and simply defining where the radix point goes. If I recall correctly, each iteration of the calculation results in a value less than 4, so you can define your fixed-point number as having its two most significant bits represent the values 2 and 1.

Now just go back to 5th grade math to figure out how to do the arithmetic. You'll find that the only thing you need to change is when you do multiplication and division because the radix point in the result moves.
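Here is a rough sketch of binary fixed point applied to the Mandelbrot iteration. It is not the 1988 code, and the layout is deliberately more conservative than the one described above: it assumes only 28 fraction bits inside a 64-bit integer, and points c with |c| <= 2, so that the intermediate product of two values always fits in 64 bits on any compiler. A tuned version would give far more bits to the fraction and use a 128-bit intermediate product. All names are invented for the example.

```cpp
#include <cstdint>

// Fixed-point value: the stored integer means (stored / 2^28).
using fixed = std::int64_t;
constexpr int kFracBits = 28;
constexpr std::int64_t kOne = std::int64_t{1} << kFracBits;  // represents 1.0

fixed from_double(double v) { return static_cast<fixed>(v * kOne); }
double to_double(fixed v) { return static_cast<double>(v) / kOne; }

// Addition and subtraction are plain integer operations.
// A product carries 56 fraction bits, so the radix point has to be
// moved back by dividing by the scale once.
fixed mul(fixed a, fixed b) { return a * b / kOne; }

// Counts iterations of z = z*z + c until |z|^2 > 4 or max_iter is reached.
int mandel_iterations(fixed cr, fixed ci, int max_iter) {
    const fixed four = 4 * kOne;
    fixed zr = 0, zi = 0;
    for (int i = 0; i < max_iter; ++i) {
        const fixed zr2 = mul(zr, zr);
        const fixed zi2 = mul(zi, zi);
        if (zr2 + zi2 > four) return i;          // escaped
        const fixed new_zi = 2 * mul(zr, zi) + ci;
        zr = zr2 - zi2 + cr;
        zi = new_zi;
    }
    return max_iter;                             // treated as inside the set
}
```

For example, `mandel_iterations(from_double(-1.0), from_double(0.0), 100)` never escapes, while a point such as (1, 1) escapes after a couple of iterations. Only `mul` differs from ordinary integer arithmetic, which is the point made above about multiplication moving the radix point.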

Kingfrankbob (4)

So I was thinking about it, and I am still slightly confused (which is fine, don't worry). I was thinking about having a set number like 1.6 and then making it move but coloring

#include <cmath>

for (int I = 1; I < 222; I++) {
    double re = 1.6;  // a double, because an int would truncate 1.6 to 1
    long double Printed = re * pow(10, I);  // re scaled by 10^I
}

Is this what you're trying to explain? I apologize, I haven't worked with small numbers. Are there any exercises I could try to help me familiarize myself with this?

jonnin (11497)

Yes.
The simplest, most basic thing would be to use base 10 (inefficient but simple), so some number * some power of 10 stored in an integer is what you have above, and it is correct.
All you have to do is normalize your values so that the decimal point is in the same place for each integer. Again, pi might be 31415.92... in your system, in which case you divide by some power of 10 to get it back to 'normal'.
pow is very inefficient for integer powers; write your own or use a lookup table for powers of your base (10, or 2, or 16, or 64, or whatever you want).
Log- and exp-type functions can help here for some tasks but, like pow, they are sluggish if abused.
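As a sketch of the lookup-table suggestion, assuming base 10 and signed 64-bit integers (which can hold powers up to 10^18), something like this avoids calling pow for integer powers. The names are invented for the example.

```cpp
#include <array>
#include <cstdint>

// Largest power of ten that fits in a signed 64-bit integer.
constexpr int kMaxPow10 = 18;

// Builds the table 10^0, 10^1, ..., 10^18 by repeated multiplication.
inline std::array<std::int64_t, kMaxPow10 + 1> make_pow10_table() {
    std::array<std::int64_t, kMaxPow10 + 1> table{};
    std::int64_t p = 1;
    for (int i = 0; i <= kMaxPow10; ++i) {
        table[i] = p;
        if (i < kMaxPow10) p *= 10;  // stop before overflowing past 10^18
    }
    return table;
}

// Returns 10^exp. The caller is assumed to pass 0 <= exp <= 18.
inline std::int64_t pow10_int(int exp) {
    static const auto table = make_pow10_table();  // built once, on first use
    return table[exp];
}
```

The result is an exact integer, whereas pow works in floating point and returns a double.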

Just making a tool to do this math would be plenty good as an exercise.
First, try doing it using just 64-bit integers. Decide whether you want your tool to handle negative values or not.
Then, if you need bigger values, you can extend that to an array of integers. I recommend that the least significant chunk go in array[0] and the next in [1] .. [n], as it makes processing much easier.
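A minimal sketch of the array-of-integers idea, assuming unsigned values stored in 32-bit chunks with the least significant chunk at index 0. Only addition is shown, and the names are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// An unsigned big integer: limbs[0] holds the least significant 32 bits.
using Limbs = std::vector<std::uint32_t>;

// Schoolbook addition, chunk by chunk, carrying into the next chunk.
Limbs add(const Limbs& a, const Limbs& b) {
    Limbs sum;
    std::uint64_t carry = 0;
    const std::size_t n = std::max(a.size(), b.size());
    for (std::size_t i = 0; i < n || carry != 0; ++i) {
        std::uint64_t t = carry;
        if (i < a.size()) t += a[i];
        if (i < b.size()) t += b[i];
        sum.push_back(static_cast<std::uint32_t>(t));  // keep the low 32 bits
        carry = t >> 32;                               // the rest carries over
    }
    return sum;
}
```

Because the least significant chunk comes first, the loop runs in the same direction as the carry, and a result that grows by one chunk is just a push_back at the end.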

There are a lot of ways to deal with these kinds of problems. Usually you would get a library, since someone has already done it, but these tools are easy enough to write for basic use cases. You can also 'float' by keeping the power of 10 and the value in two variables and doing extra math to get that to fall into place.
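The two-variable approach could look roughly like the sketch below, which keeps an integer value and a power of 10 side by side. The struct and function names are invented for the example, and overflow of the integer product is not handled.

```cpp
#include <cstdint>

// A home-made decimal float: the number represented is value * 10^exp10.
struct Decimal {
    std::int64_t value;
    int exp10;
};

// Strips trailing zeros from the value so equal numbers look the same.
Decimal normalize(Decimal d) {
    if (d.value == 0) return {0, 0};
    while (d.value % 10 == 0) {
        d.value /= 10;
        ++d.exp10;
    }
    return d;
}

// Multiplication: multiply the values and add the exponents.
// Assumes the product of the two values fits in 64 bits.
Decimal multiply(Decimal a, Decimal b) {
    return normalize({a.value * b.value, a.exp10 + b.exp10});
}
```

For instance, 1.6 is {16, -1} and 0.25 is {25, -2}; multiplying them gives {400, -3}, which normalizes to {4, -1}, that is 0.4. Addition is the part that needs the "extra math", because both operands first have to be brought to the same exponent.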


Topic archived. No new replies allowed.
