OK, I would recommend you use double rather than float, because it has about 15 to 16 significant digits of precision, whereas float only has about 7, which probably won't be enough for your app.
Also, when testing, use a small dataset, say 1000 records. Get everything working with that, then test on the full set. Remember that with the full data you are going to need some sort of tree structure to store it all in, otherwise (as I said) your computer might sit there for an hour and then run out of memory.
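If you want to see the difference for yourself, something like the following would do. It is only a sketch: the names kFloatDigits, kDoubleDigits and float_rounding_error are made up for the example, and the digit counts in the comments assume the usual IEEE 754 float and double found on desktop machines.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Decimal digits guaranteed to survive storage in each type.
constexpr int kFloatDigits  = std::numeric_limits<float>::digits10;   // 6 on IEEE 754 platforms
constexpr int kDoubleDigits = std::numeric_limits<double>::digits10;  // 15 on IEEE 754 platforms

// How far a value moves when it is squeezed into a float and widened again.
double float_rounding_error(double value) {
    assert(std::isfinite(value));
    const float narrowed = static_cast<float>(value);
    return std::fabs(static_cast<double>(narrowed) - value);
}
```

With a coordinate like 123456789.123, the float copy is already off by more than a whole unit, while the double keeps every digit shown.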
With that much data, it is not a trivial task that you are doing.
For now, (just to have a go), you could have several scenarios. (This is just for the small dataset, the full one will be completely different)
The first one is this: (it's a C approach)
Put the 5 values into a struct:
struct Galaxy {
    //plain C does not allow default values inside a struct, so the members are set after the object is created
    double dXOrdinate;  //put description here, units etc
    double dYOrdinate;  //put description here, units etc
    double dBrightness; //put description here, units etc
    double dMass;       //put description here, units etc
    double dSize;       //put description here, units etc
};
struct Galaxy AGalaxy; //creates a galaxy object named AGalaxy
//this shows how to deal with one struct object
AGalaxy.dXOrdinate = 100.0; //initialisation
AGalaxy.dYOrdinate = 500.0;
AGalaxy.dBrightness = 2.0;
AGalaxy.dMass = 600.0;
AGalaxy.dSize= 200.0;
typedef struct Galaxy TheGalaxy; //now we can use TheGalaxy as a type instead of struct Galaxy everywhere.
TheGalaxy GalaxyArray[1000]; //an array of galaxy structs, starts at index 0
//Now some pseudo code
double Field1 = 0.0;
double Field2 = 0.0;
double Field3 = 0.0;
double Field4 = 0.0;
double Field5 = 0.0;
// open the file with galaxy info in it
int counter = 0;
//stop at end of file or after 1000 lines, whichever comes first
for (counter = 0; counter < 1000; counter++) { //does 1000 lines worth
    //read 1 line worth of info
    //extract the 5 fields into Field1 to Field5, then put them into the struct
    GalaxyArray[counter].dXOrdinate = Field1;
    GalaxyArray[counter].dYOrdinate = Field2;
    GalaxyArray[counter].dBrightness = Field3;
    GalaxyArray[counter].dMass = Field4;
    GalaxyArray[counter].dSize = Field5;
}
Benefits of the C approach:
Performance and memory management will be a big thing here. There is less overhead with these plain C data structures than with the C++ containers, particularly list, which allocates a separate node with two extra pointers for every element.
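The pseudo code above could be fleshed out roughly like this. It is a minimal sketch, written in C++ so it can use streams, and it assumes each line of your file holds five whitespace-separated numbers in the order x, y, brightness, mass, size. That layout and the helper name read_galaxies are my assumptions, not something I know about your data.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct Galaxy {
    double dXOrdinate = 0.0;
    double dYOrdinate = 0.0;
    double dBrightness = 0.0;
    double dMass = 0.0;
    double dSize = 0.0;
};

// Reads up to `capacity` galaxies from `in` and returns how many were stored.
// Assumed layout: five whitespace-separated numbers per line.
std::size_t read_galaxies(std::istream& in, Galaxy* out, std::size_t capacity) {
    assert(out != nullptr || capacity == 0);
    std::size_t count = 0;
    std::string line;
    while (count < capacity && std::getline(in, line)) {
        std::istringstream fields(line);
        Galaxy g;
        if (fields >> g.dXOrdinate >> g.dYOrdinate >> g.dBrightness
                   >> g.dMass >> g.dSize) {
            out[count++] = g;  // only keep lines that parsed completely
        }
    }
    return count;
}
```

Passing a std::ifstream opened on your data file works the same way as any other input stream. Lines that do not contain five numbers are skipped rather than stored half-filled.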
Now the C++ approach
Have a vector of the 5 doubles, then make a <list> of these.
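As a rough illustration of that idea, here is a list whose elements are vectors of five doubles. The field order in the enum and the helper names add_galaxy and total are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Assumed field order inside each record.
enum Field : std::size_t { X = 0, Y, Brightness, Mass, Size, FieldCount };

using GalaxyRecord = std::vector<double>;  // always FieldCount entries
using GalaxyList   = std::list<GalaxyRecord>;

// Appends one galaxy, checking that all five values are present.
void add_galaxy(GalaxyList& galaxies, const GalaxyRecord& record) {
    assert(record.size() == FieldCount);
    galaxies.push_back(record);
}

// Sums one field over every galaxy in the list.
double total(const GalaxyList& galaxies, Field field) {
    double sum = 0.0;
    for (const GalaxyRecord& record : galaxies) {
        sum += record[field];
    }
    return sum;
}
```

A single std::vector of the Galaxy struct would probably be the more compact choice, since it keeps all the records in one contiguous block.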
Now some other observations:
Galaxies are in 3D, so do we need a ZOrdinate? Do we need a name for each one? I think so.
It could also be worth having a distance to the galaxy; see below as to why.
Now to the tree data structure:
Binary trees provide quick performance, because you can halve the data you need to search in each iteration. Google binary search trees (BST).
In a BST, everything in a node's left subtree is less than the node and everything in its right subtree is greater, so each time you ask a question you discard one subtree. If the tree is balanced, that roughly halves the remaining data each time, which gives logarithmic, O(log n), search.
To do searching, you need something to search by, so this is why I think we need the distance field. The name is no good and neither are any of the others.
We can still keep our Galaxy struct, we just need to decide what Abstract Data Type to put it in.
We could program our own BST in C, but that would be reasonably involved.
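To give a feel for what writing our own would involve, here is a bare-bones sketch in C++ (not C) keyed on distance only. The names Node, bst_insert and bst_contains are invented for the example, and it does no balancing and no deletion, which is where most of the real work would be.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

// One node of an unbalanced binary search tree keyed on distance.
struct Node {
    double distance;
    std::unique_ptr<Node> left;   // keys smaller than this node
    std::unique_ptr<Node> right;  // keys greater than or equal to this node
    explicit Node(double d) : distance(d) {}
};

// Walks down from the root and attaches the new key at the first empty slot.
void bst_insert(std::unique_ptr<Node>& root, double distance) {
    assert(!std::isnan(distance));  // NaN would break the ordering
    std::unique_ptr<Node>* slot = &root;
    while (*slot) {
        slot = (distance < (*slot)->distance) ? &(*slot)->left : &(*slot)->right;
    }
    slot->reset(new Node(distance));
}

// Returns true if the key is present; each step discards one subtree.
bool bst_contains(const std::unique_ptr<Node>& root, double distance) {
    const Node* node = root.get();
    while (node) {
        if (distance == node->distance) return true;
        node = (distance < node->distance) ? node->left.get() : node->right.get();
    }
    return false;
}
```

Because it never rebalances, feeding it distances that are already sorted turns it into a linked list, so the log-order search only holds for reasonably shuffled input.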
In C++, the STL's map and multimap are typically implemented as red-black trees (a self-balancing BST), so they are efficient. Multimaps allow non-unique keys, which could be handy if the distances happened to be the same.
The trouble is, there might still be too much data even if it is in a BST.
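Here is a small sketch of the multimap idea, with distance as the key. Storing just a name as the value is an assumption to keep the example short (you would more likely store the whole Galaxy struct), and count_in_range is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

using GalaxyIndex = std::multimap<double, std::string>;  // distance -> name

// Counts galaxies whose distance lies in [nearest, farthest].
std::size_t count_in_range(const GalaxyIndex& index, double nearest, double farthest) {
    assert(nearest <= farthest);
    auto first = index.lower_bound(nearest);   // first key >= nearest
    auto last  = index.upper_bound(farthest);  // first key > farthest
    return static_cast<std::size_t>(std::distance(first, last));
}
```

lower_bound and upper_bound each take logarithmic time, so finding the ends of a distance range stays cheap even with many entries, and two galaxies at the same distance simply sit next to each other.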
Let me explain what I might do with my LIDAR data.
I have 3.4 million 3D points which come from an airborne scanner. They are spread over 70 sq km (7 km by 10 km).
My proposal is to have map sheets which are 16km by 16km. These are made up of blocks which are 4km by 4km.
The blocks are made up of grid squares 1km by 1km. These are made up of tiles which are 250m sq.
All these are stored in a tree structure. It's not a BST, because there are 4 child objects at each level.
The tile objects will store their points in a BST. The idea is that each tile will have its points in an OS file; the files are only loaded if needed, and are removed from the tree if they aren't needed.
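To make the sheet, block, grid square and tile nesting concrete, here is a sketch of how a point might be addressed inside one map sheet. It assumes x and y are measured in metres from the sheet's south-west corner, and TileAddress, split and locate are names made up for the example, not my actual code.

```cpp
#include <cassert>

// Cell sizes in metres, taken from the sizes quoted above.
constexpr double kSheet = 16000.0, kBlock = 4000.0, kGrid = 1000.0, kTile = 250.0;

struct TileAddress {
    int blockX, blockY;  // block within the sheet (0..3)
    int gridX, gridY;    // grid square within the block (0..3)
    int tileX, tileY;    // tile within the grid square (0..3)
};

// Returns the index of the cell containing `offset` and reduces `offset`
// to the remainder inside that cell.
static int split(double& offset, double cell) {
    const int index = static_cast<int>(offset / cell);
    offset -= index * cell;
    return index;
}

// x and y are metres east and north of the sheet's south-west corner (assumed).
TileAddress locate(double x, double y) {
    assert(x >= 0.0 && x < kSheet && y >= 0.0 && y < kSheet);
    TileAddress a{};
    a.blockX = split(x, kBlock); a.blockY = split(y, kBlock);
    a.gridX  = split(x, kGrid);  a.gridY  = split(y, kGrid);
    a.tileX  = split(x, kTile);  a.tileY  = split(y, kTile);
    return a;
}
```

For a point at (5300, 9100) this gives block (1, 2), grid square (1, 1) and tile (1, 0). The address tells you which tile file to load, so only the tiles you actually touch need to be in memory.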
So maybe you could do something similar with your galaxy info. It's fairly involved, but as I said, your problem is not a trivial one with that much data. The difference between your problem and mine is that yours is truly 3D, whereas mine has 2D rectangles that hold 3D info.
Good luck again