I was reading about std::rand, and apparently it has a bias towards smaller numbers. I'm looking into doing large-scale random number generation, and I need uniform probability over a large sample (10000+ in many cases). In most cases, I think the numbers being generated will be reduced to values of 1000 or less via the modulus operator.
I'm not well versed in random number generation on computers. Beyond the fact that the numbers you get aren't truly random, but based on seeds you pass to the generator, I don't know anything. What I'm looking for is more detail about the bias of std::rand, when it becomes a problem, and if/how issues caused by it can be resolved.
In my opinion, the best approach is to convert the random number to a double in the range [0,1], and then scale it the way you want. The typical method of taking the modulus with % doesn't give you a uniform distribution of random numbers. Try this to generate a uniform random number between 0 and 1:
#include <cstdlib>  // rand, srand, RAND_MAX
#include <ctime>    // time

srand( time(NULL) );  // seed the generator once, e.g. at program start
double x = static_cast<double>(rand()) / static_cast<double>(RAND_MAX);  // x in [0, 1]
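To then scale x into an integer range, say [0, 1000) as mentioned in the question, a minimal sketch might look like this (note the clamp: x can be exactly 1.0 when rand() returns RAND_MAX):

#include <cstdlib>
#include <ctime>

int main() {
    srand( time(NULL) );
    double x = static_cast<double>(rand()) / static_cast<double>(RAND_MAX);
    int n = static_cast<int>(x * 1000);  // scale [0, 1] up to [0, 1000]
    if (n == 1000) n = 999;              // clamp the rare edge case x == 1.0
    return 0;
}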
Or alternatively, if you want a really solid result, use the random number generators from GSL (GNU Scientific Library; C based, so forget about templates when you use it unless you modify the code yourself) or Numerical Recipes (C++ based, but the license is very strict).
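For reference, a minimal sketch of what the GSL route might look like (assuming GSL is installed and the program is linked with -lgsl -lgslcblas):

#include <gsl/gsl_rng.h>

int main() {
    gsl_rng_env_setup();                              // honor GSL_RNG_TYPE/GSL_RNG_SEED env vars
    gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);      // Mersenne Twister generator
    double u = gsl_rng_uniform(r);                    // uniform double in [0, 1)
    unsigned long k = gsl_rng_uniform_int(r, 1000);   // uniform integer in [0, 999], no modulo bias
    gsl_rng_free(r);
    return 0;
}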
@TheDestroyer: I like the simple solution you posted. I'll take some time to test it and see if it gives me satisfactory results. As for GSL, it looks nice, but licensing is a big issue. For the purposes of this project, I'm trying to avoid dealing with licenses as much as I can.
@Caligulaminus: It just so happens I am using a compiler that supports C++0x. I didn't even think about looking at what's new in the world of RNGs. I'll take a look at what it has to offer and see how it compares with the above solution.
Don't go to the hassle and cost of licensing any RNG libs. Everything you need is available for free, in excellent quality. Use the standard library or look at Boost first before doing anything else.
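As a sketch of the Boost route (assuming a reasonably recent Boost.Random):

#include <ctime>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int_distribution.hpp>

int main() {
    boost::random::mt19937 gen(static_cast<unsigned int>(std::time(0)));  // seeded Mersenne Twister
    boost::random::uniform_int_distribution<> dist(0, 999);               // unbiased integers in [0, 999]
    int n = dist(gen);
    return 0;
}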
As to TheDestroyer's suggestion of converting to double and back: I really don't think that changes anything mathematically. I mean: rand()'s distribution deficits don't vanish with a change in representation.
How would you then solve the problem of getting random numbers in a specific range?
What do you mean, no problem with range? Range is always a problem! If he used the typical way with % to re-scale the integers, he'd have a non-uniform distribution. For example, with RAND_MAX = 32767, rand() % 1000 produces each of 0-767 for 33 of the 32768 possible rand() values, but each of 768-999 for only 32 of them.
He doesn't want random numbers over the whole range of integers! Otherwise he'd have disapproved of my conversion. The whole point of the conversion is rescaling the random numbers!
Oops - you're right.
Anyway, the solution x = rand() / (double)RAND_MAX doesn't help if the source (rand()) does not yield uniformly distributed values in the first place.
Shadowayex should definitely look into the new C++11 features.
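For reference, a minimal sketch of the C++11 <random> approach, which avoids both rand()'s quality issues and the modulo bias (the 0-999 range is chosen to match the question):

#include <random>

int main() {
    std::random_device rd;                             // nondeterministic seed source
    std::mt19937 gen(rd());                            // Mersenne Twister engine
    std::uniform_int_distribution<int> dist(0, 999);   // unbiased integers in [0, 999]
    int n = dist(gen);
    return 0;
}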