Random numbers

I was reading about std::rand, and what I read says that there's a bias towards smaller numbers. I'm looking into doing large-scale random number generation, and I need a uniform distribution over large samples (10000+ in many cases). In most cases, I think the generated numbers will be reduced to values of 1000 or less via modulus.

I'm not well versed in random number generation on computers. Beyond the fact that the numbers you get aren't truly random, but are derived from the seed you pass to the generator, I don't know anything. What I'm looking for is more detail about the bias of std::rand, when it becomes a problem, and if/how to resolve the issues it causes.

Thanks.
In my opinion, the best approach is to convert the random number to a double in the range [0,1] and then scale it the way you want. The typical method, applying the modulus operator %, doesn't give you a uniform distribution. Try this to generate a uniform random number between 0 and 1:

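// requires <cstdlib> (rand, srand, RAND_MAX) and <ctime> (time)
double x;
srand( time(NULL) );  // seed once at program start, not before every call
x = static_cast<double>(rand())/static_cast<double>(RAND_MAX);  // x is in [0,1], both ends included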


Or alternatively, if you want a result you can fully rely on, use the random number generators from GSL (GNU Scientific Library; C-based, so forget about templates when you use it unless you modify the code yourself) or Numerical Recipes (C++-based, but the license is very strict).
C++11 has much better PRNGs.
Try googling "mt19937" if your compiler speaks C++11. It's supposed to be very good distributionwise.
@TheDestroyer: I like the simple solution you posted. I'll take some time to test it and see if it gives me satisfactory results. As for GSL, it looks nice, but licensing is a big issue. For the purposes of this project, I'm trying to avoid dealing with licenses as much as I can.

@Caligulaminus: It just so happens I am using a compiler that supports C++0x. I didn't even think about looking at what's new in the world of RNGs. I'll take a look at what it has to offer and see how it compares with the above solution.

Thanks to both of you
Don't go to the hassle and cost of licensing any RNG libs. It's all there, in top quality and free to take. Use the standard library or look at Boost first before doing anything else.

As to TheDestroyer's suggestion of converting to double and back: I really don't think that changes anything mathwise. I mean: rand()'s distribution deficiencies don't vanish with a change in representation.

Smells like C. C. Baxter - I know.
> As to TheDestroyer's suggestion of converting to double and back: I really don't think that changes anything mathwise. I mean: rand()'s distribution deficiencies don't vanish with a change in representation.


How would you then solve the problem of getting a specific range for the random numbers?
No problem with range, no dilemma.
Shadowayex worried (rightly) about the uniformity of rand()'s distribution.
What do you mean, no problem with range? Range is always a problem! If he used the typical way with % to rescale the integers, he'd have a non-uniform distribution.

He doesn't want random numbers over the whole range of integers! Otherwise he'd have disapproved of my conversions. The whole point of the conversion is rescaling the random numbers. omg!
If I had an RNG with perfect uniform distribution, would using modulo on its values give me a non-uniform distribution?

> He doesn't want random numbers over the whole range of integers!

Of course he doesn't. He made himself pretty clear:
> will be pushed down to values of 1000 or less via modulus.


oyg...?
> If I had an RNG with perfect uniform distribution,
> would using modulo on its values give me a non-uniform distribution?

Yes, if the modulo is naively applied.

For instance, if we had an RNG random() that generated perfect uniform random numbers (with equi-distribution of the bits) in the interval [0,255],

and we tried to get a random number in the interval [0,9] by random() % 10

A number in the range [0,5] would have a higher probability of being chosen than a number in the range [6,9]. There are 256 = 25 * 10 + 6 possible values, so each of 0 to 5 is produced by 26 of them, while each of 6 to 9 is produced by only 25.
> Yes, if the modulo is naively applied.


Oops, you're right.
Anyway, the solution x = rand() / (double)RAND_MAX doesn't help if the source (rand()) does not yield uniformly distributed values in the first place.
Shadowayex should definitely look into the new C++11 features.