weighted dice throw

May 9, 2010 at 3:51pm
I need to create an algorithm that returns a number between 2 and 12, as in a dice throw, but weighted so that I can control whether more or less probable numbers are returned. For example, we know that the probability of throwing a 7 is higher than the probability of throwing a 2 or a 12, which are the least probable throws. With this algorithm I want to specify, possibly with a float, whether more probable or less probable numbers should be returned more often. If I called the function with a parameter of 0, it would be much more likely to return 2 and 12 and no longer very likely to return a 7. A parameter of 0.5 would carry out a normal dice throw, and a parameter of 1 would make the 7 even more likely than usual. It needs to be weighted but not entirely biased, so even with a parameter value of 0, throwing a 7 is still possible.

I'm not sure how well I've explained what I want; I hope you guys can understand it.

Thanks x
May 9, 2010 at 5:31pm
Generate the random number and then randomly offset the results to either side?

rand(5, 9) + rand(0,1) * abs(offset-0.5)*8;

See:
http://cplusplus.com/reference/clibrary/cstdlib/rand/
For why my code won't compile.


Never mind this post.

-Albatross
Last edited on May 9, 2010 at 7:08pm
May 9, 2010 at 6:49pm
That wouldn't work, as it could give values outside the range 2-12. If it helps, I'm currently simulating my dice by using 2 random numbers between 1 and 6. Making just one number between 2 and 12 means there is an equal chance of throwing each number, but making 2 random numbers to simulate 2 separate dice means that the further a number strays from 7, the less probable it is, just like in real life. It is the weighting of the dice which has me stumped.
May 9, 2010 at 7:07pm
I fixed a bug in my code.

There are many ways you could weight dice. If you have two dice, each 1 - 6, you could create a function that does this:

Generates a random number 1000-6999.
Returns 1 through 6, depending on which range this number falls in (e.g. return 1 if i > 999 && i < 2000).
To weight the dice, manipulate the ranges.
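
A minimal sketch of the range idea, for illustration only. The names `face_from_draw`, `roll_weighted_die`, `fair_bounds` and `loaded_bounds` are hypothetical, and the loaded ranges are arbitrary numbers picked to make 1 and 6 more common. It assumes the draw covers 1000 to 6999 so that a fair die gets six equal ranges.

```cpp
#include <array>
#include <cstdlib>

// Map a draw in [1000, 6999] to a face 1..6 using cumulative upper bounds.
// upper[k] is the first value that no longer belongs to face k + 1.
// Hypothetical helper, not code from the thread.
int face_from_draw(int draw, const std::array<int, 6>& upper)
{
    for (int face = 1; face <= 6; ++face)
        if (draw < upper[face - 1])
            return face;
    return 6; // draw beyond the last bound: clamp to the top face
}

// Fair die: six equal ranges of width 1000.
const std::array<int, 6> fair_bounds   = {2000, 3000, 4000, 5000, 6000, 7000};
// Loaded die (illustrative): 1 and 6 each own 2000 values, 2..5 own 500 each.
const std::array<int, 6> loaded_bounds = {3000, 3500, 4000, 4500, 5000, 7000};

int roll_weighted_die(const std::array<int, 6>& upper)
{
    // 1000..6999; the slight bias of rand() % n is ignored in this sketch.
    int draw = 1000 + std::rand() % 6000;
    return face_from_draw(draw, upper);
}
```

Keeping the bounds in a table rather than in a chain of if statements means the weighting can be changed by swapping the table.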

I need more coffee...

-Albatross
Last edited on May 9, 2010 at 7:08pm
May 9, 2010 at 7:19pm
Weighting individual dice is pretty simple. Weighting the overall throw is more complex.

You could weight the dice so that 1 and 6 are more common (which makes total rolls of 2 and 12 more common), but that would also make 7 more common.
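
To see why, here is a small sketch that computes the odds of each total from the per-face odds of a single die. The function name `two_dice_totals` is made up for this example, and it assumes both dice are loaded identically and thrown independently.

```cpp
#include <array>

// Probability of each total 2..12 for two independent dice that share the
// same per-face probabilities. Index 0 of the result is a total of 2.
std::array<double, 11> two_dice_totals(const std::array<double, 6>& face)
{
    std::array<double, 11> total{}; // zero-initialised
    for (int a = 0; a < 6; ++a)
        for (int b = 0; b < 6; ++b)
            total[a + b] += face[a] * face[b]; // faces a+1 and b+1 sum to a+b+2
    return total;
}
```

With illustrative face odds of 0.4 for 1 and 6 and 0.05 for each of 2 to 5, the totals 2 and 12 each come out at 0.16, but 7 comes out at 0.33, because a 1 paired with a 6 is now the most likely combination of all.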

Whenever I need to do something like this, the first thing I do is come up with some numbers. I get the basic idea of what you're trying to do, but it's much easier to construct a formula if we have solid data to work with.

So let's do this. What are the odds you want for these dice throws?

offset = 0.0
__________
odds 2, 12 = ??
odds 3, 11 = ??
...


offset = 0.5
__________
odds 2, 12 = 2.8% (1/36)
odds 3, 11 = 5.6% (2/36)
odds 4, 10 = 8.3% (3/36)
...  (normal dice throw)


offset = 1.0
_________
odds 2, 12 = ??
odds 3, 11 = ??


Fill in the blanks with the behavior you're looking for. Then it'll be easier to come up with a solution.
May 9, 2010 at 8:14pm
Hi Disch, thanks for the reply.

I didn't have a specific range in mind, but with an offset of 0 I want the numbers which are normally the least probable to be the most probable, and with an offset of 1 the numbers which are normally more probable to be even more probable. For an offset of 0 it might look something like this:

2 - 15.19%
3 - 12.41%
4 - 9.63%
5 - 6.86%
6 - 4.08%
7 - 3.66%
8 - 4.08%
9 - 6.36%
10 - 9.63%
11 - 12.41%
12 - 15.19%

which I just botched together, but it does add up to 100%
It is essential that regardless of the offset all numbers are still possible outcomes.

May 10, 2010 at 3:14pm
Well this is actually kind of a tricky problem. I don't know if I have the time to really figure something out for you.

The only way I can really think to do it is something like Albatross mentioned:

Return 1 through 6, depending on what range this number is (i.e. return 1 if i > 999 && i < 2000)


The problem is changing the ranges based on a floating point that can change between calls, which can get hairy.

If you don't need the exact same probabilities as normal dice... maybe you can do something like this:

int odds[11] = {0};  // odds of rolling a 2-12

int x = X;  //  "X"
int acc = 0;

for(int i = 0; i < 6; ++i)  // figure out the odds for 2-7
{
  acc += x;
  x += Y * offset;  // "Y", I'm also not sure about this calculation

  odds[i] = acc;
}

for(...)  {...} // make the odds for 8-12 the same as 2-6

int r = rand() % odds[10];

for(int i = 0; i < 11; ++i)
{
  if(r < odds[i])
    return i+2;
}


Notes:

- This would work better if 'offset' has an effective range of -1.0 to 1.0 (instead of 0 to 1.0).
- 'X' would need to be calculated based on 'offset'. Basically 'X' would be the odds of a 2 being rolled. The lower 'offset' is, the higher 'X' would need to be.
- 'Y' would need to be something that adjusts the odds per roll. I haven't worked out the details of how that'd work, but hopefully you get the idea.
- I don't think multiplying 'Y' by offset would work (as I noted above). If offset is zero, this means x won't change and therefore all rolls will have equal probability, which isn't the behavior you're looking for. Although that does kind of make the most sense for a formula like this. Maybe you'd just have to use a higher value (like 0.3 or something -- don't know what exactly, I kind of pulled that number out of my ass) for a "normal" dice throw.


Anyway that's my idea. You'll have to play around and figure the rest out.
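
One possible way to fill in the blanks, as a rough sketch rather than a worked-out answer. It assumes 'offset' runs from -1.0 to 1.0 as suggested in the notes, and it picks 10 for both the base weight and 'Y' purely for illustration. The names `build_cumulative_odds`, `total_for` and `roll` are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstdlib>

// Cumulative odds for totals 2..12, following the accumulate-then-look-up
// shape of the pseudocode. offset is assumed to lie in [-1.0, 1.0].
// The constants 10 (base weight) and 10 ("Y") are illustrative guesses.
std::array<int, 11> build_cumulative_odds(double offset)
{
    const int base = 10; // smallest weight any total can have
    const int step = static_cast<int>(std::lround(10.0 * offset)); // "Y * offset"
    // "X": weight of a 2. Raised when step is negative so 7 never drops below base.
    const int x0 = (step < 0) ? base - 5 * step : base;

    std::array<int, 11> weight{};
    for (int i = 0; i < 6; ++i)   // totals 2..7
        weight[i] = x0 + i * step;
    for (int i = 6; i < 11; ++i)  // totals 8..12 mirror 6..2
        weight[i] = weight[10 - i];

    std::array<int, 11> odds{};
    int acc = 0;
    for (int i = 0; i < 11; ++i) {
        acc += weight[i];
        odds[i] = acc;
    }
    return odds;
}

// Return the total whose cumulative slot contains r, for 0 <= r < odds[10].
int total_for(int r, const std::array<int, 11>& odds)
{
    for (int i = 0; i < 11; ++i)
        if (r < odds[i])
            return i + 2;
    return 12;
}

int roll(double offset)
{
    const std::array<int, 11> odds = build_cumulative_odds(offset);
    return total_for(std::rand() % odds[10], odds);
}
```

With these particular constants, an offset of 1.0 gives weights 10, 20, ..., 60, ..., 10, which are the normal two-dice odds; 0.0 gives every total the same weight; and -1.0 gives 60 for 2 and 12 down to 10 for 7. Every total keeps a weight of at least 10, so nothing ever becomes impossible.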
May 10, 2010 at 3:35pm
Why don't you just generate a number from 0 - 10k and then compare it to the percentages you want? It's not a nice solution, but it works...
May 10, 2010 at 3:45pm
Suppose that you want to do this with numbers from 1 to 5. Also suppose that you want 3 to have the lowest frequency, 2 and 4 to appear three times as often as 3, and 1 and 5 to appear twice as often as 3. A simple way to do this is to create an array with 11 elements like this:
int array[]={1,1,2,2,2,3,4,4,4,5,5}; and then use array[rand()%11] to get the desired result.

If you want to use fractional frequencies, multiply each frequency by the smallest power of ten that eliminates the decimal digits for all frequencies. For example, if in the previous example you wanted 2 and 4 to appear 1.2 times as often as 3, and 1 and 5 to appear 1.6 times as often as 3, then you could create an array with 10+2*12+2*16 elements, containing 10 3s, 12 2s, 12 4s, 16 1s and 16 5s.
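
A small sketch of how such a lookup array could be built from (value, count) pairs instead of being typed out by hand. The names `build_lookup` and `pick` are hypothetical, and a std::vector is used here in place of a plain array.

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Build a lookup table in which each value appears `count` times.
// Each pair is (value, count); counts are assumed to be non-negative.
std::vector<int> build_lookup(const std::vector<std::pair<int, int>>& value_and_count)
{
    std::vector<int> table;
    for (const auto& vc : value_and_count)
        table.insert(table.end(), static_cast<std::size_t>(vc.second), vc.first);
    return table;
}

// Pick one entry; values that occupy more slots come up more often.
// The slight bias of rand() % n is ignored in this sketch.
int pick(const std::vector<int>& table)
{
    return table[std::rand() % table.size()];
}
```

For the 1.6 / 1.2 / 1.0 example, the pairs would be {1,16}, {2,12}, {3,10}, {4,12}, {5,16}, which gives a table of 66 entries.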

I have a feeling that this is similar to what Albatross suggested. This here is a bit faster but it's also more memory consuming.

I made a sample program to calculate this array for your example:

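#include <iostream>
using namespace std;

// Target percentages for totals 2..12, copied from the table posted earlier in the thread.
double absf[11]={15.19,12.41,9.63, 6.86, 4.08, 3.66, 4.08, 6.36, 9.63, 12.41, 15.19};

int main()
{
    int i;
    for (i=0; i<11; i++)
    {
        // express each frequency relative to the rarest total (7, at 3.66)
        absf[i]/=3.66;
        // scale by 10 and round to get the number of array slots for this total
        cout << i+2 << ": " << int(10*absf[i]+0.5) << " time(s)" << endl;
    }

    return 0;
}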

There probably are better ways... I'll tell you if I come up with something.
Last edited on May 10, 2010 at 3:50pm
May 10, 2010 at 5:38pm
The trick is getting the percentages based on a passed floating point value, which none of these approaches consider.

I agree this is a trivial problem if the desired percentages are fixed, but the OP clearly wanted the odds to be "bendable".
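
One way the percentages could be derived from a single float, sketched under assumptions. The idea is to raise the normal two-dice counts (1, 2, ..., 6, ..., 1) to a power that depends on the parameter. The mapping e = 4t - 1 is just one choice that happens to hit the three requested behaviours; it is not something anyone in the thread proposed. `bent_weights` and `bent_roll` are made-up names, and the sampling uses std::discrete_distribution from C++11 `<random>` rather than rand().

```cpp
#include <array>
#include <cmath>
#include <random>

// Weights for totals 2..12 controlled by t in [0, 1].
// ways[k] is the number of two-dice combinations giving total k + 2.
// The exponent mapping e = 4t - 1 is an assumption for illustration:
//   t = 0   -> e = -1 (2 and 12 most likely, 7 least likely but still possible)
//   t = 0.5 -> e =  1 (ordinary two-dice odds)
//   t = 1   -> e =  3 (7 even more dominant than usual)
std::array<double, 11> bent_weights(double t)
{
    static const int ways[11] = {1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1};
    const double e = 4.0 * t - 1.0;
    std::array<double, 11> w{};
    for (int k = 0; k < 11; ++k)
        w[k] = std::pow(static_cast<double>(ways[k]), e);
    return w;
}

// Draw one total in 2..12 using the weights for parameter t.
int bent_roll(double t, std::mt19937& gen)
{
    const std::array<double, 11> w = bent_weights(t);
    std::discrete_distribution<int> dist(w.begin(), w.end());
    return dist(gen) + 2;
}
```

Because every weight is a positive power of a positive count, no total ever gets a probability of zero, whatever the parameter. The weights do not need to sum to 1; std::discrete_distribution normalises them.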
May 11, 2010 at 8:23am
Ok, I've got something for you. But first, I'd like to point out a problem with rand()%n when n is close to RAND_MAX. Disch may remember a discussion we had about this some time ago. Here I'm using a custom rand() function with a low RAND_MAX to demonstrate the problem. After that, I propose a way to fix it, and then, using the same technique, I demonstrate a way to do what you want. The trick in both cases is to keep a count of how many times each number has appeared in an array, and to disallow a value that has already appeared too many times.

#include <iostream>
#include <fstream>
#include <cstdlib>
#include <cmath>
#include <ctime>   // clock(), CLOCKS_PER_SEC
using namespace std;

const int period=20;
const int rand_max=period-1;
int seq[period]={0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,19};
int seed=0;

int draw();
int draw_calls=0;

int main()
{
    ofstream fout("out.txt");
    ifstream fin("in.txt");

    const int n=11;
    int freq[n]={0};
    int i;

    int start;
    int stop;

    fout << "way number 1...\n" << endl;

    start=clock();
    for (i=0; i<100000*period; i++)
        freq[draw()%n]++;
    stop=clock();

    for (i=0; i<n; i++)
        fout << i << " -> " << freq[i] << " time(s)" << endl;

    fout << "\ntime taken: " << (1000*(stop-start))/CLOCKS_PER_SEC << " ms" << endl;
    fout << "draw calls: " << draw_calls << endl;

    for (i=0; i<n; i++) freq[i]=0;
    draw_calls=0;

    fout << "\nway number 2...\n" << endl;
    int total=0;
    int temp;
    int count=0;
    bool redraw;

    start=clock();
    for (i=0; i<100000*period; i++)
    {
        while (true)
        {
            temp=draw()%n;

            count=0;
            redraw=false;
            while (freq[temp]>1000+total/n)
            {
                temp+=++count;
                temp%=n;

                if (count==10+n/4) {redraw=true; break;}
            }

            if (redraw) continue;

            freq[temp]++;
            total++;
            break;
        }
    }
    stop=clock();

    for (i=0; i<n; i++)
        fout << i << " -> " << freq[i] << " time(s)" << endl;

    fout << "\ntime taken: " << (1000*(stop-start))/CLOCKS_PER_SEC << " ms" << endl;
    fout << "draw calls: " << draw_calls << endl;

    fout << "\ncustom frequencies...\n" << endl;

    double target_freq[n];
    for (i=0; i<n; i++) target_freq[i]=0.0;
    double total_freq=0;
    bool problem=false;
    for (i=0; i<n; i++)
    {
        fin >> target_freq[i];
        if (!fin) break;
        total_freq+=target_freq[i];
    }

    fout << "target frequencies:\n";
    for (i=0; i<n; i++) fout << target_freq[i] << endl;
    fout << endl;

    if ( fabs(total_freq-1)>0.000001 ) problem=true;
    if (problem) {fout << "error reading from input file... total_freq=="
     << total_freq<< "..." << endl; return 0;}

    for (i=0; i<n; i++) freq[i]=0;
    draw_calls=0;
    total=0;

    start=clock();
    for (i=0; i<100000*period; i++)
    {
        while (true)
        {
            temp=draw()%n;

            count=0;
            redraw=false;
            while (freq[temp]>1000+target_freq[temp]*total)
            {
                temp+=++count;
                temp%=n;

                if (count==10+n/4) {redraw=true; break;}
            }

            if (redraw) continue;

            freq[temp]++;
            total++;
            break;
        }
    }
    stop=clock();

    for (i=0; i<n; i++)
        fout << i << " -> " << freq[i] << " time(s)" << endl;

    fout << "\ntime taken: " << (1000*(stop-start))/CLOCKS_PER_SEC << " ms" << endl;
    fout << "draw calls: " << draw_calls << endl;

    return 0;
}

int draw()
{
    draw_calls++;

    if (seed>rand_max) seed=0;
    return seq[seed++];
}

in.txt
.2
.15
.15
.02
.08
.1
.05
.11
.06
.05
.03

out.txt
way number 1...

0 -> 200000 time(s)
1 -> 200000 time(s)
2 -> 200000 time(s)
3 -> 200000 time(s)
4 -> 200000 time(s)
5 -> 200000 time(s)
6 -> 200000 time(s)
7 -> 200000 time(s)
8 -> 200000 time(s)
9 -> 100000 time(s) //<- watch this here
10 -> 100000 time(s) //<- and this here

time taken: 32 ms
draw calls: 2000000

way number 2...

0 -> 182818 time(s)
1 -> 182819 time(s)
2 -> 182819 time(s)
3 -> 182818 time(s)
4 -> 182818 time(s)
5 -> 182818 time(s)
6 -> 182818 time(s)
7 -> 182818 time(s)
8 -> 182818 time(s)
9 -> 182817 time(s)  //ok, now all frequencies
10 -> 171819 time(s) //are about the same

time taken: 47 ms
draw calls: 2042954

custom frequencies...

target frequencies:
0.2
0.15
0.15
0.02
0.08
0.1
0.05
0.11
0.06
0.05
0.03

0 -> 401000 time(s)
1 -> 301000 time(s)
2 -> 290000 time(s)
3 -> 41000 time(s)
4 -> 161000 time(s)
5 -> 201000 time(s)
6 -> 101000 time(s)
7 -> 221000 time(s)
8 -> 121000 time(s)
9 -> 101000 time(s) //got my custom
10 -> 61000 time(s) //frequencies! ALL RIGHT! :D

time taken: 93 ms
draw calls: 2000000

To sum up, use either (1) the method Albatross/firedraco suggested, modified to fix the rand()%n problem or (2) what I demonstrate here. Do your tests and see which one is faster.
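
For option (1), a common way to remove the rand()%n bias is rejection sampling, which is a different technique from the frequency capping used in the program above. A minimal sketch, with hypothetical names `rejection_limit` and `unbiased_rand`:

```cpp
#include <cstdlib>

// Largest multiple of n that fits in the generator's range [0, max_value].
// Draws at or above this limit are discarded so every residue is equally likely.
long long rejection_limit(long long max_value, long long n)
{
    const long long range = max_value + 1; // number of distinct values
    return range - range % n;
}

// Uniform integer in [0, n) from std::rand(), assuming 0 < n <= RAND_MAX.
int unbiased_rand(int n)
{
    const long long limit = rejection_limit(RAND_MAX, n);
    long long r;
    do {
        r = std::rand();      // redraw until the value falls below the limit
    } while (r >= limit);
    return static_cast<int>(r % n);
}
```

For the toy generator in the demonstration (values 0 to 19, n = 11), the limit is 11, so draws of 11 to 19 are thrown away and each of 0 to 10 is left with exactly one way of occurring.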
Last edited on May 13, 2010 at 3:50pm
May 11, 2010 at 9:33am
First, I am sorry - I haven't read all the answers. Maybe my solution has already been posted before. Then all the credits to the original author! ;)

What you want to do is statistically blend multiple functions together. Just call one or the other of these functions, based on your alpha value.

Example:

int random_after_normal_range() { ... } // returns 2 to 12 with equal probability (roughly a 1D12)
int random_after_two_dices() { ... } // returns the classic 2D6
int random_with_strange_distribution() { ... } // returns the strange "2 and 12 come up more often"

int random_dice(double alpha)
{
  double a = (double)rand() / RAND_MAX; // random variable from 0 to 1

  if (alpha > 0.5)
  { // blend 1D12 with 2D6
    if (a > (alpha-0.5)*2) return random_after_normal_range();
    else                   return random_after_two_dices();
  }
  else
  { // blend 1D12 with "strange distribution"
    if (a > alpha*2) return random_with_strange_distribution();
    else             return random_after_normal_range();
  }
}


This way, you can blend any statistical function together. The distribution will be equivalent to the "merged" distribution of both functions - weighted by alpha.


Think of the code like this: as alpha approaches 1, the random "a" is less and less often larger than (alpha-0.5)*2. So more and more often (but not always) you will use the "fallback" random_after_two_dices. If alpha is exactly 1, the random "a" can never be larger. This means you always draw from random_after_two_dices and you get your "2D6" behaviour for alpha=1.

The same goes for the other cases: as alpha approaches 0.5 (from either side), the IF clauses will use random_after_normal_range more and more often. At exactly 0.5, you will always be in the outer ELSE branch, and there the inner IF will always be false too, as no random "a" (from 0 to 1) can be greater than alpha*2 (which is 1).

Finally, as alpha approaches 0, you always end up in the outer ELSE branch, and the inner IF will be "true" more and more often, as it gets easier for the random "a" to exceed the smaller threshold. At alpha = 0, the random "a" is practically always larger, so you practically always draw from random_with_strange_distribution.
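
A runnable sketch of the blend, under assumptions: the three elided function bodies are filled with stand-ins (the "strange" distribution here simply uses weights 6, 5, ..., 1, ..., 6), the names are different from the ones above, and C++11 `<random>` is used in place of rand(). The helper `choose_source` is hypothetical and only isolates the branch logic so it can be checked on its own.

```cpp
#include <random>

// Stand-ins for the three elided functions; the bodies are assumptions.
int uniform_2_to_12(std::mt19937& gen)
{
    return std::uniform_int_distribution<int>(2, 12)(gen);
}

int two_d6(std::mt19937& gen)
{
    std::uniform_int_distribution<int> die(1, 6);
    return die(gen) + die(gen);
}

int extremes_favoured(std::mt19937& gen)
{
    // Weights inverse to the two-dice counts: 2 and 12 most likely, 7 least.
    std::discrete_distribution<int> d(
        {6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0});
    return d(gen) + 2;
}

// Which source the blend picks for a given alpha and a uniform draw a in [0, 1).
// 0 = extremes_favoured, 1 = uniform_2_to_12, 2 = two_d6.
int choose_source(double alpha, double a)
{
    if (alpha > 0.5)
        return (a > (alpha - 0.5) * 2) ? 1 : 2;
    return (a > alpha * 2) ? 0 : 1;
}

int blended_roll(double alpha, std::mt19937& gen)
{
    const double a = std::uniform_real_distribution<double>(0.0, 1.0)(gen);
    switch (choose_source(alpha, a)) {
        case 0:  return extremes_favoured(gen);
        case 1:  return uniform_2_to_12(gen);
        default: return two_d6(gen);
    }
}
```

Note that with this scheme alpha = 0.5 gives the flat 2 to 12 distribution and alpha = 1 gives the ordinary 2D6, which is shifted from the original request (0.5 for a normal throw, 1 for an even stronger 7). Getting that mapping would need a fourth, more peaked distribution to blend towards.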

Ciao, Imi.
Last edited on May 11, 2010 at 10:07am