Generating and Sorting numbers.

Hello. How can I generate 1 million random numbers and sort them in c++.Can I use a vector to sort the numbers?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <iostream>
#include <vector>
#include <ctime>
#include <cstdlib>
#include <algorithm>
using namespace std;


int main()
{
   const int MAX = 1000000;
   vector<int> a( MAX );

   for ( size_t i = 0; i < MAX; i++ ) a[i] = rand() % 1000;
   sort( a.begin(), a.end() );

   string dummy;
   cout << "Press enter if you dare!";   getline( cin, dummy );
   for ( size_t i = 0; i < MAX; i++ ) cout << a[i] << '\n';
}

but, if you are not playing around and writing a sort yourself or similar classroom work, why not generate the values in sorted order and skip the sort? Even something simple like

a[i] = rand()%initialsomething;
for(..etc..)
a[i] = a[i-1]+rand()%something;

or something similar?
This is obviously overly simple and crude; it depends on what you need your random stream to be.
Last edited on
a[i] = a[i-1]+rand()%something;

That will not work. It cannot produce the same probability distribution as randomly generating and then sorting.

Consider a numerical example:
Imagine that your original numbers have to be between 0 and 999. This is easy to ensure if you generate them randomly and then sort. However, there is no conceivable probability distribution that could be applied ADDITIVELY and GUARANTEE that none of the final numbers exceeded 999, yet still allows a finite probability of a sequence like {0, 1, 998, 999}. To get this additively you would have to allow each addition to possibly reach +997, ... but if this took place every time then it would lead to numbers far in excess of 999.
If the range of the random integers is not prohibitively high:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include <vector>
#include <random>
#include <ctime>

// generate a sorted sequence of n random integers
// in the range [min,max] invariant: min<max
std::vector<int> generate_rn( std::size_t n, int min, int max )
{
    // http://en.cppreference.com/w/cpp/numeric/random
    static std::mt19937 rng( std::time(nullptr) ) ;

    const std::size_t diff = max - min ;
    std::uniform_int_distribution<int> distr( 0, diff ) ;

    std::vector<int> counts(diff+1) ;
    while( n-- ) ++counts[ distr(rng) ] ;

    std::vector<int> result ;
    for( std::size_t i = 0 ; i <= diff ; ++i  )
        result.insert( result.end(), counts[i], i+min ) ;
    return result ;
}

http://coliru.stacked-crooked.com/a/20f7fbde6f7ea5a3
Last edited on
Now that really is an ingenious solution!
A million is not a particularly large number. You could easily generate it directly in O(n+nlogn) → O(nlogn) time, with only O(n) space.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <algorithm>
#include <ctime>
#include <random>
#include <vector>

// generate a sorted sequence of n random integers
// in the range [min,max] invariant: min<max
std::vector<int> generate_rn( std::size_t n, int min, int max )
{
    std::vector<int> result( n ) ;

    // http://en.cppreference.com/w/cpp/numeric/random
    static std::mt19937 rng( std::time(nullptr) ) ;
    std::uniform_int_distribution<int> distr( min, max ) ;

    std::generate_n( result.begin(), result.size(), [&](){ return distr(rng); } ) ;
    std::sort( result.begin(), result.end() );

    return result ;
}

JLBorges’s brilliant code uses a Counting Sort.
http://www.cplusplus.com/faq/sequences/sequencing/sort-algorithms/counting-sort/
much better.
Yes, I knew my quick crack at it was flawed, I was having trouble coming up with a proper implementation but it got the idea across (that the sort can be avoided).


... (that the sort can be avoided)

Uh, that wasn't the point at all -- because it can't be avoided -- not without incurring greater complexity and/or biasing the data.

Also, using remainder on the random value also biases the data (read up on the pigeonhole principle). Use a std::uniform_int_distribution.
Yes, I understand that fully along with the many issues with rand() & %.

There is no requirement from the OP, so any stream of sorted values will do, regardless of distribution or range of values or bias concerns. So assuming that any list of numbers that is sorted will do(maybe he just wants to test a binary search routine), you can generate directly. I said that above... it depends on what is needed.

Which brings us to JL's code.
You could fill in the bucket vector of the bucket sort with a uniform distribution of 1000 and create the target from that, if approximately 1M numbers instead of exactly 1M will do. There again, it depends on the needs. This may even be preferable, as it exercises the code on even and odd sized input vectors, and so on.

In all cases it depends on what the need is whether the shortcut will work. If the actual requirement were exactly 1M, 1000 range, uniform, sorted, and having variation across program runs, there is no shortcut worth taking, agreed. Remove one or more of those requirements (excepting sort) and it becomes possible.

Sometimes, you just want some numbers, and that is all you need.






Last edited on
closed account (48T7M4Gy)
Sometimes, you just want some numbers, and that is all you need.


https://qrng.anu.edu.au/RainHex.php
Topic archived. No new replies allowed.