unordered alternative to next_permutatio

Forum

Forum
General C++ Programming
unordered alternative to next_permutatio

unordered alternative to next_permutation()?

Hello! Forgive me if anything I say is dumb, I'm a fairly novice programmer.

Currently I have a vector of objects and I am permuting the vector easily with next_permutation. However, in an effort to reduce running time for the algorithm that I am using, I restrict the maximum number of permutations to be some number. This is also easily done, however since next_permutation works in lexicographical order, I am only getting the "first" permutations.

For various reasons, I want to sample random permutations, without duplication. It would be wonderful if there was an analog to next_permutation that simply didn't behave in lexicographical order. I'm not sure this exists.

Some of the behavior of next_permutation that I enjoy is that it can act on the vector without returning the permuted vector. I'm not sure what allows me to do this, but it's great. I also especially like that it returns "false" when all permutations have been exhausted.

I had thought to construct a function that is handed the vector, reorders it, and returns the permuted vector, however, it would be difficult to keep track of which permutations I had already done.

Can anyone guide me to a way to get unordered permutations in a way analogous to next_permutation?

cppmatt (37)

I don't think what you are asking is possible. The function which generates lexicographical permutations does so by simple arithmetic operations on the last permutation. Thus only one permutation is in storage at once.

There are three options that I can see.

1. Generate all of the permutations, then randomly select the number you need from the full list. This can easily run into storage problems if the number of permutations is large enough.

2. Generate random permutations one at a time, storing each one as it is used. Then every time you generate a new random permutation, check to see if it has already been generated by looking at each permutation in storage. If the current permutation is in the list, generate a new permutation until one is found that is not in the list. This could also run into memory problems, unless the number of permutations you need is small enough.

3. Shuffle the indices of the number of permutations, then call next_permutation until you get to each index number. This would seem to be slow, but also not require a lot of storage.

Last edited on

Bazzy (6281)

To reduce memory consumption you can keep your vector element in a fixed order and shuffle a sequence of indices ( so you need to store sequences of indices rather than the whole vector )

ne555 (10692)

Or just generate a number that represents the index of the permutation (less memory, more processing)
123456 -> 0 654321 -> 719

tition (887)

what is an index of a permutation?

cppmatt (37)

tition,

I believe ne555 is thinking along the lines of the third option I presented above. There are 720 permutations of the numbers 1 through 6. So we could generate a random number N between 0 and 719 to use in order to draw a random permutation, but we still would either have to cycle through next_permutation N times to get that particular permutation, or store all 720 permutations and look up the Nth one. If we are interested instead in permutations of the numbers 1:90, say, the second option is not feasible, and the first option may take too long (years) to complete. That is why I would go with my 2nd option in a case that large.

Last edited on

Capt Crunch (2)

So I need a one-to-one and onto map (f) from the set of permutations (S_n) to the integers mod N!. Once I have that I can do the following:

1 - Construct a random permutation x. Record f(x).
2 - Construct a new random permutation x1. Check to see if f(x1)==f(x). If so, discard. If not, record f(x1).

Lather, rinse, repeat, discarding any repeated permutations. This would be a bad way of doing things if the number of permutations is small, but a good way for a large number of permutations. I could simply have two cases.

It's a bit annoying, any thoughts? What is an easy choice for the map f?

ne555 (10692)

Sorry, cppmatt I missed your post.
That was the idea, but without doing next_permutation in order to reach the number N
More like this:

void permutation(int index, std::vector<int> &array){
	size_t size=array.size();
	std::vector<int> used(size);
	for(int K=0; K<size; K++) used[K]=K;
	
	for(int K=0; K<size; K++){
		int value = index/factorial(size-K-1); //magic (how much till I change my value)
		array[K] = used[value];
		used.erase( used.begin()+value ); //to avoid repeat values
		
		index %= factorial(size-K-1); //normalize
	}
}

ne555 (10692)

I don't understand why do you need a map. Just use a set, when you insert an element it will tell you if it was present before.
Edit: if you want to use less memory, then the lexicographical distance to the sorted array will be a good mapping (the index of the permutation)

Last edited on

simeonz (490)

I was thinking along the lines of the solution with indices and came up with this. I use boost lambda for the first time. I am sick of writing helper functor classes for each use of algorithm.

#include <iostream>
#include <algorithm>
#include <ctime>
#include <boost/lambda/bind.hpp>
#include <boost/lambda/lambda.hpp>

using namespace std;
using namespace boost;
using namespace boost::lambda;

#define ARRAY_SIZE(array) (sizeof(array) / sizeof((array)[0]))

template <typename T, size_t N>
inline T* array_begin(T(& array)[N]) { return array; }

template <typename T, size_t N>
inline T* array_end(T(& array)[N]) { return &array[N]; }

template <typename T>
inline void fill_sequence(T* begin, T* end, const T& start = T())
{
    *begin = start;
    transform(begin, end - 1, begin + 1, _1 + 1);
}

int main()
{
    int data[10];
    int indexes[ARRAY_SIZE(data)];

    srand(time(NULL));

    fill_sequence(array_begin(data), array_end(data));
    random_shuffle(array_begin(data), array_end(data), bind(rand) % _1);
    fill_sequence(array_begin(indexes), array_end(indexes));
    random_shuffle(array_begin(indexes), array_end(indexes), bind(rand) % _1);

    cout << "data:\n";
    for_each(array_begin(data), array_end(data), cout << _1 << ' ');
    cout << endl;

    cout << "indexes:\n";
    for_each(array_begin(indexes), array_end(indexes), cout << _1 << ' ');
    cout << endl;

    cout << "data[indexes]:\n";
    for_each(array_begin(indexes), array_end(indexes), cout << var(data)[_1] << ' ');
    cout << endl;

    cout
            << "\n"
               "Permutations:" << endl;

    for (int i = 0; i < 10; ++i)
    {
        next_permutation(array_begin(indexes), array_end(indexes));

        for_each(array_begin(indexes), array_end(indexes), cout << var(data)[_1] << ' ');
        cout << endl;
    }
}

However, the problem with this is that every next permutation still differs from the previous one with just one transposition. So that is IMO not true randomness. Then again, if you are choosing objects without repetition, the choices are not independent to begin with.

I am thinking also this - the random_shuffle algorithm just performs randomly chosen transpositions on successively bigger portions of the shuffled array. next_permutation sort of does something similar. That is, next_permutation can be viewed as random_shuffle that feeds the transposition indices in deterministic way. The question is - is it possible to increase the number of transpositions that next_permutation causes and still iterate through all of them.

Regarding choosing the permutation index randomly - factorial is a fast growing function. For larger permutations you will need to deal with huge quantities. I mean you can not use built-in types for this in general, I think. (Correct me if I am wrong.)

Regards

EDIT: changes to the code - found errors

Last edited on

ne555 (10692)

1
2
3

#define ARRAY_SIZE(array) (sizeof(array) / sizeof((array)[0]))
template <typename T, size_t N>
inline size_t array_size(T(& array)[N]){ return N; }

I don't understand what you did it.
You start with a random disposition of the indexes and values, however then you just do next_permutation (?). why don't just keep with random_shuffle?

cppmatt (37)

Here is along the lines of how I was thinking. This wouldn't guarantee uniqueness, so a record would have to be kept if that was of primary concern. On the other hand, if the number of objects is large enough, the odds are small of a repeat if the fraction of permutations needed is small enough.

#include <iostream>
#include <vector>
#include <time.h>
using namespace std;
void permutation(vector<int> &array)
{
	size_t size = array.size();
	int tmp, idx1, idx2;

	for(int K=0; K<2*size; K++)
	{
		idx1 = rand() % size;
		idx2 = idx1;
		while (idx2==idx1)
			idx2 = rand() % size;
		tmp = array[idx1];
		array[idx1] = array[idx2];
		array[idx2] = tmp;
	}
}

int main()
{
	vector<int>  myvect;
	for (int ii = 0;ii<15;ii++)
		myvect.push_back(ii+1);
	srand (time(NULL));
	cout << "The original vector: " << endl;
	for (int ii = 0;ii<myvect.size();ii++)
		cout << myvect[ii] << " ";
	cout << endl;

	for (int jj = 0;jj<=20;jj++)
	{
		permutation(myvect);
		cout << "A permutation: " << endl;
		for (int ii = 0;ii<myvect.size();ii++)
			cout << myvect[ii] << " ";
		cout << endl;
	}
}

Last edited on

simeonz (490)

Regarding the macro thing, because I like the template version also...

I don't use array_size, because it turns out that standard C++ does not allow arrays with variable sizes and you need constant expressions. That is, you can not declare array and compute its size even with inline functions. It compiles in gcc, but it is extra. Although I think C++0x will allow it, it would still make it impossible for the compiler to do syntax checks on the size of aggregate initializers and such. So I wait to see what generalized constant expressions in C++0x will offer. I agree that array_size should be used where constant expressions are not required.

I can not random_shuffle, because I do not want repetitions. I try for a strategy without remembering the permutations generated so far.

What I did is this - I have array of indices pointing inside data. I random shuffle data and the indices. Then I start permuting the shuffled indices. The factual state as the client should see it is the array indirectly accessed, first through the indices and then the resulting index in data.

Last edited on

ne555 (10692)

I don't use array_size, because it turns out that standard C++ does not allow arrays with variable sizes and you need constant expressions.

? array_size uses the same logic as array_begin or array_end, there is no variable size

1
2

int array[42];
cout << array_size(array); //this generates the function size_t array_size( int (&array) [42] );

simeonz (490)

I know that array_size produces the same result invariantly. It is just that using function call inside the declaration of array instead of the ARRAY_SIZE macro, makes the size of the array technically speaking (purely as terminology) "variable".

1
2

int fixed_size_array[ARRAY_SIZE(some_other array)];
int variable_size_array[array_size(some_other_array)];

Just to be clear. I know that the result is the same, but the compiler does not know it. The compiler treats the array_size function like any other function. It does not know that this function can be pre-computed. C++0x will have generalized constant expressions which will be pre-computed. In other words, the compiler thinks that the size of the array will be unknown until run-time and can-not pre-allocate space on the stack.

simeonz (490)

Found some errors in the lambda expressions in my code... Not used to them yet. That explains why the code looked a bit puzzling before.

simeonz (490)

Here is another attempt. Green teases my eyes, so I'll explain the idea.

Permutations of arbitrary items correspond to permutation of numbers 1 through n. One can encode a permutation with a vector. One way to do this is to simply store the permuted numbers in the vector and then use them as indices when applying the permutation. Another approach is to store one transposition (essentially a swap operation) in every element. If the element with index i holds value j (j <= i), then the transposition is (i, j). Elements at index i and index j must be swapped. All transpositions are applied one after another by interchanging elements in the permuted array.

For the vector of transpositions: [0, 0, 2, 1],
the transpositions are (0, 0), (1, 0), (2, 2), (3, 1). The corresponding permutation of [0, 1, 2, 3] is [1, 3, 2, 0]. (Swap 0 with 0, 1 with 0,..) The first transposition can be omitted, because of the constraint (i, j), j <= i. This implies that it is always (0, 0). I skip it entirely and my indexes in the code look funnier.

What I do at each step (for generating the next permutation using transpositions vector) is to increment the value stored at every index i with 1. The resulting value j is taken modulo (i+1) (i.e. j = j % (i+1)). This guarantees that j <= i and the prefix part of the vector is independent from the suffix part. That is, the leading transpositions can be applied on the beginning portion of the target array independently of the trailing ones.

The problem is that every so often a transposition will wrap-around, because (j + 1) % (i + 1) == 0. The indices i are generally not co-prime and this can happen simultaneously for the entire vector, even before every permutation has been generated. For large number of elements there is long sequence before this occurs. But anyways, this is why I maintain another vector with offsets. I increase the offsets vector every time a total wrap-around occurs (the transpositions simultaneously reset) as if it is one big numeral (i.e. perform increment with carry). This starts another unique sequence of permutations with those offsets applied. Only prime indices should have offsets (my observation).

All permutations will be generated, and because every step alters (almost) all transpositions, the overall appearance is that the data is shuffled. Of course, this sequence is deterministic and can be hacked. One can always start from random permutation in the target array, but this will only hide the fact from the unsuspecting eye.

Here is the code.

#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
#include <ctime>
#include <boost/function.hpp>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>

using namespace std;
using namespace boost;
using namespace boost::lambda;

class PermutationIndex {

public:

    PermutationIndex(size_t size) :
        m_transpositions(size - 1), m_offsets(size - 1), m_primes(size - 1, true)
    {
        for (int i = 1; i < (int)m_primes.size(); ++i)
        {
            for (int j = 0; j < i; ++j)
            {
                if (m_primes[i] &&
                        ((i + 2) % (j + 2)) == 0)
                {
                    m_primes[i] = false;
                    break;
                }
            }
        }
    }

    void nextShuffle()
    {
        int non_reset = m_transpositions.size();

        for (int i = 0; i < (int)m_transpositions.size(); ++i)
        {
            m_transpositions[i] = (m_transpositions[i] + 1) % (i + 2);

            if (m_transpositions[i] == m_offsets[i])
            {
                --non_reset;
            }
        }

        if (non_reset == 0)
        {
            for (int i = 0; i < (int)m_transpositions.size(); ++i)
            {
                if (m_primes[i]) { continue; }
                else
                {
                    int result = m_offsets[i] + 1;

                    if (result < i + 2)
                    {
                        m_transpositions[i] = m_offsets[i] = result;
                        break;
                    }
                    else
                    {
                        m_transpositions[i] = m_offsets[i] = 0;
                    }
                }
            }
        }
    }

    template <class RandomAccessIterator>
    void apply(RandomAccessIterator first, RandomAccessIterator last)
    {
        int size = distance(first, last);

        for (int i = 1; i < size; ++i)
        {
            int j = m_transpositions[i - 1];

            if (i != j) swap(first[i], first[j]);
        }
    }

    void print()
    {
        function <void (int)> printer = cout << _1 << ' ';

        cout << "m_transpositions:\n";
        for_each(m_transpositions.begin(), m_transpositions.end(), printer);
        cout << '\n' << endl;

        cout << "m_offsets:\n";
        for_each(m_offsets.begin(), m_offsets.end(), printer);
        cout << '\n' << endl;

        cout << "m_primes:\n";
        for_each(m_primes.begin(), m_primes.end(), printer);
        cout << '\n' << endl;
    }

private:

    vector<unsigned int> m_transpositions;
    vector<unsigned int> m_offsets;
    vector<bool> m_primes;
};

#define ARRAY_SIZE(array) (sizeof(array) / sizeof((array)[0]))

template <typename T, size_t N>
inline size_t array_size(T(& array)[N]) { return N; }

template <typename T, size_t N>
inline T* array_begin(T(& array)[N]) { return array; }

template <typename T, size_t N>
inline T* array_end(T(& array)[N]) { return &array[N]; }

template <typename T>
inline void fill_sequence(T* begin, T* end, const T& start = T())
{
    *begin = start;
    transform(begin, end - 1, begin + 1, _1 + 1);
}

int main()
{
    int data[3];
    int permutation[ARRAY_SIZE(data)];

    srand(time(NULL));

    fill_sequence(array_begin(data), array_end(data));
    random_shuffle(array_begin(data), array_end(data), bind(rand) % _1);

    PermutationIndex p_index(array_size(data));

    cout << "data:\n";
    for_each(array_begin(data), array_end(data), cout << _1 << ' ');
    cout << endl;

    cout
            << "\n"
               "Permutations:" << endl;

    for (int i = 0; i < 7; ++i)
    {
        p_index.nextShuffle();

        // *** FOR DEBUGGING ***
        // p_index.print();
        // cout << endl;

        copy(array_begin(data), array_end(data), array_begin(permutation));
        p_index.apply(array_begin(permutation), array_end(permutation));

        for_each(array_begin(permutation), array_end(permutation), cout << _1 << ' ');
        cout << '\n' << endl;
    }
}

Best Regards

Last edited on

Topic archived. No new replies allowed.