3 partitions of equal sum

Pages: 12
I want to write a function that shows if it is possible for given vector of positive integers to be divided into 3 subsets with equal sums. As I understand I need to make sure that the sum of elements in vector are divisable by 3 and then try all sorts of combination of the elements sum to make sure one partition is equal to sum/3 and then take out used elements out of the vector and move on with the next partitions.

But I can't figure out how should I express it in code. Can you guys help me?
How would you do it on paper? Can you work thru an example? We can then encode that algorithm in C++.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
int partition3(vector<int> &A) {

  int sum = std::accumulate(A.begin(),A.end(),0);

  if(sum%3!= 0 || A.size()<3)
          return 0;

  int goal = sum / 3;

  int subsets[3] = {0};

  for(int e=0; e<A.size(); e++){



  }

}


That's my start
Have you worked thru an example on paper?
Last edited on
Do you know how to solve the problem of partitioning the vector into two?
https://en.wikipedia.org/wiki/Partition_problem

What are the constraints? I.e., how large can the vector get and what is the range of the elements?

Your question is number 6.25 here: https://people.eecs.berkeley.edu/~vazirani/algorithms/chap6.pdf
Last edited on
The constraints are: vector size 1-20, vector element size 1-30.

Yes I know how to solve problem partitioning vector into two
Last edited on
If you really know and understand how to partition it into 2, then it should be pretty straightforward to generalize that to partition it into 3. What's the problem?
Last edited on
With those constraints, back-tracking works.

Doesn't demand much memory, but probably isn't very efficient for large vectors.


(a) I dunno whether this is a good idea or not (although I've no doubt somebody will tell me in no uncertain terms if not!);

(b) I dunno whether my notes as follows make any sense to anyone but me!

I coded up the following and threw a few random samples of size 20 / maximum element 30 at it without problems, but there may be some specific nasty cases that I haven't tried.

Generate or input a trial vector T

Rule out trivial cases where sum(T) is not divisible by N ( N=3 in this instance )


Otherwise ... here goes ...


Find target partition size ( sum(T) / N )
Keep track of partitions (vector of vectors) and the sum of elements in them (vector)
Keep track of which partition you have put each element of T in (use a vector of same size as T).

Start with first element of T as current.
Start with first partition as current.

Loop ...
   Try to place current element of T in current partition OR LATER ( placeable if resulting partitition sum is <= target )

   If placed, update this partition and its sum; move to next element of T as current and set current partition back to first.

   If NOT placeable, then BACK-TRACK to previous element of T, remove it from its partition (and that sum)
      and try to place it in a LATER partition than it was in.

   If you are forced back so that you can't place the first element of T then return FALSE.
   If you've just placed the last element of T successfully then return TRUE (and all your partitions)


-------------

It helps to eliminate some impossible cases quickly if you sort T in reverse order first.


Original vector: 1 2 3 4 4 5 8 
Partitions:
1 8 
4 5 
2 3 4 
Last edited on
@lastchance, I agree that the problem instance is small enough for brut(ish) force. For curiosity's sake you should try your alg on larger instances to see how bad it gets.

Try vector size 100, values from 1 to 5.

Here's a problem generator that ensures the sum of the vector is divisible by 3 (although that doesn't necessarily mean there is a 3-partition).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include <iostream>
#include <random>

const int N     = 100;
const int High  = 5;

int main() {
    std::default_random_engine rng(std::random_device{}());
    std::uniform_int_distribution<> dis(1, High);

    std::vector<int> v;
    int sum = 0;
    for (int i = 0; i < N; i++) {
        int n = dis(rng);
        v.push_back(n);
        sum += n;
    }

    if (sum % 3 != 0) {
        std::uniform_int_distribution<> dis(0, N-1);
        int i;
        if (sum % 3 == 1) { // sum is one too high
            do i = dis(rng); while (v[i] <= 1);
            --v[i];
            --sum;
        }
        else { // sum is one too low
            do i = dis(rng); while (v[i] >= High);
            ++v[i];
            ++sum;
        }
    }

    for (int i = 0; i < N; i++) std::cout << v[i] << ' ';
    std::cout << '\n';
    std::cout << "sum: " << sum << '\n';
}

Hi @tpb,

I'll try the large N when I get back to a PC (currently on a train). I suspect that random cases aren't actually too bad, especially when there are solutions. The worst ones seem to be when all elements are equal, except for the last one or two. Worst I've found so far is 19 the same, followed by an element three less. This took about 0.9 seconds with full optimisation (-O3) or 5 seconds with default. The partitioning is then impossible, but it has to try just about everything to check that.


I wouldn't be surprised if the method deteriorates badly for large N, but sadly I don't know any better algorithms here (despite reading your link).
Last edited on
@tpb,

Here's the output from your generated vector. It's actually easy to partition because there are so many 1's: too short a time to measure!

I'm afraid random_device{} doesn't work as you hope on all implementations - it's not randomly seeding.

Original vector: 1 4 3 3 4 2 2 2 2 1 4 4 2 4 1 2 1 3 2 4 4 5 5 3 5 2 4 1 1 2 4 1 1 3 5 3 5 5 2 5 1 5 5 4 2 4 1 3 2 2 2 3 2 5 2 1 1 3 5 2 3 5 2 2 2 1 1 2 5 2 2 1 1 2 3 5 1 2 5 4 4 1 3 2 3 1 3 3 5 1 3 2 5 1 5 4 2 1 2 2 
Partitions:
2 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 
3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 
Time taken = 0 ms


This one has only 20 elements, but is a lot worse:
Original vector: 9 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 
Impossible
Time taken = 546 ms
@lastchance, I know random_device isn't nondeterministic on all implementations. I'm surprised it doesn't work that way on your's, though. What is your OS? Processor? Compiler?

If it doesn't work then the high_resolution_clock is a better choice than std::time():
1
2
auto seed = std::chrono::high_resolution_clock::
                now().time_since_epoch().count();


How long does your code take for 39 12's and a 9?
tpb wrote:
How long does your code take for 39 12's and a 9?

Longer than I'm prepared to wait for!!! I gave up.

20 elements is OK; much more clearly isn't, although random selections with plenty of low denominations aren't a problem. I'd love to see a good algorithm for this, though. I can follow the 2-partition version, but it's not obvious to me that it wouldn't burgeon out of hand with three partitions.


PS.
As for the implementation: Windows 7 64 bit; compiler: gcc version 6.3.0 (MinGW.org GCC-6.3.0-1). I don't use random_device in my own code as I'm aware that it isn't available on many implementations. I usually seed with the clock.


Last edited on
Maybe I'm oversimplifying it, but my current solution solves N=1000 and High=20 in under half a second. It also solves (i.e., says "no solution") 999 12's and 1 9 in under half a second.

And there's clearly something wrong with MinGW (or some underlying windows library) if it's not able to avail itself of the Intel processors built-in RDRAND instruction. That's just broken.
Last edited on
Ah, you are right about the partitioning. I'm overcomplicating it by actually trying to find the partitions. The question only asks if one exists.
Well, my solution will actually print the partitions if they exist. It's not really possible to generally solve the problem without determining what the partitions are.
I have to throw in the towel; my method's not up to it!

Looks like dynamic programming wins hands-down.

Does your solution work OK with bigger numbers? Say:
- twenty lots of 1002 and one of 999 (no solution)
- eighteen lots of 1002 and three of 999 (balanced partitions)

(Obviously you could subtract the minimum value from all elements first to circumvent the large-sum problem, but how does it do without that?)
Last edited on
20:1002, 1:999
No solution
0.035s

18:1002, 3:999
Solvable
999 1002 1002 1002 1002 1002 1002
sum: 7011
999 1002 1002 1002 1002 1002 1002
sum: 7011
999 1002 1002 1002 1002 1002 1002
sum: 7011
0.050s

2000:1002, 100:999
Solvable
4 minutes 14 seconds

I still feel my current solution is possibly wrong and that there would be inputs that it would not solve correctly. All it does is the 2-partition algorithm on the input vector (using a third of the sum), removes those elements, and then 2-partitions the rest. I was trying to come up with something more sophisticated using a 3D array but couldn't wrap my head around it.
Sounds interesting - I'll post my code later.

But what happens if you have a vector like
1,1,1,2,2,2

You might strip off a partition with sum 3, say 1,1,1. But then that leaves 2,2,2 which can't be split into equal partitions.

However, the original vector could clearly be partitioned 1,2 - 1,2 - 1,2

Not sure whether this is what your code would do. Your times for the difficult cases are incredibly impressive.
I thought for sure you came up with a case that would stump it, but somehow (I don't even know how) it solved it. My brain is not awake enough yet for me to consider why that is (just finishing first coffee; still haven't had breakfast). But I'll definitely think about that case later.

As for the times, the way it does it is to just fill in a table so the times are always going to be good for small values. But it really gets bogged down for large values since the array gets really large. Maybe using a bitset would help, fitting more of it in the cache. I haven't tried any (potential) optimizations yet. Stupidly, I never even used -O3 (or anything) when I compiled it! I'm going to try that 2000:1002, 100:999 case again with -O3 to see what difference it might make.

I'll post my code later, too (not sure when "later" should be). It's kind of a mess right now, though. I'll fix it up a little.

EDIT: Yikes! I just ran that big case with -O3 and instead of 4 minutes it took 8.5 seconds! Don't forget to compile with optimizations, kids!
Last edited on
Pages: 12