Grouping subset into larger size sets

I'd like to write a program that given a list of sets will output sets of larger size. Building them choosing only from the input sets.

i.e.

If I have the following couples:
01 22
04 58
11 34
12 23
15 87
18 26
18 36

23 36
24 81
26 36

The output would be
18 26 36

because
18 26 36 = 18 26 + 18 36 + 26 36 (all of them are in the input sets)
And I can't buid no more triples using gives sets

Does anybody know an efficient algorithm to use for this problem?
Thanks in advance.
Last edited on
Questions about the parameters of the problem:

-Will the first column of numbers always be sorted from smallest to largest?
-For each entry in the first column, will the adjacent entry in the second column always be larger?
-Can the larger output sets be more than three numbers (A,B,C) : (A,B) (A,C) (B,C)?
Yes, we can assume the input is completely ordered (vertically and horizontally ).
Yes the ouput sets can be larger than three (it depens of how many subsets you can find in the input for forming a family of sets).

But it's important that I cannot use the same set (row) twice.

Thank you for interesting in my question
Could you please specify:

Given:
A1 B1
A2 B2
A3 B3
...

Does (Ax < Bx) always hold true?

In your example case it does:
01 < 22
04 < 58
11 < 34
12 < 23
15 < 87
18 < 26
18 < 36
23 < 36
24 < 81
26 < 36

But will it always?
Of course, it's always

Ax < Bx

And no duplicates allowed.
From these laws, we can assume that in order to have a set of more than two pairs, we must come across two pairs where A(x) = A(x+1).

So to start, try coding something that loops through the list, and calls a conditional where A(x) = A(x-1).

From here you now have A(x) = A(x - 1) = num1, B(x) = num2, and B(x-1) = num3. If you are only considering a set of three pairs, you would look for an occurrence of A = num2 (given num2 < num3). Upon finding it you would check to see if this A's B = num3. If it is, you have found a three set!

If you want to consider any subset size, then upon finding an A = num2, if it is not num3, you would store that A's B value and continue searching for it onward for this new value in future A's as well.
----------------------------------------------------------------------------------------------------------------
Edit: I have come up with something in my head, but won't be implementing it right now. I will look into implementation later, but I encourage (if you haven't already) try to implement it yourself.
Last edited on
Sorry if this comes across as a noob question, but I still don't understand how that input produces that output?
Hi again,
thanks @PrivateRyan for suggestions.
I think my problem is strictly related to the "Shortest common supersequence problem" (https://en.wikipedia.org/wiki/Shortest_common_supersequence_problem)


Answering to @Arslan7041 I can provide a new example:

Please consider,
this set of size five { 01 02 03 04 05 },
it generates the subsets of size four
{ 01 02 03 04 }
{ 01 02 03 05 }
{ 01 02 04 05 }
{ 01 03 04 05 }
{ 02 03 04 05 }

while this other set of five { 01 02 10 13 39 }
it generates
{ 01 02 10 13 }
{ 01 02 10 39 }
{ 01 02 13 39 }
{ 01 10 13 39 }
{ 02 10 13 39 }

(you can easily generate subsets of size four just removing one item each time starting from the five item set)

So, if I am processing the following input
(that is a ordered list of sets of size four, and don’t know which sets of size five I can make yet)

01 02 03 04
01 02 03 05
01 02 04 05
01 02 10 11
01 02 10 12
01 02 10 13
01 02 10 14
01 02 10 35
01 02 10 39
01 02 12 21
01 02 12 30
01 02 12 36
01 02 13 39
01 03 04 05
01 10 13 39
02 10 13 39
02 03 04 05
09 10 11 12

The ouptut should be:
01 02 03 04 05
01 02 10 13 39



Infact, as you can see I found in the input:
-all subsets (a) I need for building the set A= 01 02 03 04 05 (first line of the output)
and also
-all subsets (b) I need for building the set B= 01 02 10 13 29 (second line)

01 02 03 04 (a)
01 02 03 05 (a)
01 02 04 05 (a)
01 02 10 11
01 02 10 12
01 02 10 13 (b)
01 02 10 14
01 02 10 35
01 02 10 39 (b)
01 02 12 21
01 02 12 30
01 02 12 36
01 02 13 39 (b)
01 03 04 05 (a)
01 10 13 39 (b)
02 10 13 39 (b)
02 03 04 05 (a)
09 10 11 12


I think I will try with this approach:
1) I'll examine a line of the input (i.e 01 02 03 04)
2) I'll search until I find a set that have 3 common items (01 02 03 05)
3) If I found second sets with 3 common items with the first one I'll build the 5 items set and I'll generate all subsets of size four (i.e all the (a) sets).
3.1) Then I'll scan again to see if I have all the generated subsets in my input.
- If I have all the generated subset in my input I'll delete (or check as used), then I'll go to the step 1)
- If I don't have all the generated subset in my input I'll repeat the step 2) to search for the next set with 3 common items..(while not end of file)
4) I'll examine the next (unsigned) line of the input and go to step 2)

this until the end of file


Thanks again for your replays.
You have introduced another factor into your problem. I was under the impression that you were working with pairs, but this changes things even more.
I'm looking for a solutions that increase the input sets size of a unit (or more if possible), but if you have a good idea that solve the problem for couples, maybe is possible to apply the same algorithm to larger set size.

I'm still looking through the Internet because I think this is a known problem and I hope to find something alreadymade to solve it in the right way.

Greetings
Topic archived. No new replies allowed.