Generating subsequences of string

I need to work out the correct algorithm for generating all the possible subsequences of a string.

Subsequence: http://en.wikipedia.org/wiki/Subsequence

eg.

string:
ABACUS

subsequences:
1 >> A, B, A, C, U, S
2 >> AB, AA, AC, AU, AS, BA, BC, BU, BS, AC, AU, AS, CU, CS, US
3 >> ABA, ABC, ABU, ABS, AAC, AAU, AAS, ACU, ACS, AUS, BAC, BAU, BAS, BCU, BCS, BUS, ACU, ACS, AUS, CUS
4 >> ABAC, ABAU, ABAS, ABCU, ABCS, ABUS, AACU, AACS, ACUS, BACU, BACS, BCUS, ACUS
5 >> etc...

So far, I can work out what the next subsequence is, for the position of each character at least:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
string next(string sub, int max) {

	int length = sub.length();
	int pos = length - 1;
	
	//find first digit that can be increased
	while(pos >= 0)
	{
		if(sub[pos] - '0' == max - (length - 1 - pos))
			pos--;
			
		else
			break;
	}

		sub[pos]++; //increase digit
		
		//update other digits
		for(int a = pos+1; a < length; a++)
			sub[a] = sub[a-1] + 1;
		
		return sub;
		
}


The problem is, this cannot determine when the last subsequence for that length has been found, so my output will look something like this (where the string is 1234567 and the length is 6):


123456
123457
123467
123567
124567
134567
234567
������
������
������	



So, how can I work out when the last subsequence is reached? And is there a specific algorithm for finding all the subsequenes?
Last edited on
I think you're being confused by the letters. As they're not sequential, you probably should consider working with just positions, then filling in the letters at the end.

Also, the list of 3 letter combinations is incomplete. You may want to work thru a few by hand so you can understand the algorithm before you write any code.
Working with positions is what I'm doing at the moment. Well spotted with the 3 letter combinations, I think that was just me being silly.

However, the problem is that I don't know how to work out when I have found the last subsequence, mostly due to weird problems I'm having when I try to implement my ideas. Perhaps I'll try with arrays, instead of strings; I'm more familiar with them.

Thanks for the advice!
This actually confused me a bit. It looks sort of like a permutation but sort of like a substring but not really either. I could see subsets (permutations) but as "ABU" is found nowhere in "ABACUS" is it really a sub-sequence?
Substring means that the letters/digits must be consecutive (eg. ABA), subsequence means any from the order (eg. ABU). I forget the term for any re-arrangements (eg. CBS).
Okay, I think I've got it.

Here is the code for finding all the subsequences of a given string:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
#include<iostream>
#include<string>

using namespace std;

int sub[10];

void next(int max, int length) {

	int pos = length - 1;
	
	//find first digit that can be increased
	while(pos >= 0)
	{
		if(sub[pos] == max - (length - 1 - pos))
			pos--;
			
		else
			break;
	}

		sub[pos]++; //increase digit
		
		//update other digits
		for(int a = pos+1; a < length; a++)
			sub[a] = sub[a-1] + 1;
		
}

int main()
{
	string word;
	cin >> word; 
	
	int max = word.length() - 1; //max value
	

	for(int n=1; n <= max+1; n++)
	{
		
		cout << n << "\n----\n";
	
		for(int i = 0; i < n; i++)
		{
			sub[i] = i;
		}
	
		for(int a = 0; ; a++)
		{				
			for(int b=0; b < max+1; b++)
				cout << word[sub[b]];
				
			cout << '\n';
		
			if(sub[0] == max - (n - 1))
				break;
				
			else
				next(max, n); //maximum value and last position
		}	
		
		cout << '\n';

	}	
	

	return 0;
}


It's a bit of a mess, but it works:


input: 
abacus

output:
1
----
a
b
a
c
u
s

2
----
ab
aa
ac
au
as
ba
bc
bu
bs
ac
au
as
cu
cs
us

3
----
aba
abc
abu
abs
aac
aau
aas
acu
acs
aus
bac
bau
bas
bcu
bcs
bus
acu
acs
aus
cus

4
----
abac
abau
abas
abcu
abcs
abus
aacu
aacs
aaus
acus
bacu
bacs
baus
bcus
acus

5
----
abacu
abacs
abaus
abcus
aacus
bacus

6
----
abacus



The reason I needed the algorithm was to solve this problem: http://codeforces.com/contest/202/problem/A

I can't test my code, because the website won't load for me, but here it is:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
#include<iostream>
#include<string>

using namespace std;

int sub[10];

void next(int max, int length) {

	int pos = length - 1;
	
	//find first digit that can be increased
	while(pos >= 0)
	{
		if(sub[pos] == max - (length - 1 - pos))
			pos--;
			
		else
			break;
	}

		sub[pos]++; //increase digit
		
		//update other digits
		for(int a = pos+1; a < length; a++)
			sub[a] = sub[a-1] + 1;
		
}

int main()
{
	string word;
	cin >> word; 
	
	int max = word.length() - 1; //max value
	
	int largest[10] = {'0','0','0','0','0','0','0','0','0','0'};

	for(int n=1; n <= max+1; n++)
	{
		for(int i = 0; i < n; i++)
		{
			sub[i] = i;
		}
	
		for(int a = 0; ; a++)
		{		
			if(word[sub[0]] == word[sub[n-1]])
			{
				for(int c=0; c < n; c++)
				{
					if(word[sub[c]] > word[largest[c]])
					{
						for(int k=0; k < n; k++)
							largest[k] = sub[k];
							
						break;
					}
					else if(word[sub[c]] < word[largest[c]])
						break;
				}
			}		
					
			if(sub[0] == max - (n - 1))
				break;
			else
				next(max, n); //maximum value and last position
		}	

	}	
	
	for(int j=0; j < max; j++)
		cout << word[largest[j]];
		
	cout << '\n';

	
	return 0;
}


I've tested my code on the examples, and got the right answers, but can't submit yet.
Last edited on
Topic archived. No new replies allowed.