Problems using string substr()

I am trying to write a program that divides a string into words and shorter sentences.
I want "Word1 word2 word3" to be pushed into an array like this:
{Word1, Word1 word2, Word1 word2 word3, word2, word2 word3, word3}

If you write a sentence, it should be divided at every space and the words pushed to an array. In addition, I want it to push all the shorter sentences you can make from consecutive words of the string, like "this cat" and "want this cat" from the string "I want this cat".
It should also work for any string, so that I can change the string to whatever I want.


This is my code:

#include<iostream>
#include<string>
#include<vector>
using namespace std;

string sentence;

vector <string> splitting(string sentence) {

	vector <int> space_array{ -1 };
	vector <string> wordSplit{};

	int prev = -1;
	int holder = 0;

	do {
		holder = sentence.find(' ', prev + 1);
		if (holder != -1) {
			space_array.push_back(holder);
		}
		prev = holder;
	} while (prev != -1);


	space_array.push_back(sentence.length());

	for (int i = 0; i < space_array.size(); i++) {
		for (int j = 1; j < space_array.size() - i; j++) {
			wordSplit.push_back(sentence.substr(space_array[i] + 1, space_array[j]));
		}


	}

	for (int i = 0; i < space_array.size(); i++) {
		cout << space_array[i] << "...";
	}
	cout << endl << endl;

	return wordSplit;
}




void main() {
	sentence = "first second third fourth fifth sixth";

	vector <string> isSplit = splitting(sentence);

	for (int i = 0; i < isSplit.size(); i++) {
		cout << isSplit[i] << "-";

	}
	cout << endl << endl;

	system("pause");
}


Here is my output:

-1...5...12...18...25...31...37...

first-first second-first second third-first second third fourth-first second thi
rd fourth fifth-first second third fourth fifth sixth-secon-second third-second
third fourt-second third fourth fifth-second third fourth fifth sixth-third-thir
d fourth-third fourth fifth-third fourth fifth sixth-fourt-fourth fifth-fourth f
ifth sixth-fifth-fifth sixth-sixth-


As you can see above, it prints the words and sentences I want. However, once it has covered all the combinations starting with "first", it drops one character from the next word:
it prints "secon" instead of "second".

Does anyone know why this is?


With a different string like:
"word1 word2 word3 word4 word5" I get the desired output:

-1...5...11...17...23...29...

word1-word1 word2-word1 word2 word3-word1 word2 word3 word4-word1 word2 word3 wo
rd4 word5-word2-word2 word3-word2 word3 word4-word2 word3 word4 word5-word3-word
3 word4-word3 word4 word5-word4-word4 word5-word5-


Here is a third case, where I use the string "anna and otto are here".
It gives me this output:

-1...4...8...13...17...22...

anna-anna and-anna and otto-anna and otto are-anna and otto are here-and -and ot
to-and otto are -and otto are here-otto-otto are-otto are here-are -are here-her
e-


At some points there is an extra space that should not be there, as in "-are -".

The numbers on top are the positions of the spaces (plus -1 for the start of the string and the length of the string at the end).

Looking at my use of find() and substr(), I would say my code should work. What did I do wrong?
Thank you in advance!
I want "Word1 word2 word3" to be pushed in to an array like this:...
using the string "word1 word2 word3 word4 word5" gives me the desired output:

The pattern of words in the string from the first quote does not match the pattern in the second.
Should the string have only 3 words (as in the first quote), or can it have more (as in the second quote)? In the latter case, what happens if the total number of words is not a multiple of 3 (also as in the second quote)? Overall, your post is not quite clear.
It should work for any string, so that I may take input from a user. Word1 word2 etc. was meant as an example.

I tried to clean up my post a bit.

What seems (to me) to be the issue is where I use substr(a,b):
It looks like I pass in the desired numbers a and b, but in my output b is sometimes 1 higher or lower than intended,
causing some of the words or sentences to have one character too few or too many.
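
For reference, the second argument of std::string::substr is a character count, not an end position. The inner loop in the posted code also starts j at 1 rather than i + 1, so for every word after the first the "length" comes from a boundary that belongs to an earlier part of the string. That would explain why "word1 word2 ..." looks right: all its words have the same length, so the numbers happen to coincide. Below is a minimal sketch of the same find()/substr() approach with both points changed. It assumes words are separated by single spaces; the name splitting is kept from the post, and everything else is illustrative rather than the only way to do it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns every run of consecutive words in `sentence`.
// Assumption: words are separated by exactly one space.
std::vector<std::string> splitting(const std::string& sentence) {
    // Boundaries: -1 before the string, each space, and the length at the end.
    std::vector<int> spaces{-1};
    for (std::size_t pos = sentence.find(' ');
         pos != std::string::npos;
         pos = sentence.find(' ', pos + 1)) {
        spaces.push_back(static_cast<int>(pos));
    }
    spaces.push_back(static_cast<int>(sentence.length()));

    std::vector<std::string> result;
    for (std::size_t i = 0; i + 1 < spaces.size(); ++i) {
        // First character of the word that follows boundary i.
        const std::size_t start = static_cast<std::size_t>(spaces[i] + 1);
        // Only boundaries after i can close a phrase that starts here.
        for (std::size_t j = i + 1; j < spaces.size(); ++j) {
            // substr takes (position, count), so convert the end position
            // into a length by subtracting the start.
            const std::size_t count = static_cast<std::size_t>(spaces[j]) - start;
            result.push_back(sentence.substr(start, count));
        }
    }
    return result;
}
```

Calling splitting("anna and otto are here") with this version should give "and" and "are" without the trailing space, since each count now stops at the boundary that ends the phrase.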

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

vector<string> splitSentence( string sentence )
{
   vector<string> V;
   stringstream ss( sentence );
   string word;
   while ( ss >> word ) V.push_back( word );
   return V;
}

int main()
{
   string sentence = "one two three four five";
   vector<string> words = splitSentence( sentence );
   int numWords = words.size();

   for ( int start = 0; start < numWords; start++ )
   {
      for ( int seqLen = 1; seqLen <= numWords - start; seqLen++ )
      {
         for ( int w = start; w < start + seqLen; w++ ) cout << words[w] << " ";
         cout << endl;
      }
   }
}


one 
one two 
one two three 
one two three four 
one two three four five 
two 
two three 
two three four 
two three four five 
three 
three four 
three four five 
four 
four five 
five 

Thank you!

The initial idea was to do this with the find(), length() and substr() functions.

However, seeing your code I realise that I can do the same with mine.
I'll just divide it into words and then put them back together like you did.

Thanks for the help :)
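
A rough sketch of that plan, in case it helps anyone reading later: split into words with a string stream, then join consecutive words back together and store each phrase in a vector instead of printing it. The function name allPhrases is made up for this example. Joining with a separator only between words avoids the trailing space seen in the printed output above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every run of consecutive words as one string.
std::vector<std::string> allPhrases(const std::string& sentence) {
    // Split on whitespace; repeated spaces are skipped by operator>>.
    std::vector<std::string> words;
    std::istringstream in(sentence);
    for (std::string word; in >> word; ) {
        words.push_back(word);
    }

    std::vector<std::string> phrases;
    for (std::size_t start = 0; start < words.size(); ++start) {
        std::string phrase;
        // Grow the phrase one word at a time and record each stage.
        for (std::size_t end = start; end < words.size(); ++end) {
            if (!phrase.empty()) phrase += ' ';  // separator only between words
            phrase += words[end];
            phrases.push_back(phrase);
        }
    }
    return phrases;
}
```

For "Word1 word2 word3" this should produce the six entries listed in the first post, in the same order.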