List words of text one-by-one

Sep 27, 2014 at 12:02pm
Hello

I want to do something, but I'm not sure how to start.
If I have text in a text file,
example.txt:

The example programs of the previous sections provided little interaction with the user, if any at all. They simply printed simple values on screen, but the standard library provides many additional ways to interact with the user via its input/output features. This section will present a short introduction to some of the most useful.


I want to list all those words beneath each other in another file,

result.txt:

The
Example
Programs
Of
The
Previous
...


How would I do that?

Thanks for reading,
Niely
Sep 27, 2014 at 12:05pm
closed account (48T7M4Gy)
First you need to read the file in line by line, then tokenize it and capitalise the first letter of each word.

Tokenizer ... http://www.cplusplus.com/reference/cstring/strtok/?kw=strtok
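For the capitalising step, a minimal sketch could look like the following. The helper name `capitalize` is made up for illustration and does not come from any post in this thread; it only assumes the standard `std::toupper` from `<cctype>`.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `word` whose first character
// has been converted to upper case. An empty word is returned unchanged.
std::string capitalize(std::string word) {
    if (!word.empty()) {
        // std::toupper expects a value representable as unsigned char;
        // passing a negative char would be undefined behaviour.
        word[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(word[0])));
    }
    return word;
}
```

Applied to each token in turn, this would turn `example` into `Example` and leave `The` as it is.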
Sep 27, 2014 at 12:22pm
Here is the code I used:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using std::ifstream;
using std::ofstream;
using std::getline;
using std::string;
using std::vector;
using std::endl;
using std::cout;
using std::cin;

int main() {
	vector<string> lines;
	vector<string> newlines;
	string path = "C:\\Shared\\Test.txt";
	string newpath = "C:\\Shared\\Test2.txt";
	ifstream file(path.c_str());
	if (!file) {
		cout << "Unable to open " << path << endl;
		return 1;
	}
	// Read every line. Testing getline() itself avoids the extra
	// empty line that a while (!file.eof()) loop would append.
	string temp;
	while (getline(file, temp)) {
		lines.push_back(temp);
	}
	file.close();
	// Split each line at spaces, one word per entry in newlines.
	for (string::size_type x = 0; x < lines.size(); x++) {
		// size_type rather than int, so the comparison with npos is reliable.
		string::size_type a = lines[x].find(" ");
		while (a != string::npos) {
			if (a > 0) {	// skip empty words produced by repeated spaces
				newlines.push_back(lines[x].substr(0, a));
			}
			lines[x] = lines[x].substr(a + 1);
			a = lines[x].find(" ");
		}
		if (!lines[x].empty()) {
			newlines.push_back(lines[x]);	// last word on the line
		}
	}
	ofstream newfile(newpath.c_str());
	for (string::size_type x = 0; x < newlines.size(); x++) {
		newfile << newlines[x] << endl;
	}
	newfile.close();
	cin.get();
	return 0;
}


If you'd like me to explain what parts of it mean then I would be glad to.
Last edited on Sep 27, 2014 at 12:27pm
Sep 27, 2014 at 12:34pm
@Kemort: Thanks a lot for your reply, helped a lot!
But can you explain this code in a bit more detail, please? Including the %s and so on.
Last edited on Sep 27, 2014 at 3:19pm
Sep 27, 2014 at 12:39pm
Do you realize that if all you want to do is write a new file with every word from an existing file on a new line, you could do this in one loop? Consider reading the file word by word, instead of reading an entire line, by using the extraction operator>>.



Sep 27, 2014 at 1:25pm
closed account (48T7M4Gy)
Worth a try, jlb, but the problem might be commas, full stops, slashes etc. in the original text. strtok overcomes this problem if the word (token) delimiter is not just whitespace.
Sep 27, 2014 at 1:58pm
Yes strtok() may be a better solution in a C program or a C++ program that is using C-strings. But if this is a C++ program std::strings should be used and strtok() should be avoided since there are string methods to parse lines of text.

And if you look at the first post you will see that the primary delimiter seems to be the space character. That any punctuation must also be removed has not been stated by the OP, but this can be handled separately.
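Handling punctuation separately could be sketched like this, assuming C++11 (for the lambda) and the default "C" locale. The name `strip_punctuation` is made up for this example; it is one possible approach, not something the OP asked for.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `word` with every character
// that std::ispunct classifies as punctuation removed.
std::string strip_punctuation(std::string word) {
    word.erase(std::remove_if(word.begin(), word.end(),
                              // Taking the character as unsigned char keeps
                              // the call to std::ispunct well defined.
                              [](unsigned char c) { return std::ispunct(c) != 0; }),
               word.end());
    return word;
}
```

Note that this only deletes characters: a word such as `input/output` becomes `inputoutput`. If the slash should separate two words instead, it has to be treated as a delimiter while tokenizing.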
Sep 27, 2014 at 3:25pm

Do you realize that if all you want to do is write a new file with every word from an existing file on a new line, you could do this in one loop? Consider reading the file word by word, instead of reading an entire line, by using the extraction operator>>.


How would you do that?

Also, can someone explain this code line-by-line?
It's exactly what I want:
#include <stdio.h>
#include <string.h>
#include <iostream>

using namespace std;
int main ()
{
  char str[] ="- This, a sample string test; word-1 word-2 under_line, . j ^ _ ° _ù⁼ test.";
  char * pch;
  printf("Splitting string %s \n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}
Sep 27, 2014 at 3:28pm
What language are you using to write your program? C? C++?

Sep 27, 2014 at 5:13pm
^C++.
Sep 27, 2014 at 6:07pm
I also need to know how to hook a text file up to that piece of code.
Sep 27, 2014 at 6:57pm
That code is essentially C.

You have two things that should be kept separate: the echo and the true nature of the streams.

Let's do echo:
void echo( std::istream & in, std::ostream & out ) {
  std::string word;
  while ( /*read from in into word*/ ) {
    /*write word into out*/
    out << '\n';
  }
}

The code that you should put there instead of the comments uses operator>> and operator<<.

Why have a fancy function? What do you do with it?
int main( int argc, char* argv[] ) {
  if ( 2 <= argc ) {
    std::ifstream fin( argv[1] );
    echo( fin, std::cout );
  }
  else {
    echo( std::cin, std::cout );
  }
  return 0;
}

What does that do? If you give at least one command-line argument when running the program, the first argument is used as the name of the input file. If it is run without arguments, std::cin is used. In both cases the output goes to std::cout.

Therefore, either of these would be ok:
a.out < example.txt > result.txt
a.out example.txt > result.txt

You could, obviously, go one step further and make the program use two arguments, one for input and the other for output filename.

The above code depends on
#include <iostream>
#include <fstream>
#include <string> 

Sep 27, 2014 at 9:56pm
Sorry, but I didn't understand that.
How does C++ have echo?

I just want simple C++ code that takes the content of one text file and puts it, word by word, into another...
Just keep it simple;
Sep 28, 2014 at 12:14am
closed account (48T7M4Gy)
#include <stdio.h>
#include <string.h>
#include <iostream>

const int MAX_LINE_LENGTH = 100;

int main()
{
	FILE* source;
	FILE* destination;

	// A line longer than MAX_LINE_LENGTH - 1 characters is read in pieces,
	// so a word that straddles two pieces is split in two.
	char line[ MAX_LINE_LENGTH ];
	char separators[]   = "?!. ,\t\n";
	char* token;

	const char* format = "+%s+ ";	// string literals are const in C++

	source = fopen( "data.txt", "r" );
	destination = fopen( "output.txt", "w" );

	if( source != NULL && destination != NULL )
	{
		while( fgets( line, MAX_LINE_LENGTH, source ) != NULL )
		{
			token = strtok( line, separators );
			while( token != NULL )
			{
				printf( format, token );
				fprintf( destination, format, token );
				token = strtok( NULL, separators );
			}
		}
		printf("\n +++ ENDS +++\n");
	}
	else
		printf( "fopen error\n" );

	// fclose( NULL ) is undefined behaviour, so close only what was opened.
	if( source != NULL )
		fclose( source );
	if( destination != NULL )
		fclose( destination );

	return 0;
}


With a bit of luck it should be self-explanatory: read each line, separate out each word by setting a pointer called 'token', process the token, then move along the line and then down the file.

This does something very close to what you want. Sure it uses C-strings but C++ according to Stroustrup is an extension of C not an alternative. But, sure, if you like do it in purist mode. :-)
Sep 28, 2014 at 1:12am

I just want simple C++ code that takes the content of one text file and puts it, word by word, into another...
Just keep it simple;


So what have you tried? You may want to study the following tutorial: http://www.cplusplus.com/doc/tutorial/files/
It should explain basic C++ file IO.

Sep 28, 2014 at 8:11am
Just keep it simple;

Boring, rigid, not reusable, but whatever:
#include <iostream>
#include <fstream>
#include <string>

void echo( std::istream & in, std::ostream & out ) {
  std::string word;
  while ( /*read from in into word*/ ) {
    /*write word into out*/
    out << '\n';
  }
}

int main() {
  std::ifstream fin( "example.txt" );
  std::ofstream fout( "result.txt" );
  echo( fin, fout );
  return 0;
}

All you need to do is fill in the two commented placeholders inside echo (lines 7 and 8 of the listing).
Sep 28, 2014 at 9:01am
This does something very close to what you want. Sure it uses C-strings but C++ according to Stroustrup is an extension of C not an alternative. But, sure, if you like do it in purist mode. :-)

Love to see that quote, but I suspect it doesn't exist.

A relevant quote might be from Stroustrup's faq:
I have never seen a program that could be expressed better in C than in C++ (and I don't think such a program could exist - every construct in C has an obvious C++ equivalent).

From: http://www.stroustrup.com/bs_faq.html#difference

A C++ solution might look like this:
#include <vector>
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>

std::vector<std::string> tokenize(const std::string& s, const std::string& delim)
{
    std::vector<std::string> result;

    std::size_t prev = 0, next = 0;

    while ((next = s.find_first_of(delim, prev)) != std::string::npos)
    {
        if (std::size_t length = next - prev)
            result.push_back(s.substr(prev, length));

        prev = next + 1;
    }

    if (prev != s.size())
        result.push_back(s.substr(prev));

    return result;
}

int main()
{
    const char* in_file = "example.txt";
    const char* out_file = "result.txt";

    std::ifstream in(in_file);
    if (in.is_open())
    {
        std::ostringstream os;
        os << in.rdbuf();

        std::vector<std::string> tokens = tokenize(os.str(), " \n\t,./\\-!");

        std::ofstream out(out_file);
        if (out.is_open())
        {
            for (auto& token : tokens)
                out << token << '\n';

            //If no C++11 support for ranged for loops use the following loop instead:
            //for (std::vector<std::string>::iterator it = tokens.begin(); it != tokens.end(); ++it)
            //    out << *it << '\n';
        }
        else
            std::cerr << "Unable to open file " << out_file << " for output.\n";
    }
    else
        std::cerr << "Unable to open file " << in_file << " for input.\n";
}
Sep 28, 2014 at 11:33am
My solution is short!
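#include <cctype>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {
    string s;
    ifstream in("example.txt");
    ofstream out("result.txt");

    if (in.is_open() && out.is_open())
    {
        // operator>> skips whitespace, so each iteration reads one word.
        while (in >> s)
        {
            // Capitalise the first letter. The cast avoids undefined
            // behaviour in toupper for characters with negative values.
            s[0] = static_cast<char>(toupper(static_cast<unsigned char>(s[0])));
            out << s << endl;
        }
        in.close();
        out.close();
    }
    else cout << "Unable to open file." << endl;

    return 0;
}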
Last edited on Sep 28, 2014 at 11:53am
Sep 28, 2014 at 6:08pm
^Thanks a lot! :)
Works like a charm and I fully understand it.

Thanks everyone!!!