Storing words from a file into an array and writing them to a new file in reverse order

I'm trying to figure out how to open a file that contains a lot of words of different lengths, find all the words of a given length, and read them into an array. I then need to write the words from the array to the screen and to a new file in reverse order. For example, if the original file lists words starting with A first, the new file should list words starting with Z first.

This is the code I have right now.

cout << "What size word do you want? ";
cin >> size;

//Input must be between 1 and 30
while (size > 30 || size < 1)
{
    cout << "Please enter a number between 1 and 30\n";
    cout << "Try again: ";
    cin >> size;
}

ifstream inputFile;
ofstream outputFile;

string* arrPtr;

arrPtr = new string[count]; //I got count from a different part of the code
                            //that counts how many words there are of that
                            //size, because I also have to display how many
                            //words of that size there are.
int j = 0;
inputFile.open("List.txt"); 
outputFile.open("List2.txt");


while (getline(inputFile, arrPtr[j]))
{
    inputFile >> arrPtr[j];
    if (arrPtr[j].length() == size)
    {
        outputFile << arrPtr[j];
        cout << arrPtr[j] << " ";
    }
}


cout << endl;

outputFile.close();
inputFile.close();

delete[] arrPtr;


So far it can find the words of a certain size, print them out and store them into a file. However, I can't figure out how to put each word on a different line in the file, or how to read and store the words backward. Also, when I print out the words it seems to skip the first word that is the right size in the file. If I could get some help on this that would be much appreciated. Thanks.
C++, so use std::vector<>
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rsl-arrays
Tutorial: https://cal-linux.com/tutorials/vectors.html

Something like this, perhaps:

#include <string>
#include <vector>
#include <iostream>
#include <fstream>

// get valid size in range from user invariant: minv <= maxv
std::size_t get_size( std::size_t minv = 1, std::size_t maxv = 30 )
{
    std::cout << "word size [" << minv << ',' << maxv << "]: " ;
    std::size_t sz ;

    if( std::cin >> sz && sz >= minv && sz <= maxv ) return sz ;

    // else input failure: clear error state, discard the bad input line and try again
    std::cout << "invalid input. try again\n" ;
    std::cin.clear() ;
    std::cin.ignore( 1'000'000, '\n' ) ;
    return get_size( minv, maxv ) ; // try again
}

// return vector with words of size == sz read from the input stream
std::vector<std::string> get_filtered_words( std::istream& stm, std::size_t sz )
{
    std::vector<std::string> filtered_words ;

    for( std::string word ; stm >> word ; )
            if( word.size() == sz ) filtered_words.push_back(word) ;

    return filtered_words ;
}

// write the contents of the vector, one word per line, in reverse order
void write_reverse( std::ostream& stm, const std::vector<std::string>& words )
{
    // iterate in reverse using the vector's reverse iterator:
    for( auto iter = words.rbegin() ; iter != words.rend() ; ++iter ) stm << *iter << '\n' ;
}

int main()
{
    const std::string input_file_path = "word_list.txt" ;
    const std::string output_file_path = "rev_filtered_word_list.txt" ;

    if( std::ifstream in_file{ input_file_path } ) // if the file was opened for input
    {
        const auto sz = get_size( 1, 30 ) ;
        auto filtered_words = get_filtered_words( in_file, sz ) ;

        if( std::ofstream out_file{ output_file_path } ) write_reverse( out_file, filtered_words ) ;
        else std::cout << "error: failed to open output file\n" ;
    }

    else std::cout << "error: failed to open input file\n" ;
}
Would there be a way to do it with arrays instead of vectors? I don't really know how to use vectors yet.
Thanks for your help though.
> I don't really know how to use vectors yet.

Treat this as an opportunity to learn about them; the invested effort will reap handsome rewards.
Start with a simple tutorial: https://cal-linux.com/tutorials/vectors.html
It's not recommended to use size as a variable name, as there is now a standard function called size (std::size(), since C++17). With using namespace std in effect, the two names can clash.

Without using vector, consider:

#include <iostream>
#include <string>
#include <fstream>

int main()
{
	const size_t count {100};

	size_t sz {};

	std::cout << "What size word do you want? ";
	std::cin >> sz;

	//Input must be between 1 and 30
	while (sz > 30 || sz < 1) {
		std::cout << "Please enter a number between 1 and 30\n";
		std::cout << "Try again: ";
		std::cin >> sz;
	}

	auto arrPtr {new std::string[count]};

	std::ifstream inputFile("List.txt");
	std::ofstream outputFile("List2.txt");

	if (!inputFile || !outputFile)
		return (std::cout << "Cannot open files\n"), 1;

	size_t j {};

	for (std::string wrd; j < count && inputFile >> wrd; )
		if (wrd.length() == sz)
			arrPtr[j++] = wrd;

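	// Start one past the last stored word and decrement before each use, so the
	// pointer never moves before the array, even when no word matched (j == 0).
	for (auto rit = arrPtr + j; rit != arrPtr; ) {
		--rit;
		std::cout << *rit << "  ";
		outputFile << *rit << '\n';
	}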

	delete[] arrPtr;
}


Hello Awesomeness,

Your posted code is good, but could be improved upon.

As posted, the code raises more questions than it answers because of what is missing, such as the include files and "main". Is this code in a function or part of "main"?

It is always best to post enough code that can be compiled and tested because others may see something that you may have missed.

The program uses an input file, so you need to provide it. If the file is large, post a good sample that will work with your program or, if possible, a link to the file.

I see the point of the first while loop and would suggest something like this:
//while (!(std::cout << "What size word do you want? " && std::cin >> size) || size > 30 || size < 1)  // <--- An alternative that prompts and reads in the condition.
while (!std::cin || size > 30 || size < 1)
{
    if (!std::cin)
    {
        std::cerr << "\n     Invalid input! Must be a number.\n\n";

        std::cin.clear();  // <--- Resets the state bits on the stream.
        std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');  // <--- Requires header file <limits>. Clears the input buffer.
    }
    else if (size > 30 || size < 1)
    {
        std::cerr <<
            "Please enter a number between 1 and 30\n"
            "Try again: ";
        std::cin >> size;  // <--- Not needed if you use the alternative while loop.
    }
}

This should work, but is untested until I get all the errors worked out of your code.

The next problem I see is with lines 22 and 23 of your code, the inputFile.open("List.txt") and outputFile.open("List2.txt") calls. How do you know the files are open? You do not, because you never check them. Your program would continue as if nothing were wrong until you try to read the file and get nothing.

When dealing with files and some type of "fstream" I found this to be helpful. It may be a little too much, but in the beginning it helps.
const std::string inFileName{ "" };  // <--- Put File name here.

std::ifstream inFile(inFileName);

if (!inFile)
{
    std::cout << "\n File " << std::quoted(inFileName) << " did not open." << std::endl;  // <--- std::quoted requires header file <iomanip>.
    //std::cout << "\n File \"" << inFileName << "\" did not open." << std::endl;

    return 1;
}

std::string outFileName{ "" };  // <--- Put File name here.

std::ofstream outFile(outFileName);

if (!outFile)
{
    std::cout << "\n File " << std::quoted(outFileName) << " did not open." << std::endl;
    //std::cout << "\n File \"" << outFileName << "\" did not open." << std::endl;

    return 2;
}

Something to note here: Do not be afraid to use long variable names, e.g., "inFileName", as long as it describes what it is. Also you use the variable name "count". "wordCount" would be more descriptive and the code would be easier to understand.

The next while loop:
while (getline(inputFile, arrPtr[j]))
{
    inputFile >> arrPtr[j];

The while condition is fine on its own, but the next line immediately overwrites what getline() just read. Whatever getline() read is discarded before it is tested, which is why the first word in the file never shows up in your output.

I believe the if statement is premature. You say that the output file should be in reverse order, but you are writing to the file in forward order.

I think that instead of writing to the output file straight away, you would need to store the words that fit in a second array and later write that second array to the output file in reverse order. That means starting at the end of the array and working backwards.

Without knowing what the input file looks like it is hard to say if what you have started with actually works.

Andy
Hey Andy,

Thanks for the reply. I noticed that I forgot the main() part right before I went to bed. It is just a part of main, no functions. The original file is pretty big but I do have a sample of it. However, I do not know how to add it to the post.
Hey seeplus,

That seems to do the trick. I do have a few questions about it so that I understand what's happening if you don't mind.

1. Why did you add auto arrPtr {new std::string[count]}; and auto rit, what's the point of the auto part?

2. Why did you add 100 inside of const size_t count {100}; and what is size_t sz {} and size_t j {}? it seems related to the const but I'm stuck on what they actually do.

Sorry, if these are dumb questions, I just want to understand the code so I'm not just blindly using it. Thanks for your help.
> 1. Why did you add auto arrPtr {new std::string[count]}; and auto rit, what's the point of the auto part?
auto is a placeholder type specifier, used to allow the compiler to determine a variable's type by how it is initialized.
https://en.cppreference.com/w/cpp/language/auto

auto arrPtr lets the compiler work out the type from the new expression, which here is a pointer to std::string (std::string*). auto rit gets the type of the for loop variable from the already known arrPtr type, so it is a std::string* as well.
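
To make the deduction visible, here is a small sketch that asks the compiler to confirm the types. The function name auto_deduces_pointer_types is made up for illustration, and the declarations use = rather than braces purely to keep the example simple.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical function mirroring the declarations discussed above.
bool auto_deduces_pointer_types()
{
    const std::size_t count {100};
    auto arrPtr = new std::string[count];  // new std::string[n] yields std::string*
    auto rit = arrPtr + 1;                 // pointer arithmetic keeps the same type

    // These checks happen at compile time; the code would not build if the
    // deduced types were anything other than std::string*.
    static_assert(std::is_same<decltype(arrPtr), std::string*>::value,
                  "arrPtr is a std::string*");
    static_assert(std::is_same<decltype(rit), std::string*>::value,
                  "rit is a std::string*");

    const bool ok = (rit - arrPtr == 1);
    delete[] arrPtr;
    return ok;
}
```

In other words, auto saves typing the type out; it does not change what the type is.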

> 2. Why did you add 100 inside of const size_t count {100}; and what is size_t sz {} and size_t j {}? it seems related to the const but I'm stuck on what they actually do.
size_t is an unsigned integer type. const size_t count {100}; declares a constant named count, of type size_t, initialized with a value of 100.

sz and j are two variables of type size_t, initialized to zero.
https://en.cppreference.com/w/cpp/types/size_t

The braces around 100 and the two sets of empty braces are "uniform initialization." Empty braces value-initialize the variable, which for an integer type means zero.
https://mbevin.wordpress.com/2012/11/16/uniform-initialization/
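
As a quick illustration of what those brace initializers produce, here is a sketch with a made-up function name, brace_init_demo. The variable names are taken from the code being discussed.

```cpp
#include <cstddef>

// Hypothetical function showing the values the brace initializers give.
std::size_t brace_init_demo()
{
    const std::size_t count {100};  // constant with the value 100
    std::size_t sz {};              // empty braces: value-initialized, so 0
    std::size_t j {};               // likewise 0

    // std::size_t bad {-1};        // would not compile: braces reject narrowing

    return count + sz + j;          // 100 + 0 + 0
}
```

The commented-out line shows one practical benefit of the brace form: a value that cannot be represented in the target type is rejected at compile time instead of being silently converted.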

Those are not dumb questions. :)
> I don't really know how to use vectors yet.


Another introduction to vectors: https://www.learncpp.com/cpp-tutorial/an-introduction-to-stdvector/
Hello Awesomeness,


> However, I do not know how to add it to the post.


That is as easy as typing a response. Just copy part of the file and paste it into the message. Or you could use the output tag, middle button top row, under the "Format:" heading to the right of the reply box.

Using this file of 50 words, the first 20 here:

The
Last
Class
The
Story
of
a
Little
Alsatian
I
WAS
very
late
for
school
that
morning
and
I
was



I ended up with this on the screen:


What wordSize word do you want? 4

 Unsorted array:
Last very late that told that know them

 Sorted array:
know last late that that them told very


 Press Enter to continue:



And the output file looks like:

very
told
them
that
that
late
last
know


To achieve this I created a second array to hold the words that match. The problem is that you will not know how many words match unless you go through the first array and count the matches before making a dynamic array. The alternative is to create an array that is large enough without being too large. That is the trick.

When finished creating the second array I went through it changing any upper case letters to lower case before I sorted the array.

The last step was to go through the second array from end to start and write this to the file.

Andy
Another quick question. With seeplus's code, is there a reason that it doesn't explicitly open the file with inputFile.open("List.txt")?
Hello Awesomeness,

Lines 23 and 24 of seeplus's code, the std::ifstream inputFile("List.txt") and std::ofstream outputFile("List2.txt") definitions, use the constructor overload that opens the file as the stream variable is constructed. This is the preferred way to set up some type of "fstream", "ifstream" or "ofstream".

Note that the following if statement does not tell you which file stream failed to open, only that at least one of them did. This may be acceptable to you, or you may want to check each file stream separately.
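
If separate checks are wanted, something along these lines might do. It is a sketch only: the function open_status and its return codes are invented for illustration, and a real program would keep the streams open and use them rather than just report a status.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns 0 if both files opened, 1 if the input file
// failed to open, 2 if the output file failed to open.
int open_status(const std::string& inName, const std::string& outName)
{
    // The constructor opens the file, so no separate open() call is needed.
    std::ifstream inputFile(inName);
    if (!inputFile)
        return 1;

    // Only reached when the input file is open. Note that opening an
    // ofstream creates (or truncates) the output file.
    std::ofstream outputFile(outName);
    if (!outputFile)
        return 2;

    return 0;
}
```

Checking the input stream first also avoids creating an empty output file when there is nothing to read.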

Andy
Hello Andy,

Thanks a lot for your help on this. If I could get your help on one more thing that would be great.
I have to find the largest word in the file. I'm able to find the largest word, which is 27 letters, except there are two words that are 27 letters long. How would I get it to print out both of the words, not just one?

edit: It doesn't actually work anymore. I thought it did, I was wrong. It just brings me to some xstring debugging thing. What is that? and why does it pop up instead of just not running. Cause it runs half of the code just not the last part.

edit2: Fixed it I changed it to a for loop. Still can't figure out how to get both words though

#include <iostream>
#include <string>
#include <fstream>
using namespace std;

int main()
{

ifstream inputFile;
inputFile.open("List.txt");

if (!inputFile) //If files doesn't open
	return cout << "Cannot open file\n",1; 

	string largestWord = "";
	string wrd;
	

	for (int i = 0; inputFile >> wrd; i++)
	{
		if (wrd.length() > largestWord.length())
		{
			largestWord = wrd;

		}
		else;
			
	}
	
	cout << endl << endl<< "The largest word in the file is " << largestWord;
}


Would I need to store both the words into an array of some sort and then print using a for loop or...?
File sample:

ab
ad
ae
lag
log
mid
mil
nan
bumps
bunch
cohog
cohos
xenogamies
yellowtail
yellowwares
zygomophy
ethylenediaminetetraacetate
electroencephalographically
> I'm able to find the largest word which is 27 letters, except there are two words that are 27 letters long.


@OP
As you read the words in you are currently checking whether the current word is bigger than the current biggest. No problem doing that.

However, in that case you need to record the index number of the (now) largest word in the array, not the word itself.

Where you need to allow for more than one biggest, instead of saving the index in a single variable (int index_of_biggest, say) all you need to do is save the index numbers in an array of index numbers. Also, the > test in your if statement above needs to become >= so that ties are recorded, and the saved index numbers must be thrown away whenever a strictly longer word turns up.

std::vector makes this easier because you don't have to concern yourself about estimating the number of duplicates, but the array of indices is easy to implement.

Also, provided your list is only in the form of single words per line, then getline() isn't necessary.
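
A rough sketch of the index-array idea, assuming the words have already been read into an array. The function name longest_indices and its parameters are hypothetical, and the index array is assumed to have room for n entries so that it cannot overflow.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: scans words[0..n) and stores the positions of the
// longest words in idx (which must hold at least n entries). Returns how
// many positions were stored.
std::size_t longest_indices(const std::string* words, std::size_t n, std::size_t* idx)
{
    std::size_t found = 0;    // number of indices currently saved
    std::size_t longest = 0;  // length of the longest word seen so far

    for (std::size_t i = 0; i < n; ++i) {
        if (words[i].length() > longest) {  // strictly longer: start over
            longest = words[i].length();
            found = 0;
        }
        if (words[i].length() >= longest)   // the new longest, or a tie
            idx[found++] = i;
    }
    return found;
}
```

For the words ab, lag, bumps, mid, bunch this stores the positions 2 and 4 and returns 2; printing words[idx[k]] for each saved k then lists every longest word.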
To generate an array of largest length words, consider:

#include <iostream>
#include <string>
#include <fstream>
using namespace std;

int main()
{
	const size_t nomax {20};

	ifstream inputFile("List.txt");

	if (!inputFile)
		return (cout << "Cannot open file\n"), 1;

	string largestWord[nomax];
	size_t nolarge {};

	for (string wrd; inputFile >> wrd; ) {
		if (wrd.length() > largestWord[0].length())
			nolarge = 0;

		if ((nolarge < nomax) && (wrd.length() >= largestWord[0].length()))
			largestWord[nolarge++] = wrd;
	}

	cout << "\nThe largest word(s) in the file are ";
	for (size_t i = 0; i < nolarge; ++i)
		cout << largestWord[i] << "  ";

	cout << '\n';
}

#include <iostream>
#include <string>
#include <fstream>

int main()
{
    const std::string input_file_path = __FILE__ ; // "word_list.txt" ;

    if( std::ifstream in_file{ input_file_path } ) // if the file was opened for input
    {
        std::string longest_words ;
        std::size_t longest_word_size = 0 ;

       // test cases: oversimplification conversationalists disenfranchisement 

        std::string word ;
        while( in_file >> word )
        {
            if( word.size() > longest_word_size )
            {
                longest_words = word ; // replace 
                longest_word_size = word.size() ;
            }
            
            else if( word.size() == longest_word_size ) longest_words += '\n' + word ; // append after a new line
        }

        // print all the longest words
        std::cout << "longest word(s) in the file " << input_file_path << ":\n-------"
                  << "----------\n" << longest_words << '\n' ;
    }

    // test cases: characteristically transcendentalists telecommunications 
    //             disproportionately institutionalising misrepresentations

    else std::cout << "error: failed to open input file " << input_file_path << '\n' ;
}


http://coliru.stacked-crooked.com/a/5b5e53a8c6da12e3
https://rextester.com/WWDE77576
@OP
Despite your green tick, an interesting problem arises if all the words are the same length and there are 5000 of them.

A simple fix to cover that contingency would be to make the index array (a std::vector in time) the same size as the word count, which you already have.

Keep in mind that the extra memory overhead is not much for an average PC to handle, and you no longer have to estimate the number of tied longest words or trap errors when a limit is exceeded.