Why can i tokenize string passed as function parameter but not string retrieved from getline?

I can successfully tokenize the string parameter passed into my function menuCall, but using the same method I cannot properly tokenize a string read from a file with getline, and I do not understand why. My question is why it behaves this way, so that I understand how to fix it.

The first loop properly tokenizes the menucall, properly enters the case using the option number, and opens the file with the name extracted from the call. The second loop tokenizes the string at the commas but does not remove any spaces.

This is my call to the function:
menuCall("1, BookInput.txt", store);

And this is the first line from the text file:
12345, Michael Main, Data Structures, 100.00


void menuCall(string parameters, Bookstore bookstore)
{
    ifstream in; 
    int i, o, option;
    int j = 0, p = 0;
    string optionToken, bookToken, book;
    string *sp, *so;
    Book newBook;
    sp = new string[5];
    so = new string[4];

    while (parameters != "")                                            //Check if the string is empty
    {
        i = parameters.find(",", 0);                                    //Find next comma in the string
        if (i > 0)                                                      //If a comma is found
        {
            optionToken = parameters.substr(0, i);                      //Substring ends at first comma
            parameters.erase(0, i + 1);                                 //Delete substring and comma from original string
            while (optionToken[0] == ' ')                               //Check for spaces at beginning of token
            {
                optionToken.erase(0, 1);                                //Delete spaces at beginning of token
            }
        }
        if (i < 0)                                                      //If no comma is found
        {
            optionToken = parameters;                                   //Substring ends at the end of the original string
            parameters.erase(0, i);                                     //Delete substring from original string
            while (optionToken[0] == ' ')                               //Check for spaces at beginning of token
            {
                optionToken.erase(0, 1);                                //Delete spaces at beginning of token
            }
        }
        sp[j] = optionToken;                                            //Token is added to dynamic array
        j++;                                                            //Keeps track of iterations
    }

    option = stoi(sp[0]);                                               //Change option number from string to usable int

    switch (option)                                                     //Switch to determine what menu option is to be performed
    {
    case(1):                                                            //Case 1
        in.open(sp[1]);                                                 //Open file from extracted token containing file name
        while (!in.eof())
        {
            getline(in, book);                                          //Get string containing book information from input file
            while (book != "")                                          //Check if the string is empty
            {
                o = book.find(",", 0);                                  //Find the next comma in the string
                if (o > 0)                                              //If a comma is found
                {
                    bookToken = book.substr(0, o);                      //Substring ends at first comma
                    book.erase(0, o + 1);                               //Delete substring and comma from original string
                    while (bookToken[0] == ' ')                         //Check for spaces at beginning of token
                    {
                        bookToken.erase(0, 1);                          //Delete spaces at beginning of token
                    }
                }
                if (o < 0)                                              //If no comma is found
                {
                    bookToken = book;                                   //Substring ends at the end of the original string
                    book.erase(0, o);                                   //Delete substring from original string
                    while (bookToken[0] == ' ')                         //Check for spaces at beginning of token
                    {
                        bookToken.erase(0, 1);                          //Delete spaces at beginning of token
                    }
                }
                so[p] = bookToken;                                      //Token is added to dynamic array
                p++;                                                    //Keeps track of iterations
            }
        }
        break;


JLBorges (13770)

#include <iostream>
#include <string>
#include <regex>
#include <sstream>
#include <vector>
#include <cctype>
#include <iomanip>

// 1. tokenise using the regular expressions library
std::vector<std::string> tokenise_1( const std::string& str )
{
    static const std::regex ws_comma_ws( "\\s*,\\s*" ) ;
    // http://en.cppreference.com/w/cpp/regex/regex_token_iterator
    std::sregex_token_iterator begin( str.begin(), str.end(), ws_comma_ws, -1 ), end ;
    return { begin, end } ;
}

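std::string trim( std::string str )
{
    // cast to unsigned char: std::isspace has undefined behaviour for negative char values
    while( !str.empty() && std::isspace( static_cast<unsigned char>( str.back() ) ) ) str.pop_back() ;
    std::size_t pos = 0 ;
    while( pos < str.size() && std::isspace( static_cast<unsigned char>( str[pos] ) ) ) ++pos ;
    return str.substr(pos) ;
}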

// 2. tokenise using an input string stream
std::vector<std::string> tokenise_2( const std::string& str )
{
    std::vector<std::string> result ;

    // http://en.cppreference.com/w/cpp/io/basic_istringstream
    std::istringstream stm(str) ;
    std::string tok ;
    while( std::getline( stm, tok, ',' ) ) result.push_back( trim(tok) ) ;

    return result ;
}

int main()
{
    for( std::string str : { "1, BookInput.txt", "12345, Michael Main, Data Structures, 100.00" } )
    {
        std::cout << std::quoted(str) << "\n\n" ;

        for( std::string tok : tokenise_1(str) ) std::cout << std::quoted(tok) << '\n' ;
        std::cout << '\n' ;

        for( std::string tok : tokenise_2(str) ) std::cout << std::quoted(tok) << '\n' ;
        std::cout << "\n-----------------\n\n" ;
    }
}

http://coliru.stacked-crooked.com/a/c1437e1463e40e2d

Carsomyr (4)

In this instance I cannot use regex and vectors and all of that. I also specifically asked for an explanation of the logic here to further my understanding, and your post doesn't provide that. I also cannot pass the book information into the string manually, I must retrieve it from the file first, and your post doesn't demonstrate that either. Thanks for the input though!

jlb (4973)

My first suggestion is to be careful with std::string::find(): this function returns a size_t, not an int. A size_t is an implementation-defined unsigned type, which means the value can be much larger than what can be held in an int. You really should be using a size_t for the variable holding the return value from this function, and you should be checking against std::string::npos to test for success or failure.

And with your current code, what happens if a comma is the first character in the string?

...
    size_t i;
    while (!parameters.empty())                                            //Loop until the string is empty
    {
        i = parameters.find(",");                                    //Find comma in the string
        if (i != std::string::npos) // Comma found.
...
        else    // Comma not found.
        {
...

Also how are you reformatting the information from the file to match the format required in the function?
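
To make the point about npos concrete, here is a minimal sketch. The helper name nextToken is made up for illustration and does not appear in the code above. It keeps the position in a std::size_t, compares it against std::string::npos, and also handles a comma at index 0. With the original i > 0 / i < 0 tests, a leading comma gives i == 0, so neither branch runs, the string never shrinks, and the loop does not terminate.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): removes the first
// comma-separated token from `s` and returns it with leading spaces stripped.
std::string nextToken(std::string& s)
{
    std::size_t pos = s.find(',');          // size_t, not int
    std::string token;
    if (pos != std::string::npos)           // comma found, possibly at index 0
    {
        token = s.substr(0, pos);           // text before the comma
        s.erase(0, pos + 1);                // drop the token and the comma
    }
    else                                    // no comma: the rest is the token
    {
        token = s;
        s.clear();
    }
    while (!token.empty() && token[0] == ' ')
        token.erase(0, 1);                  // strip leading spaces only
    return token;
}
```

Calling nextToken repeatedly while the string is not empty yields one token per call, and a string such as ",x" produces an empty first token instead of stalling.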

Carsomyr (4)

@jlb I have taken your suggestions and implemented them. At first I got really awkward behavior, but then I noticed the solution wasn't being rebuilt before the debugger ran, so I rebuilt it manually. The first piece now looks like this:

cout << "TOKENIZING OPTION STRING" << endl;
	while (!parameters.empty())											//Check if the string is empty
	{
		i = parameters.find(",", 0);									//Find next comma in the string
		if (i != parameters.npos)										//If a comma is found
		{
			cout << "i is not npos." << endl;
			optionToken = parameters.substr(0, i);						//Substring ends at first comma
			parameters.erase(0, i + 1);									//Delete substring and comma from original string
			while (optionToken[0] == ' ')								//Check for spaces at beginning of token
			{
				cout << "erasing spaces." << endl;
				optionToken.erase(0, 1);								//Delete spaces at beginning of token
			}
		}
		else															//If no comma is found
		{
			cout << "i is npos." << endl;
			optionToken = parameters;									//Substring ends at the end of the original string
			parameters.erase(0, i);										//Delete substring from original string
			while (optionToken[0] == ' ')								//Check for spaces at beginning of token
			{
				cout << "erasing spaces." << endl;
				optionToken.erase(0, 1);								//Delete spaces at beginning of token
			}
		}
		sp[j] = optionToken;											//Token is added to dynamic array
		cout << "token" << j << ": " << optionToken << endl;
		j++;															//Keeps track of iterations
	}

	option = stoi(sp[0]);												//Change option number from string to usable int
	cout << "option: " << option << endl;

The output of that piece looks like this:

TOKENIZING OPTION STRING
i is not npos.
token0: 1
i is npos.
erasing spaces.
token1: BookInput.txt
option: 1

That part still works as intended. This is the 2nd part of the troublesome chunk:

switch (option)														//Switch to determine what menu option is to be performed
	{
	case(1):															//Case 1
		in.open(sp[1]);													//Open file from extracted token containing file name
		while (in)
		{
			getline(in, book);											//Get string containing book information from input file
			cout << endl << "TOKENIZING BOOK STRING" << endl;
			while (!book.empty())										//Check if the string is empty
			{
				o = book.find(",", 0);									//Find the next comma in the string
				if (o != book.npos)										//If a comma is found
				{
					cout << "o is not npos." << endl;
					bookToken = book.substr(0, o);						//Substring ends at first comma
					book.erase(0, o + 1);								//Delete substring and comma from original string
					while (bookToken[0] == ' ')							//Check for spaces at beginning of token
					{
						cout << "erasing spaces." << endl;
						bookToken.erase(0, 1);							//Delete spaces at beginning of token
					}
				}
				else													//If no comma is found
				{
					cout << "o is npos." << endl;
					bookToken = book;									//Substring ends at the end of the original string
					book.erase(0, o);									//Delete substring from original string
					while (bookToken[0] == ' ')							//Check for spaces at beginning of token
					{
						cout << "erasing spaces." << endl;
						bookToken.erase(0, 1);							//Delete spaces at beginning of token
					}
				}
				so[p] = bookToken;										//Token is added to dynamic array
				cout << "token" << p << ": " << bookToken << endl;
				p++;													//Keeps track of iterations
			}

And here is the output:

TOKENIZING BOOK STRING
o is not npos.
token0: 12345
o is not npos.
token1:         Michael Main
o is not npos.
erasing spaces.
token2:                 Data Structures
o is npos.
token3:                 100.00

So again, even with the suggested changes, this just doesn't turn out right and I still don't understand why. The code in both loops is the same, and I have no idea why only one of the tokens in the second piece attempts to erase the leading spaces.
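
One way to investigate output like this is to print the numeric code of each leading character of a token rather than the character itself. The sketch below assumes the unexpected gaps are whitespace of some kind other than ' '. The helper name leadingWhitespaceCodes is hypothetical and not part of the code above.

```cpp
#include <cctype>
#include <string>

// Hypothetical debugging helper: lists the numeric codes of the whitespace
// characters at the start of `token`, separated by single spaces.
// A space has code 32 and a tab has code 9 in ASCII.
std::string leadingWhitespaceCodes(const std::string& token)
{
    std::string codes;
    for (char ch : token)
    {
        unsigned char c = static_cast<unsigned char>(ch);
        if (!std::isspace(c))
            break;                          // stop at the first visible character
        if (!codes.empty())
            codes += ' ';
        codes += std::to_string(static_cast<int>(c));
    }
    return codes;
}
```

Printing leadingWhitespaceCodes(bookToken) next to each token shows at a glance whether a token starts with something the ' ' comparison will never match.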

Carsomyr (4)

I have to apologize, I feel really stupid now. There is nothing wrong with the code at all. I had copy/pasted the book information list from a word document, and upon examining the text file I was using I discovered that we had tabs in with our spaces. I have modified everything and it now has the desired behavior. Thanks for your time!
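
For anyone who cannot clean up the input file, a minimal sketch of a trim that tolerates tabs as well as spaces, using only std::string. The helper name trimLeading is hypothetical and this is not the change made above, which was to fix the text file itself.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading spaces and tabs from `token`.
std::string trimLeading(const std::string& token)
{
    // Index of the first character that is neither a space nor a tab.
    std::size_t first = token.find_first_not_of(" \t");
    if (first == std::string::npos)         // empty, or nothing but spaces/tabs
        return "";
    return token.substr(first);
}
```

In the loops above, this would replace the while (bookToken[0] == ' ') blocks with a single assignment such as bookToken = trimLeading(bookToken);.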

Topic archived. No new replies allowed.
