Need to copy part of a string to a new string

Hello all,

I am working on a program that will check to see if an infix expression is valid, as defined by grammar given to me by my teacher. I am stumped at a particular section though. I need to take part of the expression, for instance from the original string of (a+b)*c, I need to be able to take (a+b) and make that a new string, while removing it from the original string.

Below is my attempt so far, not the entire code, I can post that if you want but it is NOWHERE near complete, this is just the for statement I am using. Any cout is simply to error check where the program is causing a core dump, and it is always at one part, clearly marked in the code.

1
2
3
4
5
6
7
8
9
10
11
	else if (original.at(position)=='('){
		cout << "Got to the point where it found '('" << endl;
		position=position+1;
		for (i; i<=original.length(); i++){
			cout << "core dump?" << endl;
			if (original.at(i)!=')'){
			cout << "Got inside if?" << endl;
		!HERE!	tempExpress.at(i)=original.at(i); !HERE!
			cout << "at position: " << position << endl;
			}
		}


I am guessing that string.at(i)=string2.at(i) is not a valid expression, but for the life of me I can't figure out another way to have this happen. It would be easy to do the entire thing with an array, but that would limit the maximum size of the expression, and I don't want to do that. Keep in mind please that I am in the first weeks of my second C++ class, so I am by no means advanced.
See if this helps.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include <iostream>
#include <string>
using namespace std;
int main(){
  string original("(a+b)*c"),
         tempExpress,orig(original);
  size_t i(0),position(0); 
  if ( original.at(position) == '(' ){
     for (i=position+1; i<original.length(); i++){
        if ( original.at(i) == ')' ){
             tempExpress = original.substr(position,i-position+1);
             if ( i+1 < original.length() ){
                original = original.substr(i+1);
             } else {
                // The ')' was the last char in the string
                // So, set the original to empty
                original = "";
             }
             break;
        }
     }
  }
  cout << "Using for loop" << endl;
  cout << "--------------" << endl;
  cout << "  orig: " << orig << endl;
  cout << "  original: " << original << endl;
  cout << "  temp: " << tempExpress << endl;
//---------------------------------------
// Or you could use the find function
//---------------------------------------
  original = orig;

  if ( original.at(position) == '(' ){
     if ( (i=original.find(')')) != string::npos){
        tempExpress = original.substr(position,i-position+1);
        if ( i+1 < original.length() ){
           original = original.substr(i+1);
        } else {
           // The ')' was the last char in the string
           // So, set the original to empty
           original = "";
        }
     } 
  }
            
  cout << "Using find()" << endl;
  cout << "--------------" << endl;
  cout << "  orig: " << orig << endl;
  cout << "  original: " << original << endl;
  cout << "  temp: " << tempExpress << endl;

  return 0;
}
$ ./a.out
Using for loop
--------------
  orig: (a+b)*c
  original: *c
  temp: (a+b)
Using find()
--------------
  orig: (a+b)*c
  original: *c
  temp: (a+b)

If parenthesised sub-expressions cannot be nested, the solution is trivial.
a. Find the first '(' in the expression
b. Find the first occurrence of a ')' after that
c. Extract everything in betwen.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <iostream>
#include <string>

const char OPEN_PARANTHESIS = '(' ;
const char CLOSE_PARANTHESIS = ')' ;

std::string extract_subexpression( std::string& expression )
{
    // locate opening OPEN_PARANTHESIS
    const std::size_t start_pos = expression.find( OPEN_PARANTHESIS ) ;
    if( start_pos == std::string::npos ) return "" ; // not found

    // look for a CLOSE_PARANTHESIS
    std::size_t found_pos = expression.find( CLOSE_PARANTHESIS, start_pos+1 ) ;
    if( found_pos ==std::string::npos ) throw "unmatched paranthesis" ;

    // extract the subexpression
    const std::size_t length = found_pos-start_pos + 1 ;
    const std::string subexpr = expression.substr( start_pos, length ) ;
    expression.erase( start_pos, length ) ;
    return subexpr ;
}

int main()
{
    std::string expr = "a + b - c * (d+e) + f / (g+h)" ;
    std::cout << "expression: "<< expr << '\n' ;
    std::string subexpr = extract_subexpression(expr) ;
    std::cout << "extracted: " << subexpr << "\nleaving: " << expr << '\n' ;
}



If parenthesised sub-expressions can be nested, it becomes a bit more interesting. We need keep track of the nesting level, looping till we find the matching ')'.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
#include <iostream>
#include <string>

const char OPEN_PARANTHESIS = '(' ;
const char CLOSE_PARANTHESIS = ')' ;

std::string extract_subexpression( std::string& expression )
{
    // locate opening OPEN_PARANTHESIS
    const std::size_t start_pos = expression.find( OPEN_PARANTHESIS ) ;
    if( start_pos == std::string::npos ) return "" ;

    std::size_t found_pos = start_pos ;
    int open_paranthesis_cnt = 1 ; // keep track of nesting depth

    const char tokens[] = { OPEN_PARANTHESIS, CLOSE_PARANTHESIS, 0 } ;

    // look for the next OPEN_PARANTHESIS or CLOSE_PARANTHESIS
    while( ( found_pos = expression.find_first_of( tokens, found_pos+1 ) )
           != std::string::npos )
    {
        // found another OPEN_PARANTHESIS; increment nesting depth
        if( expression[found_pos] == OPEN_PARANTHESIS ) ++open_paranthesis_cnt ;

        else // found a CLOSE_PARANTHESIS
        {
          --open_paranthesis_cnt ; // decrement nesting depth
          if( open_paranthesis_cnt == 0 ) // found matching CLOSE_PARANTHESIS
          {
              const std::size_t length = found_pos-start_pos + 1 ;
              const std::string subexpr = expression.substr( start_pos, length ) ;
              expression.erase( start_pos, length ) ;
              return subexpr ;
          }
        }
    }

    throw "unmatched paranthesis" ;
    return "" ;
}

int main()
{
    std::string expr = "a + b - c * ( (d+e) + f / (g+h) ) * (i-j) + k" ;
    std::cout << "expression: "<< expr << '\n' ;
    std::string subexpr = extract_subexpression(expr) ;
    std::cout << "extracted: " << subexpr << "\nleaving: " << expr << '\n' ;
}

First of all thank you both for you answers and help, especially JL for the second one. I should have specified that I needed these to be nested. I do have one question though, after looking up throw I see that it needs catch to accompany it in order for it to work like the return statement almost. Without catch what is the

throw "unmatched paranthesis";

doing?

Thanks again for everything guys, this set me down what I hope to be the correct path for the project. I find that my biggest limitation in C++ at the moment is simply not knowing all of the options. I guess that comes with time and studying.

Billy
Yes, the throw, to be useful, needs an associated try-catch construct.

An alternative (using return values to indicate success/failure) would be:

1
2
3
enum result_t { OK = 0, NO_PARANTHESIZED_SUBEXPRESSION = 1, UNMATCHED_PARANTHESIS = 2 };

result_t extract_subexpression( std::string& expression, std::string& subexpression ) ;


I don't know if you have just started learning C++, or if you have already done basic C++, and are doing this as part of Compilers 101. If it is the second, your teacher probably expects you to write a recursive descent mini-parser. http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm

For a reasonably small grammar like this, I would write a simple Pratt parser http://en.wikipedia.org/wiki/Pratt_parser rather than use the heavy-weight shunting yard algorithm.
I just wanted to give everyone an update,

First of all, thank you again JLBorges, I am doing Compilers 101 basically, and that link you sent me really helped me understand what I was working on.

I think I managed to get the program done, just have to turn it in and see if he comes up with errors that I couldn't find in my checking. Thanks again, you guys really helped a lot.

Billy
Topic archived. No new replies allowed.