I'm reading from a text file which contains a dictionary in UTF-8.
It has the following structure:
word, bla "bla"\ttranslation
So there is a word and a translation separated by a tab. Both can contain any kind of character except a tab, and in particular blanks (I found one piece of code that claimed to split a string at tabs, but it also split at blanks).
How can I split one line into two (or more) strings at the tab? I've been searching for a while, but can't find anything useful...
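(1) find + substr
#include <string>
#include <tuple>
// Split s at the first TAB. If s has no TAB, the second part is empty.
std::tuple <std::string, std::string>
split_at_tab( const std::string& s )
{
  std::size_t n = s.find( '\t' );
  if (n == std::string::npos) return std::make_tuple( s, std::string() );
  // n + 1 skips the TAB itself so it does not end up in the translation
  return std::make_tuple( s.substr( 0, n ), s.substr( n + 1 ) );
}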
string line, word, translation;
while (getline( f, line ))
{
  std::tie( word, translation ) = split_at_tab( line );
  ...
}
(2) getline
string word, translation;
while (getline( f, word, '\t' ) && getline( f, translation ))
{
  ...
}
(3) stringstream + getline
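string line, word, translation;
while (getline( f, line ))
{
  istringstream ss( line );
  getline( ss, word, '\t' );
  // If the line has no TAB, the second getline fails and leaves
  // translation untouched, so clear the value from the previous line.
  if (!getline( ss, translation )) translation.clear();
  ...
}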
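Options (1) and (3) are more robust, because they read one whole line first and so can handle a line that does not contain a TAB. With option (2), a line without a TAB makes the first getline read past the newline and into the next line. But if you can guarantee that every line in your dictionary is <word> TAB <translation>, then option (2) is the simplest.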
Yes, this helps! The third one is especially great for my use, since I can even handle the second tab that some lines have for word-class information (\t noun). It's already working in my program.
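For lines with more than one tab, the idea behind option (3) extends naturally to a loop that collects every field. This is only a sketch; split_fields is a hypothetical helper name, and it assumes that any field after the second (such as the word class) is optional.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line into all of its TAB-separated fields.
// Only '\t' separates fields; blanks inside a field are left alone.
std::vector<std::string> split_fields( const std::string& line )
{
  std::vector<std::string> fields;
  std::istringstream ss( line );
  std::string field;
  // Each getline call extracts up to the next TAB (or the end of the line).
  while (std::getline( ss, field, '\t' ))
    fields.push_back( field );
  return fields;
}
```

The caller can then check fields.size() to decide whether a line carries just a word and a translation or also a third field. One thing to be aware of: an empty field between two tabs is kept, but a trailing tab at the very end of a line does not produce an extra empty field.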