iterator

Hi

I am trying to populate a map of words and their counts from an input string. For example, the string "great expectations! great news" would produce [(great, 2), (expectations!, 1), (news, 1)].

The code that I have thus far is:
#include <iostream>
#include <map>
#include <string>

using namespace std;

//  must be compiled with g++ --std=c++14 

template <typename C>
void print_any_iterator( const C &  );

template <typename C & s>
map<string,int> count_words ( const C & s ) {
  map<string, int> temp_m;
  for(auto c1 = begin(s), auto c2 = begin(s); c1 != end(s); ++c1) {
    if (*c1 == ' ') {
      c2 = c1; // can one do this?
      temp_m[*] // what can I use as a key?
    }
  }
  return temp_m;
}

int main() {
  string s;
  cout << "Enter line : ";
  getline(cin, s);
  cout << s << endl;
  return 0;
}

template <typename C>
void print_any_iterator( const C & v ) { // New C++ (C++11)
  cout << '<';
  for (auto i = begin(v); i != end(v); ++i)
    cout << ' ' << *i;
  cout << " >";
}


I'm iterating over a string, identifying spaces in the for loop. How can I keep track of a position, so that I know the start and end of a word?

Also, how can I define a part of a string, all characters from the previous space to the current space, as a key to the map, within the iterator?

The code does not work; it is simply my initial attempt.
Why not put the entire line in a string stream, then extract one word at a time from the stream and increment that word's count in a map?
#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main()
{
    std::map<std::string, int> m;

    std::cout << "Enter a line: ";
    std::string line;
    std::getline(std::cin, line);
    std::stringstream ss(line);

    std::string word;
    while(ss >> word) ++m[word];

    for(const auto& i : m) std::cout << "(" << i.first << ", " << i.second << ") ";
}



Enter a line: Great expectations! Great news
(Great, 2) (expectations!, 1) (news, 1)
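
For comparison, the original question asked how to track the start and end of each word with iterators. Here is a minimal sketch of that approach, assuming words are separated only by spaces. The name count_words is borrowed from the question; the is_space lambda and the variable names are illustrative. The idea is that a pair of iterators marking a word's start and end can be passed straight to the std::string constructor to build the map key.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>

// Counts space-separated words in any character sequence that supports
// begin()/end(), e.g. std::string. Illustrative helper, not a library function.
template <typename C>
std::map<std::string, int> count_words(const C& s) {
    std::map<std::string, int> counts;
    auto is_space = [](char c) { return c == ' '; };

    auto first = std::begin(s);
    const auto last = std::end(s);
    while (first != last) {
        // Skip any run of spaces to reach the start of the next word.
        first = std::find_if_not(first, last, is_space);
        // The word ends at the next space, or at the end of the input.
        auto word_end = std::find_if(first, last, is_space);
        if (first != word_end)
            ++counts[std::string(first, word_end)];  // iterator pair builds the key
        first = word_end;
    }
    return counts;
}
```

Calling count_words(s) on the line read by getline returns the map, which can then be printed with the same range-based for loop as above. The string stream version is shorter and also handles tabs and other whitespace, so this sketch is mainly useful for seeing how the iterator bookkeeping works.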
Why not put the entire line in a string stream

You will have to do this eventually, but you also need a way to deal with characters such as periods, parentheses, ampersands, exclamation marks, etc. Unlike in the OP's example, I would count 'expectations' rather than 'expectations!' as the word, since that standardizes it for comparison with any further occurrences of 'expectations'.

In the following code, the my_ctype class (derived from std::ctype<char>) deals with this. The program has more comments than I usually write explaining how, but essentially a my_ctype object is used to construct an augmented locale (here, x; reference: std::locale), which is then passed to the stringstream (here, stream) via the imbue() method. This gives the stream a user-defined set of characters to treat as delimiters. The words are then saved in a vector<string>, and the map<string,int> is built from that vector. If anything is unclear after reading and research, do come back:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <locale>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using namespace std;
//From cppreference.com (mostly): Class ctype encapsulates character classification features. All stream input operations performed
//through std::basic_istream<charT> use the std::ctype<charT> of the locale imbued in the stream to identify whitespace characters
//for input tokenization. A locale, in turn, includes a ctype facet that classifies character types. Such a facet, incorporating
//further characters, could be as follows:
class my_ctype : public ctype<char>{
    private:
        mask my_table[table_size];  //mask is an unspecified bitmask type;
    public:
        my_ctype(size_t refs = 0) : ctype<char>(&my_table[0], false, refs){
            copy_n(classic_table(), table_size, my_table);  //start from the classic "C" classification;
            my_table['-'] = (mask)space;  //classifies each delimiter as whitespace;
            my_table['\''] = (mask)space;
            my_table['('] = (mask)space;
            my_table[')'] = (mask)space;
            my_table['!'] = (mask)space;
            my_table[','] = (mask)space;
            my_table['/'] = (mask)space;
            my_table['.'] = (mask)space;
            my_table['%'] = (mask)space;  //sample set; can be expanded/modified depending on the delimiters being handled;
        }
};

int main(){
    fstream File;
    vector<string> v;
    File.open("F:\\test.txt");
    if(File.is_open()){  //no further error-handling here, OP to consider;
        locale x(locale::classic(), new my_ctype);
            //locale ctor using classic() and the my_ctype facet; the locale owns the facet and deletes it when no longer used;
        string line;
        while(getline(File, line)){  //loop on the read itself; testing eof() first can process a failed read;
            stringstream stream(line);
            stream.imbue(x);  //imbue sets the locale of the stream object;
            copy(istream_iterator<string>(stream), istream_iterator<string>(), back_inserter(v));
                //copies all words in the stream into the vector<string>;
                //stringstream derives from istream, so istream_iterator works on it;
                //use ostream_iterator<string>(cout, "\n") instead of back_inserter(v) to print to screen;
        }
    }
    map<string, int> m;
    for(const auto& itr : v){  //creating the map with the vector elements;
        ++m[itr];
    }
    for(const auto& itr : m){
        cout << itr.first << " : " << itr.second << "\n";  //printing the map;
    }
}

Sample text

Investors have been feeling better about Apple Inc. these days. The company's stock price had climbed 31 percent since closing at a two-year low in May. Expectations changed to reflect the realization that Apple's go-go days are behind it, at least for now. Apple's less-supercharged new reality felt ... fine.
Apple shares have rebounded 31% since a May low as investors grew more comfortable with the company's slowing-to-negative revenue growth.
CEO Tim Cook declined to comment on recent reports that the Apple Watch production would be discontinued shortly on falling demand.

Output
31 : 2
Apple : 5
CEO : 1
Cook : 1
Expectations : 1
Inc : 1
Investors : 1
May : 2
The : 1
Tim : 1
Watch : 1
a : 2
about : 1
are : 1
as : 1
at : 2
be : 1
been : 1
behind : 1
better : 1
changed : 1
climbed : 1
closing : 1
comfortable : 1
comment : 1
company : 2
days : 2
declined : 1
demand : 1
discontinued : 1
falling : 1
feeling : 1
felt : 1
fine : 1
for : 1
go : 2
grew : 1
growth : 1
had : 1
have : 2
in : 1
investors : 1
it : 1
least : 1
less : 1
low : 2
more : 1
negative : 1
new : 1
now : 1
on : 2
percent : 1
price : 1
production : 1
reality : 1
realization : 1
rebounded : 1
recent : 1
reflect : 1
reports : 1
revenue : 1
s : 4
shares : 1
shortly : 1
since : 2
slowing : 1
stock : 1
supercharged : 1
that : 2
the : 3
these : 1
to : 3
two : 1
with : 1
would : 1
year : 1

PS: there is more that can be done, such as making the comparison case-insensitive so that 'At' and 'at' are counted as the same word, but I am not focusing on that aspect here.
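
One possible way to do the case folding mentioned in the PS is to lower-case each word before it is used as a map key. The sketch below is only an illustration: to_lower and count_case_insensitive are hypothetical helper names, and std::tolower only handles single-byte characters in the current C locale, so this assumes plain ASCII text.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Returns a lower-cased copy of a word (hypothetical helper, ASCII text assumed).
std::string to_lower(std::string word) {
    // The unsigned char parameter avoids undefined behaviour in std::tolower
    // when char is signed and holds a negative value.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return word;
}

// Builds the word-count map from a vector of words, ignoring case.
std::map<std::string, int> count_case_insensitive(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const auto& w : words)
        ++counts[to_lower(w)];  // 'At' and 'at' share the key "at"
    return counts;
}
```

In the program above, this would replace the loop that builds the map from the vector. Note that proper nouns such as 'Apple' and 'May' would then be stored in lower case as well.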