A question about NLP

I have a training file that looks like this: the/DT fiber/NN ... In each token, the first part is the word and the second part is its tag. I want to count each (word, tag) pair; a word may have several tags. I would like to store the counts in a map or something similar, so each entry looks like: string, string, int. Then I want to count each (tag1, tag2) pair, where tag1 and tag2 are two contiguous tags, such as DT and NN, and store those counts the same way: string, string, int.

Can anyone help me? Thank you.
The input contains words. Each word contains a key and one or more tags, all separated with slashes. Is this correct?

If you write the (word, tag) entries into a std::multimap<string,string>, then the count of tags for each key is an implicit property of the map.
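To illustrate the multimap idea, here is a minimal sketch. The alias WordTags and the helper countWordTag are made-up names for this example, and it assumes one entry is inserted per occurrence of a word/tag token. The number of entries for a word comes straight from multimap::count, while the count for one specific (word, tag) pair needs a scan over equal_range.

```cpp
#include <map>
#include <string>

// Hypothetical alias: word -> tag, one entry per occurrence in the file.
using WordTags = std::multimap<std::string, std::string>;

// Number of times `tag` was recorded for `word`.
int countWordTag(const WordTags& tags, const std::string& word,
                 const std::string& tag) {
    int n = 0;
    const auto range = tags.equal_range(word);  // all entries for this word
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second == tag) ++n;
    }
    return n;
}
```

After inserting ("the", "DT") twice and ("fiber", "NN") once, tags.count("the") gives the total number of tag entries for "the", and countWordTag(tags, "the", "DT") gives the count for that one pair.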

For the (tag1, tag2) counts, an ordinary std::map with the tag pair as key and the count as value would suffice.
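A rough sketch of the std::map approach follows; the same container type works for both the (word, tag) counts and the (tag1, tag2) counts. The names PairCounts, TagStats and countPairs are invented for this example. It assumes tokens are separated by whitespace, that each token carries exactly one tag, and that the tag follows the last slash in the token.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical alias: (first, second) -> count.
using PairCounts = std::map<std::pair<std::string, std::string>, int>;

struct TagStats {
    PairCounts wordTag;  // (word, tag) -> count
    PairCounts tagTag;   // (previous tag, current tag) -> count
};

// Counts pairs in whitespace-separated text such as "the/DT fiber/NN".
TagStats countPairs(const std::string& text) {
    TagStats stats;
    std::istringstream in(text);
    std::string token, prevTag;
    while (in >> token) {
        // Assumption: the tag is everything after the last '/'.
        const std::string::size_type slash = token.rfind('/');
        if (slash == std::string::npos) continue;  // skip untagged tokens
        const std::string word = token.substr(0, slash);
        const std::string tag = token.substr(slash + 1);
        ++stats.wordTag[std::make_pair(word, tag)];
        // Only count a tag pair once a previous tag has been seen.
        if (!prevTag.empty()) ++stats.tagTag[std::make_pair(prevTag, tag)];
        prevTag = tag;
    }
    return stats;
}
```

operator[] value-initializes a missing count to 0 before the increment, so no explicit lookup is needed. To process a file, you could read it into a string (or call the same loop per line; note that doing it per line would not count a tag pair that spans a line break).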