duplicate removal

Could anyone please help

[code removed]


I have a file made up of blocks of data. Each block has some lines containing OrderReceive. These lines are duplicated except for the time part, which sits between the first comma and the second comma. I have to remove the lines with the later date/time and keep the one with the earliest time. I have written a function for extracting the date/time:

[code removed]

I have collected these lines (the ones with OrderReceive) and stored them in another vector.

The code for collecting these lines is as follows:

[code removed]


I don't know how to remove the duplicate lines based on the condition I described. How can I compare the time of the element at one vector iterator with the time of the next one?

could you please help?
1) Create a class that contains a line (as a string) along with the extracted date/time.
2) Implement '<' (less than) operator for this class. It should sort objects based on a date/time.
3) Put all lines into a std::set container.
4) First element of the container will have the earliest date/time.
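The steps above could be sketched roughly as follows. Note one adjustment: a container ordered only by date/time does not by itself group the duplicates, so this sketch keys a std::map on the line with its time field cut out and keeps the earliest line per key. The helper names (`sortable_time`, `keep_earliest`) and the "DD/MM/YYYY HH:MM:SS" layout are assumptions for illustration, since the real data is not shown in the thread, and a C++11 compiler is assumed.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turns "DD/MM/YYYY HH:MM:SS" (assumed layout) into
// "YYYYMMDDHHMMSS", which sorts chronologically as plain text.
// Returns an empty string if the field is too short.
std::string sortable_time(const std::string& t)
{
    if (t.size() < 19) return "";
    return t.substr(6, 4) + t.substr(3, 2) + t.substr(0, 2)
         + t.substr(11, 2) + t.substr(14, 2) + t.substr(17, 2);
}

// For every group of lines that differ only in the time field (the text
// between the first and second comma), keep the line with the earliest time.
std::vector<std::string> keep_earliest(const std::vector<std::string>& lines)
{
    // key: line with the time field cut out -> (sortable time, full line)
    std::map<std::string, std::pair<std::string, std::string> > best;
    std::vector<std::string> order;  // keys in first-seen order

    for (const std::string& line : lines) {
        const std::size_t p1 = line.find(',');
        if (p1 == std::string::npos) continue;
        const std::size_t p2 = line.find(',', p1 + 1);
        if (p2 == std::string::npos) continue;

        const std::string when = sortable_time(line.substr(p1 + 1, p2 - p1 - 1));
        if (when.empty()) continue;  // skip lines without a usable time field
        const std::string key = line.substr(0, p1 + 1) + line.substr(p2);

        auto it = best.find(key);
        if (it == best.end()) {
            best.emplace(key, std::make_pair(when, line));
            order.push_back(key);
        } else if (when < it->second.first) {
            it->second = std::make_pair(when, line);  // earlier duplicate wins
        }
    }

    std::vector<std::string> result;
    for (const std::string& key : order)
        result.push_back(best[key].second);
    return result;
}
```

The result keeps one line per duplicate group, in the order in which each group first appeared in the input.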
Thanks Abramus

There are duplicates, and from each set of duplicates I have to select the one with the earliest time.

[code removed]


Here the first two lines are identical except for the time part, and the last two lines are likewise identical except for their time part. I need to select the second line of each pair (because it has the earlier time), and the final output should be

[code removed]


could you please help?
When I thought about your problem, I came to the conclusion that it's better to write it down. Then I figured out that your get_time() doesn't match the time in the data, but by then it was too late ;) OK, here it comes:

the needed class
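#include <cstdlib>   // atoi
#include <cstring>   // memset
#include <ctime>     // tm, time_t, mktime
#include <string>

class CLine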
{
public:
  CLine() :
    m_Time(time_t()),
    m_Valid(false)
  {
  }
  CLine(const std::string &str) :
    m_Time(time_t()),
    m_Valid(false)
  {
    Extract(str);
  }


public:
  bool operator==(const CLine &line) const
  {
    return (line.m_Part == m_Part);
  }
  bool operator<(const CLine &line) const
  {
    return (m_Time < line.m_Time);
  }

  bool IsValid() const
  {
    return m_Valid;
  }
  const std::string &GetContent() const
  {
    return m_Content;
  }

  void Extract(const std::string &str);

private:
  void get_time(const std::string &str);

private:
  std::string m_Content;
  std::string m_Part;
  time_t m_Time;
  bool m_Valid;
};

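void CLine::get_time(const std::string& s)
{
  // Fixed offsets: day at 0, month at 3, year at 6, hour at 11, minute at 14,
  // second at 17. Adjust them if the time field in the data is laid out differently.
  m_Valid = (s.size() > 15);
  if(m_Valid)
  {
    tm date;

    memset(&date, 0, sizeof(date));

    date.tm_mday = atoi(s.substr(0, 2).c_str());
    date.tm_mon = atoi(s.substr(3, 2).c_str()) - 1;
    date.tm_year = atoi(s.substr(6, 4).c_str()) - 1900;

    date.tm_hour = atoi(s.substr(11, 2).c_str());
    date.tm_min = atoi(s.substr(14, 2).c_str());
    // Read the seconds only if they are present; substr(17, 2) throws
    // std::out_of_range on a 16-character string.
    date.tm_sec = (s.size() >= 19) ? atoi(s.substr(17, 2).c_str()) : 0;

    m_Time = mktime(&date);
  }
}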

void CLine::Extract(const std::string &str)
{
  m_Content = str;

  const std::string::size_type pos1 = str.find(",");
  m_Valid = (pos1 != std::string::npos);
  if(m_Valid)
  {
    const std::string::size_type pos2 = str.find(",", pos1 + 1);
    m_Valid = (pos2 != std::string::npos);

    if(m_Valid)
      get_time(str.substr(pos1 + 1, pos2 - (pos1 + 1)));

    if(m_Valid)
      m_Part = str.substr(0, pos1 + 1) + str.substr(pos2);
  }
}

the rest
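// needs <algorithm> (std::find) and <vector>
std::vector<CLine> l_v;

...

while(getline(is, line))
{
  // Only lines containing "OrderReceive" are candidates for duplicate removal.
  if(line.find("OrderReceive") == std::string::npos)   // npos means "not found"
    continue;

  CLine l(line);
  if(!l.IsValid())
    continue;   // no usable time field between the first two commas

  // operator== compares the lines with their time part removed.
  std::vector<CLine>::iterator it = std::find(l_v.begin(), l_v.end(), l);
  bool push = (it == l_v.end());
  if(!push)
  {
    // A duplicate is already stored: keep whichever has the earlier time.
    push = (l < (*it));
    if(push)
      l_v.erase(it);
  }
  if(push)
    l_v.push_back(l);
}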


l_v is the vector that contains the wanted lines
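For large files, calling std::find for every line makes the loop quadratic. A possible alternative, sketched here under assumptions, is to sort the records by (key, time) and then compare each element with its neighbour, which is also one way to answer the "compare one iterator with the next one" question. The `Record` struct and `earliest_per_key` function are hypothetical names, `key` stands for the line with its time part removed, `when` stands for any value that sorts chronologically (for example the time_t from mktime), and C++11 is assumed.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record: `key` is the line with its time field removed,
// `when` is a timestamp that sorts chronologically, `line` is the full text.
struct Record {
    std::string key;
    long when;
    std::string line;
};

// Sort by (key, when), then keep only the first record of each run of equal keys.
std::vector<Record> earliest_per_key(std::vector<Record> records)
{
    std::sort(records.begin(), records.end(),
              [](const Record& a, const Record& b) {
                  if (a.key != b.key) return a.key < b.key;
                  return a.when < b.when;
              });

    // std::unique compares neighbouring elements and keeps the first one of
    // every run of equal keys; after the sort that is the earliest one.
    auto last = std::unique(records.begin(), records.end(),
                            [](const Record& a, const Record& b) {
                                return a.key == b.key;
                            });
    records.erase(last, records.end());
    return records;
}
```

The trade-off is that the result comes back ordered by key rather than in the original file order, so this only fits if the output order does not matter or is restored afterwards.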
Thanks a LOT coder777 :-). Let me try your code