Algorithmic problem

Pages: 12
can anyone here help me with a problem: given a number of intervals where each interval is a time where a person is in a room, find the times where there is a maximum number of people in the room,

For example, given the intervals

5..8
3..4
13..20
7..10

the answer would be [7,8]

Complexity should be O(n), the O(n^2) solution is quite obvious
Last edited on
Look for overlaps:
                             1  1  1  1  1  1  1  1  1  1  2  2
--1--2--3--4--5--6--7--8--9--0--1--2--3--4--5--6--7--8--9--0--1--

              5........8
        3..4        :  :
                    :  :              13..................20
                    7.......10

Ignore things that don't overlap with anything else. So 3..4 and 13..20 are out. 5..8 and 7..10 overlap: 7..8.

Try this one too:

5..8
3..6
8..20
7..10

Hint: draw it.

Once you see how it works, you should be able to craft a solution.

Hope this helps.
Last edited on
Brute force O(N*M) time, O(K) space, where N is the number of intervals, M is the average duration of an interval and K is the total elapsed time

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <iostream>
#include <unordered_map>

using pair_t = std::pair<int,int> ;

void increment_counts( std::unordered_map<int,int>& counts, pair_t interval )
{ for( int t = interval.first ; t <= interval.second ; ++t ) ++counts[t] ; } // O(M)

int main()
{
    pair_t intervals[] { {5,8}, {3,4}, {13,20}, {7,10} } ;

    std::unordered_map<int,int> counts ;
    for( const pair_t& interval : intervals ) // O(N)
        increment_counts( counts, interval ) ;

    int max_count = 0 ;
    for( const pair_t& p : counts ) // O(N)
        if( p.second > max_count ) max_count = p.second ;

    std::cout << "[ " ;
    for( const pair_t& p : counts ) // O(N)
        if( p.second == max_count ) std::cout << p.first << ' ' ;
    std::cout << "]\n" ;
}
.
http://coliru.stacked-crooked.com/a/d4c220faafee1fe2
Way to do OP's homework for him. Can you imagine him turning something like that in?

Not to mention, how likely do you think it may be that the professor has other data he may try on it? I think it is a better than 50% chance.

11
12
    pair_t intervals[] { {5,9}, {3,4}, {13,20}, {7,10} } ;
[ 7 8 9 ]

Oops!


Sorry to grouch. Just hits one of my pet peeves.
> Can you imagine him turning something like that in?
> how likely do you think it may be that the professor has other data he may try on it?

It is deliberately written with a hard-coded array of intervals for a reason which I thought would have been blatantly obvious - it is not wriien with the intent that it can be submitted verbatim, it merely illustrates an idea for a possible solution.

For the record, I do not assume that every poster who comes to this forum would blindly copy a snippet without trying to understand what it does. Prima facie, I assume that, a person who is enrolled in a course to learn C++ actually wants to learn C++. And happily, in the vast majority of cases that assumption has turned out to be a valid one; the trust and respect that I give to newbies who post in this forum has generally turned out to be well-placed.
Nice commercial.

Unfortunately, I work in education and know better. Further, nothing I said implies that the OP will do anything dishonest. I do think your version of reality is a little extra rosy, though.

What I tried to get across is that the OP (and any other Googler) is unlikely to find this particular piece of code digestible. There is nothing obvious about it, blatantly or otherwise. Particularly beginners in introductory classes. (And, frankly, anything below a CS-400 level is an introductory class.) As this particular question is a CS 100-level kind of question -- maybe 200-level, I assume OP is not too capable yet.

Here's some good reading that may help understand why I don't like posting working solutions:
http://www.samplereality.com/2012/12/18/intrusive-scaffolding-obstructed-learning-and-moocs/


BTW, I am usually very impressed by your code. I'm not in this case, though. unordered_map is significant overhead, and storing every point in an integer range is unnecessary.

My own solution -- a logical extension from what I posted above -- blows yours out of the water.

I used two vectors to store a simple interval class that knows how to perform intersection and used O(N+1) space. Worst case is still O(n*n), but best case is O(n).

O(N*M) is dangerous whenever M approaches (or exceeds) N -- something more likely to fluctuate wildly than N, as M is whatever the largest integer range can be in the worst case.

I am sorry if I offended you about the code. That wasn't my intent, and I see how my wording ("something like that") was inflammatory. If it is any consolation, I am often cowed by your solutions to problems, and often feel my own were silly in comparison.
Another solution:

1. Sort all the interval start times and end times chronologically in one array.
2. Initialize counter to zero, and MAX_NUMBERS to zero.
3. Iterate through the array. If you encounter a start-time, increment the counter.
4. If you encounter an end-time, decrement the counter.
5. Keep updating MAX_NUMBERS accordingly.


6. Iterate once again through the array and note the intervals when MAX_NUMBERS people are there in the room.
Last edited on
Here are the ranges I tested against:

1
2
3
4
5
6
7
8
a 1..9 2..7 12..13 5..8
a 1..9 2..7 12..13
a 1..9 2..7 12..13  19..20 19..20
a 1..9 2..7 12..13  19..20 19..20 19..20
a 1..9 10..13
a
a 1..9 1..9 1..9 1..9 1..9
a 1..9 11..19 20..30 40..50 60..70
 5..7        | OP's original code (three ranges intersect)
 2..7        | modified so that only two intersect
19..20, 2..7 | two sets of ranges intersect (two answers)
19..20       | one set intersects more (one answer)
10..13, 1..9 | no intersections (meaning all ranges are equally peopled)
             | no ranges (no output)
 1..9        | best case
<all>        | worst case

For timing purposes, I used pre-loaded data (like in JLBorges's example) and 1000000 iterations.
The code for the algorithm I stated is as follows. It is an O(NLogN) algorithm since it uses sort.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

struct intervalTime {

    bool isStartTime;
    int time;

};

bool compareTimes( const intervalTime& t1, const intervalTime& t2 ) {
    return t1.time < t2.time;
}

int main()
{
    int N;
    cin >> N;
    vector< intervalTime > timeVec;

    for ( int i = 0; i < N; i++ ) {

        intervalTime newTime;

        cin >> newTime.time;
        newTime.isStartTime = true;
        timeVec.push_back( newTime );

        cin >> newTime.time;
        newTime.isStartTime = false;
        timeVec.push_back( newTime );
    }

    sort( timeVec.begin(), timeVec.end(), compareTimes );

    int peopleCount = 0;
    int maxPeople = 0;

    for ( int i = 0; i < timeVec.size(); i++ ) {

        if ( timeVec[i].isStartTime ) {
            peopleCount++;
            if ( peopleCount > maxPeople )
                maxPeople = peopleCount;
        } else
            peopleCount--;

    }

    vector< pair<int, int> > maxIntervals;
    int currentMaxStartTime = -1;

    for ( int i = 0; i < timeVec.size(); i++ ) {

        if ( timeVec[i].isStartTime ) {
            peopleCount++;
            if ( currentMaxStartTime == -1 && peopleCount == maxPeople ) {
                currentMaxStartTime = timeVec[i].time;
            }
        } else {
            peopleCount--;
            if ( currentMaxStartTime != -1 ) {

                pair<int, int> newInterval;
                newInterval.first = currentMaxStartTime;
                newInterval.second = timeVec[i].time;
                maxIntervals.push_back( newInterval );
                currentMaxStartTime = -1;

            }
        }

    }

    cout << "Maximum number of people was " << maxPeople << " at intervals:" << endl;

    for ( int i = 0; i < maxIntervals.size(); i++ )
        cout << maxIntervals[i].first << " " << maxIntervals[i].second << endl;

    return 0;
}
Last edited on
That there std::sort() is O(n log n), so your algorithm is also. Very nice, though.

I suppose I'll have to post my solution. (Which edges yours only marginally. I'm still impressed with myself...)
For completeness, there's also boost (although still O(n log n) in the number of intervals, since it uses a map)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <iostream>
#include <boost/icl/interval_map.hpp>
#include <boost/range/adaptor/map.hpp>
#include <boost/range/algorithm/max_element.hpp>
namespace icl = boost::icl;
namespace rng = boost::adaptors;
int main()
{
    icl::discrete_interval<int> input[] = {{5,8}, {3,4}, {13,20}, {7,10}};
    icl::interval_map<int, int> party;
    for(auto& i : input)
       party += make_pair(i, 1);

    int max = *max_element(party | rng::map_values);
    std::cout << "max value = " << max << '\n' << "found in the interval(s) ";
    for(const auto& p : party)
        if(p.second == max)
            std::cout << p.first << ' ';
    std::cout << '\n';
}

demo http://coliru.stacked-crooked.com/a/0de6f8abe104d330
Last edited on
@Duoas OOPs! Missed the sort complexity.

Thanks for pointing out. I have edited it.
Nice. Why did you write an open interval?

So... here's mine.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
#include <algorithm>
#include <ciso646>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
#include <utility>
using namespace std;

struct interval_t: public pair <int, int>
  {
  // This ctor isn't actually used in this example...
  interval_t( int first, int second ): pair <int, int> ( first, second ) { }

  // Ctor from string of form "3..72" == [3,72]
  interval_t( const string& s )
    {
    istringstream ss( s );
    ss >> first;
    while (ss.peek() == '.') ss.get();
    ss >> second;
    }

  // Endeavor to modify self as intersection of self and r. Returns true on success.
  bool try_intersect( const interval_t& r )
    {
    bool result = !(r.second < first) and !(r.first > second);
    if (result)
      {
      first  = max( first,  r.first  ); 
      second = min( second, r.second );
      }
    return result;
    }
  };

// Print the interval for the user in "[3,72]" format
ostream& operator << ( ostream& outs, const interval_t& r )
  {
  return outs << "[" << r.first << "," << r.second << "]";
  }

// Call as: "a.exe 5..8 3..4 13..20 7..10"
int main( int argc, char** argv )
  {
  vector <interval_t> intervals( argv + 1, argv + argc );
  vector <interval_t> results;
  size_t              max_count = 0;

  // While there are intervals, 
  //   take the last and try to merge the remaining intervals, 
  //   removing those for which it was successful
  // If we merged at least as many as any previous attempt, 
  //   store the final (merged) interval in the results.
  // The results only contain final intervals obtained by 
  //   merging the maximum number of input intervals.
  while (intervals.size())
    {
    size_t count = 1;
    interval_t r = intervals.back();
    intervals.pop_back();
    for (auto s = intervals.rbegin(); s != intervals.rend(); ++s)
      if (r.try_intersect( *s ))
        {
        intervals.erase( s.base() );
        count++;
        }
    if (count < max_count) continue;
    if (count > max_count) results.clear();
    max_count = count;
    results.push_back( r );
    }

  cout << "Maximum of " << max_count << " people during the following intervals:\n";
  copy( results.begin(), results.end(), ostream_iterator <interval_t> ( cout, "\n" ) );
  }

Yeah... so that's it.

I wonder if OP has learned anything...
Duoas wrote:
Why did you write an open interval?

Because it makes sense with time intervals (are there still 2 people in that room at 8:59?) and because I didn't notice the answer was [7,8] rather than [7,8).

Changing the library-based solution is easy: http://coliru.stacked-crooked.com/a/042ec5bed35f4d43
Last edited on
Wow! Great learning experience for me at least.

@Duoas
Are you using reverse iterators to improve upon "erase" efficiency?

@Cubbi
I have never used boost libraries. It'll be very helpful if you can explain the algorithm. It seems boost can shorten the code quite a lot.
abhishekm71 wrote:
I have never used boost libraries. It'll be very helpful if you can explain the algorithm.

You should at least install them if you're serious about learning C++.

In this case, I used boost interval container library, which already has a container that maps intervals to values and automatically adds up the values when the intervals intersect. The 'algorithm', as it was, was to add your intervals to the interval_map, each with associated value 1. The resulting map already holds the answer (if you do cout << party << '\n' in my program, you'll see ([3,7)->1) ([7,8]->2) ((8,10]->1) ([13,20]->1) in that case), the rest was just extracting it.
@Duoas
Are you using reverse iterators to improve upon "erase" efficiency?

The main reason it is there is to avoid invalidating the iterator when elements are removed. But yes, that is also a reason. Don't know how much of a difference it actually makes without hard profiling data.

Dealing with intervals is actually a common need in computing. So Boost of course has stuff to do it for us:
http://www.boost.org/doc/libs/release/libs/icl/doc/html/index.html

Cubbi didn't write the algorithm, he was just showing us that Boost could do it for us, complete with example.

The algorithm is actually very similar to yours.

The icl::interval_map does the job of tracking which intervals overlap and how many intervals there are for each overlap.

The max_element() algorithm, along with some of Boost's really nifty map adaptors, finds the number of the largest interval.

Finish with a final loop through all the common intervals and print the ones that have the matching largest value.
Last edited on
Ok, I'll get my hands on boost soon enough.

Thank you both of you!
> For completeness, there's also boost (although still O(n log n) in the number of intervals, since it uses a map)
The order is incorrect, check out `party.size()'

> For timing purposes, I used pre-loaded data (like in JLBorges's example) and 1000000 iterations.
I don't see the point in testing against a small number of intervals.
Try n={1e3, 10e3, 100e3} with different input data (no need to iterate)
- random generated intervals
- all intervals intersect
- none intervals intersect
- all intervals are the same
- another degenerate case

then tell me if you are really proud of your approach
@Duoas: wrong answer
$ ./a.out 1..5 2..7 4..10 9..11
Maximum of 2 people during the following intervals:
[9,10]
[2,5]

Expected
Maximum of 3 people during the following intervals:
[4,5]
Pages: 12