6

What I want to do is handle intervals efficiently. For example, my intervals look like this:

[10, 20], [15, 25], [40, 100], [5, 14]

The intervals are closed and have integer endpoints, and some of them may overlap. I want to efficiently find the intervals that overlap a given query interval. For example, if [16, 22] is given:

[10, 20], [15, 25]

The intervals above should be reported as the overlapping intervals.
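To pin down what "overlapping" means for closed integer intervals, here is a minimal sketch of the brute-force version of the query. The names `Interval`, `overlaps`, and `find_overlapping` are made up for illustration; this is only the O(n) baseline that a tree-based structure is meant to beat, not a proposed solution.

```cpp
#include <vector>

// Closed integer interval [lo, hi]; a hypothetical helper type for this sketch.
struct Interval {
    int lo;
    int hi;
};

// Two closed intervals overlap when each one starts no later than the other ends.
bool overlaps(const Interval& a, const Interval& b) {
    return a.lo <= b.hi && b.lo <= a.hi;
}

// Linear scan: returns every stored interval that overlaps the query.
std::vector<Interval> find_overlapping(const std::vector<Interval>& stored,
                                       const Interval& query) {
    std::vector<Interval> result;
    for (const Interval& iv : stored) {
        if (overlaps(iv, query)) {
            result.push_back(iv);
        }
    }
    return result;
}
```

With the four intervals from the example and the query [16, 22], the scan keeps [10, 20] and [15, 25] and rejects [40, 100] and [5, 14].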

I'm currently writing an interval tree based on a red-black tree (reference: CLRS, Introduction to Algorithms). Although finding all overlapping intervals can take O(n) in the worst case, a typical query should be faster than that. Note that intervals can be inserted and deleted.


However, I just found that Boost has interval_map and interval_set: http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html

I tried it, but its behavior seems very strange to me. For example, if [2, 7] is inserted first and then [3, 8], the resulting map contains [2, 3), [3, 7], and (7, 8]. That is, inserting a new interval automatically splits the existing ones.

Can I turn this feature off? Or is Boost's interval_map even the right tool for my purpose?

ildjarn
Nullptr

3 Answers

5

You asked for a data structure that could find overlaps efficiently. This does so, by storing overlaps in the data structure. Now you seem to be complaining that it has done so.

This example explains the logic:

typedef std::set<string> guests;
interval_map<time, guests> party;
party += make_pair(interval<time>::right_open(time("20:00"), time("22:00")),
                   guests("Mary"));
party += make_pair(interval<time>::right_open(time("21:00"), time("23:00")),
                   guests("Harry"));
// party now contains
// [20:00, 21:00)->{"Mary"}
// [21:00, 22:00)->{"Harry","Mary"} // guest sets aggregated on overlap
// [22:00, 23:00)->{"Harry"}

When you add two overlapping intervals, you actually create three intervals with distinct properties. The overlap lies in both original intervals, making it logically distinct from either of them. The two original intervals now span times with different properties (some that overlap the other interval, some that don't). This splitting makes it efficient to find overlaps, since they are their own intervals in the map.
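The splitting idea can be imitated with the standard library alone. The following is only a sketch of the concept, not how Boost.ICL is implemented: `Stay`, `Segment`, `split_on_overlap`, and `original_span` are hypothetical names, times are plain integer hours, and every guest is assumed to have exactly one contiguous stay.

```cpp
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Right-open interval [lo, hi) tagged with a guest name (hypothetical type).
struct Stay {
    int lo;
    int hi;
    std::string guest;
};

using Segment = std::pair<int, int>;  // right-open [first, second)
using SegmentMap = std::map<Segment, std::set<std::string>>;

// Cut the timeline at every endpoint and record who is present in each piece.
SegmentMap split_on_overlap(const std::vector<Stay>& stays) {
    std::set<int> cuts;
    for (const Stay& s : stays) {
        cuts.insert(s.lo);
        cuts.insert(s.hi);
    }
    SegmentMap segments;
    if (cuts.empty()) return segments;
    auto left = cuts.begin();
    for (auto right = std::next(left); right != cuts.end(); ++left, ++right) {
        std::set<std::string> present;
        for (const Stay& s : stays) {
            // The piece [*left, *right) lies inside the stay [s.lo, s.hi).
            if (s.lo <= *left && *right <= s.hi) present.insert(s.guest);
        }
        if (!present.empty()) segments[{*left, *right}] = present;
    }
    return segments;
}

// Rebuild a guest's span from the pieces (assumes one contiguous stay per guest).
Segment original_span(const SegmentMap& segments, const std::string& guest) {
    bool found = false;
    Segment span{0, 0};
    for (const auto& entry : segments) {
        if (entry.second.count(guest) == 0) continue;
        if (!found) {
            span = entry.first;
            found = true;
        } else {
            span.second = entry.first.second;  // pieces are visited in ascending order
        }
    }
    return span;
}
```

For Mary staying [20, 22) and Harry staying [21, 23), this yields the three pieces [20, 21), [21, 22), and [22, 23), and Mary's span can still be read back as [20, 22).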

In any event, Boost does allow you to select the interval combining style. So if you want to force a structure that makes it harder to find overlaps, you can do so.

David Schwartz
  • Thanks, but I need to maintain the original intervals. All supported joining operations destroy the original intervals. Thanks, though! – Nullptr Nov 02 '11 at 02:50
  • 1
    The original intervals are still there. If you look at the example, you can trivially see that the interval for 'Mary' is 20-22. They're just encoded in a way that makes overlaps efficient. – David Schwartz Nov 03 '11 at 04:12
  • I see how this satisfies OPs request, but can anyone explain why the Boost docs say: Caution We are introducing interval_maps using an interval map of sets of strings, because of it's didactic advantages. The party example is used to give an immediate access to the basic ideas of interval maps and aggregate on overlap. For real world applications, an interval_map of sets is not necessarily recommended. It has the same efficiency problems as a std::map of std::sets. – Ryan Dec 29 '20 at 00:50
2

I tried boost interval_map and interval_set. They are very inefficient. The setup cost is very high because the implementation basically maps each subinterval (intersection) to all the intervals that contain it.

I think the implementation in CLRS, "Introduction to Algorithms", based on a red-black tree is far better. It is strange that there is no red-black tree implementation out there that allows augmentation, even though std::set and std::map are typically based on red-black trees.
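For reference, the augmentation in question is small: each node additionally stores the largest upper endpoint found in its subtree, which lets a query skip whole subtrees. The sketch below shows only that idea on a plain, unbalanced binary search tree keyed on the lower endpoint. It deliberately omits the red-black rebalancing and deletion that CLRS describes, so it does not give the logarithmic guarantees, and all names (`Node`, `insert`, `query`) are made up for illustration.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Closed integer interval [lo, hi] (hypothetical helper type).
struct Interval {
    int lo;
    int hi;
};

// BST node keyed on lo, augmented with the largest hi in its subtree.
struct Node {
    Interval iv;
    int max_hi;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(Interval i) : iv(i), max_hi(i.hi) {}
};

// Plain (unbalanced) BST insert that updates max_hi on the way down.
void insert(std::unique_ptr<Node>& root, Interval iv) {
    if (!root) {
        root = std::make_unique<Node>(iv);
        return;
    }
    root->max_hi = std::max(root->max_hi, iv.hi);
    insert(iv.lo < root->iv.lo ? root->left : root->right, iv);
}

// Collect every stored interval that overlaps the closed query interval q.
void query(const Node* node, Interval q, std::vector<Interval>& out) {
    // Prune: nothing in this subtree ends at or after q.lo.
    if (!node || node->max_hi < q.lo) return;
    query(node->left.get(), q, out);
    if (node->iv.lo <= q.hi && q.lo <= node->iv.hi) out.push_back(node->iv);
    // Keys in the right subtree are >= node->iv.lo, so skip it if that is past q.hi.
    if (node->iv.lo <= q.hi) query(node->right.get(), q, out);
}
```

Inserting [10, 20], [15, 25], [40, 100], and [5, 14] and querying [16, 22] visits the nodes in key order and reports [10, 20] and [15, 25]; the subtree holding [5, 14] is pruned by the `max_hi` check.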

user723145
1

I think you could use an interval_map<int, set<discrete_interval<int> > >. Whenever you want to add an interval I, just add make_pair(I, II) to the map, where II is a set containing only I. So for the example above, you would do:

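#include <iostream>
#include <set>
#include <boost/icl/interval_map.hpp>

using namespace boost::icl;

// Each map value is the set of original intervals covering that segment.
typedef std::set<discrete_interval<int> > intervals;

// Wrap a single interval in a one-element set.
intervals singleton(const discrete_interval<int> &value) {
    intervals result = { value };
    return result;
}

int main() {
    interval_map<int, intervals> mymap;
    // Note: as far as I know, the two-argument constructor builds right-open
    // intervals, i.e. [2, 7) and [3, 8); discrete_interval<int>::closed(2, 7)
    // should give the closed interval [2, 7] instead.
    discrete_interval<int> i1 = discrete_interval<int>(2, 7);
    discrete_interval<int> i2 = discrete_interval<int>(3, 8);
    // Adding overlapping intervals splits them and unions the sets on the overlap.
    mymap.add(make_pair(i1, singleton(i1)));
    mymap.add(make_pair(i2, singleton(i2)));

    // Print the original intervals that contain each point.
    for (int i = 0; i < 10; ++i) {
        std::cout << "i: " << i << ", intervals: " << mymap(i) << std::endl;
    }
}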

Note that the Boost documentation suggests, at the bottom of this page, that an interval_map of std::sets is not particularly efficient. So you might want to write your own implementation of the set concept, or use a different one than std::set.

Erik P.
  • I tried this code and it doesn't seem to compile. This is the error: no match for ‘operator+=’ (operand types are ‘boost::icl::interval_map >’ and ‘std::pair, boost::icl::closed_interval >’) mymap += make_pair(i1, i1); – user1701545 Dec 04 '13 at 21:27
  • Ah thanks -- I guess I should throw a set in there somewhere. Will edit. – Erik P. Dec 05 '13 at 23:39
  • Actually, I would be grateful to have a compiling example. See my question: http://stackoverflow.com/questions/20387669/using-boost-interval-map/20388063?noredirect=1#comment30444819_20388063 – user1701545 Dec 05 '13 at 23:54
  • Hm - you called me out, and I deserved it. I assumed the code higher up on the page would actually work, and just adapted it; but now I've rewritten the example and actually compiled and run it. – Erik P. Dec 06 '13 at 17:06
  • By "compiled and run it" do you mean successfully or unsuccessfully? – user1701545 Dec 06 '13 at 17:31
  • Successfully. gcc 4.6.3, called with g++ -std=c++0x -g -I boost_1_55_0 -L boost_1_55_0/libs/ foo.cpp. Then run with ./a.out. This is on Ubuntu 12.04. – Erik P. Dec 09 '13 at 15:04
  • It seems that the add operator creates open intervals despite adding only closed intervals. Any idea how to restrict this behavior so that the split intervals that are created during addition will be closed ones? – user1701545 Dec 10 '13 at 20:42
  • Sorry, don't know - not even if it's possible. I assume you are using discrete intervals as I'm doing above? For continuous intervals I'd expect the behaviour you're seeing. – Erik P. Dec 11 '13 at 22:05
  • No problem. It turns out that with discrete, closed intervals, interval_map will create intersected intervals as open intervals upon interval addition. The lower and upper functions will return the open bounds of an open interval but the first and last functions will return the closed bounds, meaning lower+1 and upper-1. – user1701545 Dec 12 '13 at 14:40
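The relationship between the bounds and the first and last elements described in the last comment can be written out for plain integers. This sketch does not use the ICL types: `IntInterval` and its fields are hypothetical, and `first` and `last` only mirror the names mentioned in the comment.

```cpp
// An integer interval with explicit bound flags; a stand-in, not the ICL type.
struct IntInterval {
    int lower;
    int upper;
    bool lower_closed;
    bool upper_closed;
};

// Smallest integer contained in the interval.
int first(const IntInterval& iv) {
    return iv.lower_closed ? iv.lower : iv.lower + 1;
}

// Largest integer contained in the interval.
int last(const IntInterval& iv) {
    return iv.upper_closed ? iv.upper : iv.upper - 1;
}
```

Under these assumptions the open interval (2, 8) and the closed interval [3, 7] contain the same integers, so both have 3 as the first element and 7 as the last.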