5

Given a set of N intervals: For each interval, which other interval has a maximal overlap?

e.g. { [0,5], [2,9], [2,3], [4,9] } :

[0,5]: [2,9] (Overlap of 4)

[2,9]: [4,9] (Overlap of 6)

[2,3]: [0,5] or [2,9] (Overlap of 2)

[4,9]: [2,9] (Overlap of 6)

N can be large, so I believe an interval tree is necessary. However, no posts or publications I've found describe an approach to this type of query. The result of a query can lie on any of the 3 paths from an interval tree node (left of center, overlapping center, right of center), as they may or may not include the center point of the query interval. As such, I cannot think of a log(N) traversal method to obtain a result.

Also, for the [2,3] case, I don't care which is picked. Any maximally intersecting interval can be picked arbitrarily. Only 1 result is returned per query.

Can each of these queries be answered in log(N), providing an Nlog(N) overall solution?

EDIT: Pseudocode I worked out:

query(ITnode node, Interval intrv){
    // s_list: start-sorted intervals overlapping center
    // e_list: end-sorted intervals overlapping center
    if intrv.end < node.center:
        node_max = search node.s_list for interval w/ closest start to intrv.start
        return max(node_max, query(node.left_tree, intrv))
    else if intrv.start > node.center:
        node_max = search node.e_list for interval w/ closest end to intrv.end
        return max(node_max, query(node.right_tree, intrv))
    else: // Query overlaps center
        // Still working this out but I get the picture now
}

for each Interval i:
    result[i.id] = query(root, i)
inc_shadow
  • 53
  • 5
  • 2
    +1, interesting question, I would have never given it up to community, would rather have egoistically played with it myself alone – bobah Mar 11 '14 at 08:25

2 Answers2

1

We can solve this problem using interval tree.

Algo:

  1. Construct a set of points S - merge all left and right points of the intervals. For the input from example S = {0, 2, 3, 4, 5, 9}. Complexity - O(NlogN)

  2. Build interval tree with the following structure

    struct tree { int max_overlap_value; int max_overlap_id; int second_max_overlap_value; int second_max_overlap_id; tree *left; tree *right; };

    Also set max_overlap_value = second_max_overlap_value = 0, max_overlap_id = second_max_overlap_id -1. Complexity - O(N)

  3. Update values in tree nodes using intervals from input. Complexity - O(NlogN)

  4. For each interval find max_overlap_value, max_overlap_id, second_max_overlap_value, second_max_overlap_id. If max_overlap_id is equal to interval_id, use second_max_overlap_value, otherwise max_overlap_value. Complexity - O(NlogN)

Total complexity would be O(NlogN)

helThazar
  • 126
  • 2
  • I can also provide c++ code if you need it. But I think that algo is enough. – helThazar Mar 11 '14 at 10:14
  • 1
    Thanks for laying it out for me. I didn't understand exactly how to implement it using the tree values you mentioned, but I believe I reached a similar solution. – inc_shadow Mar 13 '14 at 03:22
0

I believe it can be solved using O(RlogN) time, where R is the maximum number of overlapping intervals with a given query interval i. My approach would be to build an interval tree based on a Balanced Binary Search Tree implementation. Once we build a tree based on the intervals we have to find the set intervals which overlap with the query interval which can be done in O(RlogN) time because finding "a" overlapping interval is logN time. After we do this we can find the maximum overlapping interval in O(R) time. The worst case complexity of this approach is O(NlogN).

working
  • 811
  • 3
  • 11
  • 20