One solution (in addition to the others already presented here) is to use an interval tree or segment tree (two closely related structures for storing intervals):
http://en.wikipedia.org/wiki/Segment_tree
http://en.wikipedia.org/wiki/Interval_tree
One big advantage of doing it this way is that arbitrary boolean operations (not just subtraction) become trivial, using the same piece of code. There is a standard treatment of this kind of data structure in de Berg et al., Computational Geometry: Algorithms and Applications. To perform any boolean operation (including subtraction) on a pair of trees, you just merge them together. Here is some (admittedly naive) Python code for doing this with unbalanced trees. Balance is not needed for the merge to give the right answer. The really dumb part is the tree construction, which ends up being quadratic because reduce folds the list from left to right, merging in one interval at a time. Anyway, here you go:
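    from functools import reduce  # builtin on Python 2, needs this import on Python 3

    class IntervalTree:
        def __init__(self, h, left, right):
            self.h = h          # split point: left covers values < h, right covers values >= h
            self.left = left    # each child is another IntervalTree or a boolean leaf
            self.right = right

    def merge(A, B, op, l=-float("inf"), u=float("inf")):
        # Combine A and B pointwise with op over the half-open range [l, u).
        if l >= u:  # empty range; >= (not >) so shared split points leave no empty leaf
            return None
        if not isinstance(A, IntervalTree):
            if isinstance(B, IntervalTree):
                # Swap so that A is always the tree, flipping op's arguments to match.
                opT = op
                A, B, op = B, A, (lambda x, y : opT(y, x))
            else:
                return op(A, B)
        left = merge(A.left, B, op, l, min(A.h, u))
        right = merge(A.right, B, op, max(A.h, l), u)
        if left is None:
            return right
        elif right is None or left == right:
            return left
        return IntervalTree(A.h, left, right)

    def to_range_list(T, l=-float("inf"), u=float("inf")):
        # Flatten the tree into inclusive (lo, hi) pairs for the True leaves.
        if isinstance(T, IntervalTree):
            return to_range_list(T.left, l, T.h) + to_range_list(T.right, T.h, u)
        return [(l, u-1)] if T else []

    def range_list_to_tree(L):
        # Union together one small tree per inclusive range (lo, hi).
        return reduce(lambda x, y : merge(x, y, lambda a, b: a or b),
            [ IntervalTree(R[0], False, IntervalTree(R[1]+1, True, False)) for R in L ])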
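On the quadratic construction mentioned above: a minimal sketch of one way around it, assuming the input is a list of inclusive integer ranges as in the code above. It sorts the ranges, coalesces overlapping or touching ones, and then builds the tree by repeated halving. The name build_balanced is made up for this illustration and is not part of the original code; IntervalTree and to_range_list are repeated only so the block runs on its own.

```python
class IntervalTree:
    # Same node layout as in the answer: split point h plus two children,
    # each either another IntervalTree or a boolean leaf.
    def __init__(self, h, left, right):
        self.h = h
        self.left = left
        self.right = right

def build_balanced(ranges):
    """Hypothetical helper: balanced tree from inclusive integer (lo, hi) ranges."""
    # Sort, then coalesce overlapping or touching ranges into a flat list of
    # split points [start0, end0, start1, end1, ...] with half-open ends.
    points = []
    for lo, hi in sorted(ranges):
        if points and lo <= points[-1]:
            points[-1] = max(points[-1], hi + 1)
        else:
            points.extend((lo, hi + 1))

    def build(i, j):
        # Covers leaves i..j, where leaf k lies between points[k-1] and points[k].
        # Odd-numbered leaves fall inside a range, so they are True.
        if i == j:
            return i % 2 == 1
        mid = (i + j) // 2
        return IntervalTree(points[mid], build(i, mid), build(mid + 1, j))

    return build(0, len(points))

def to_range_list(T, l=-float("inf"), u=float("inf")):
    # Same as in the answer, repeated so this sketch runs on its own.
    if isinstance(T, IntervalTree):
        return to_range_list(T.left, l, T.h) + to_range_list(T.right, T.h, u)
    return [(l, u - 1)] if T else []

print(to_range_list(build_balanced([(1100, 1200), (1, 1000)])))
```

The sort dominates the cost, so construction is O(n log n) rather than quadratic. A tree built this way has the same node layout, so it should be usable with merge in the same way as one built by range_list_to_tree.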
I wrote this fairly quickly and didn't test it much, so there could be bugs. Also note that this code works with arbitrary boolean operations, not just differences (you simply pass them as the op argument to merge). The time to evaluate any of these should be close to linear in the combined size of the input and output trees when the inputs are reasonably balanced; with the unbalanced trees built here it can be worse, because each leaf of one tree walks the other tree from its root. As an example, I ran it on the case you provided:
    # Example:
    r1 = range_list_to_tree([ (1, 1000), (1100, 1200) ])
    r2 = range_list_to_tree([ (30, 50), (60, 200), (1150, 1300) ])
    diff = merge(r1, r2, lambda a, b : a and not b)
    print(to_range_list(diff))
And I got the following output:
    [(1, 29), (51, 59), (201, 1000), (1100, 1149)]
This agrees with what you would expect. If you want some other boolean operations, here is how they would work using the same function:
    # Intersection
    merge(r1, r2, lambda a, b : a and b)
    # Union
    merge(r1, r2, lambda a, b : a or b)
    # Xor
    merge(r1, r2, lambda a, b : a != b)
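Since the code above is only lightly tested, one way to gain some confidence is to compare each operation against plain Python sets on small inputs. The following is a self-contained sketch of that idea. It repeats the definitions in compact form so it runs on its own, and it assumes merge treats the range as empty when l >= u, so that split points shared by both trees do not leave empty leaves behind. The helper name covered and the variable names are made up for this check.

```python
from functools import reduce

class IntervalTree:
    def __init__(self, h, left, right):
        self.h, self.left, self.right = h, left, right

def merge(A, B, op, l=-float("inf"), u=float("inf")):
    if l >= u:  # assumption: [l, u) counts as empty when l == u as well
        return None
    if not isinstance(A, IntervalTree):
        if not isinstance(B, IntervalTree):
            return op(A, B)
        # Swap so A is the tree, flipping the argument order of op to match.
        A, B, op = B, A, (lambda x, y, f=op: f(y, x))
    left = merge(A.left, B, op, l, min(A.h, u))
    right = merge(A.right, B, op, max(A.h, l), u)
    if left is None:
        return right
    if right is None or left == right:
        return left
    return IntervalTree(A.h, left, right)

def to_range_list(T, l=-float("inf"), u=float("inf")):
    if isinstance(T, IntervalTree):
        return to_range_list(T.left, l, T.h) + to_range_list(T.right, T.h, u)
    return [(l, u - 1)] if T else []

def range_list_to_tree(L):
    return reduce(lambda x, y: merge(x, y, lambda a, b: a or b),
                  [IntervalTree(lo, False, IntervalTree(hi + 1, True, False))
                   for lo, hi in L])

def covered(ranges):
    # Hypothetical helper: the set of integers covered by inclusive ranges.
    return {x for lo, hi in ranges for x in range(lo, hi + 1)}

a = [(1, 1000), (1100, 1200)]
b = [(30, 50), (60, 200), (1150, 1300)]
ta, tb = range_list_to_tree(a), range_list_to_tree(b)
sa, sb = covered(a), covered(b)

# Each boolean op paired with the equivalent set operation.
checks = [
    ("difference",   lambda x, y: x and not y, sa - sb),
    ("intersection", lambda x, y: x and y,     sa & sb),
    ("union",        lambda x, y: x or y,      sa | sb),
    ("xor",          lambda x, y: x != y,      sa ^ sb),
]
for name, op, expected in checks:
    got = covered(to_range_list(merge(ta, tb, op)))
    print(name, "ok" if got == expected else "MISMATCH")
```

Comparing covered sets instead of range lists keeps the check independent of how the tree happens to split a run of equal values. It is only practical for small integer ranges.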