15

I have spent the last few days searching for a stable R-Tree implementation that supports an arbitrary number of dimensions (20 or so would be enough). I only found http://sourceforge.net/projects/jsi/, but it supports only 2 dimensions.

Another option would be a multidimensional implementation of an interval tree.

Maybe I'm completely wrong about using an R-Tree or interval tree for my problem, so I'll state the problem briefly so that you can send me your thoughts on it.

The problem I need to solve is some kind of nearest-neighbour search. I have a set of antennas and rooms, and for each antenna an interval of integers, e.g. antenna 1, min -92, max -85. In fact it could be represented as room -> set of antennas -> interval for antenna. The idea was that each room spans a box in the R-Tree, with one dimension per antenna and the extent in each dimension given by that antenna's interval.

If I get a query with N antennas and a value for each antenna, I could then represent that information as a query point in the same space and retrieve the rooms "nearest" to the point.

I hope you get an idea of the problem and of my approach.

Majid
drame
  • 1
    Never mind, it's an old thread. Note that there are data structures specifically designed to support nearest-neighbour queries, like M-trees. https://en.wikipedia.org/wiki/M-tree – Manuel Arwed Schmidt Aug 10 '14 at 17:09

5 Answers

4

Be aware that R-Trees can degrade badly when you have discrete data. The first thing you really need to find out is an appropriate data representation, then test if your queries work on a subset of the data.

R-Trees will only make your queries faster. If the queries don't work in the first place, an R-Tree will not help. You should test your approach without using R-Trees first. Unless you have a large amount of data (say, 100,000 objects), an in-memory linear scan can easily outperform an R-Tree, in particular when you need an adapter layer because the R-Tree is not well integrated with your code.

The obvious approach here is to just use bounding rectangles, and linearly scan over them. If they work, you can then store the MBRs in an R-Tree to get some performance improvements. But if it doesn't work with a linear scan, it won't work with an R-Tree either (it will not work faster.)
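A minimal sketch of such a linear scan, assuming each room is stored as one box with a [min, max] interval per antenna and that every query supplies a value for every antenna. The `Box` class, the `nearestRoom` helper, the room names, and the choice of squared Euclidean point-to-box distance are all illustrative assumptions, not something taken from the question:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LinearRoomScan {

    /** Axis-aligned box: one [min, max] signal interval per antenna (hypothetical layout). */
    static final class Box {
        final int[] min;
        final int[] max;

        Box(int[] min, int[] max) {
            this.min = min;
            this.max = max;
        }

        /** Squared Euclidean distance from a point to this box; 0 if the point lies inside. */
        long squaredDistanceTo(int[] point) {
            long sum = 0;
            for (int d = 0; d < point.length; d++) {
                long gap = 0;
                if (point[d] < min[d]) {
                    gap = min[d] - point[d];
                } else if (point[d] > max[d]) {
                    gap = point[d] - max[d];
                }
                sum += gap * gap;
            }
            return sum;
        }
    }

    /** Scans every room and returns the name of the closest box, or null if there are no rooms. */
    static String nearestRoom(Map<String, Box> rooms, int[] query) {
        String best = null;
        long bestDist = Long.MAX_VALUE;
        for (Map.Entry<String, Box> e : rooms.entrySet()) {
            long dist = e.getValue().squaredDistanceTo(query);
            if (dist < bestDist) {
                bestDist = dist;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two antennas, two made-up rooms; the values are signal strengths.
        Map<String, Box> rooms = new LinkedHashMap<>();
        rooms.put("room A", new Box(new int[] {-92, -70}, new int[] {-85, -60}));
        rooms.put("room B", new Box(new int[] {-60, -95}, new int[] {-50, -88}));
        System.out.println(nearestRoom(rooms, new int[] {-88, -66}));
    }
}
```

If this scan returns sensible rooms on a sample of real measurements, the same boxes can later be loaded into an index; if it does not, the distance function or the data representation is the thing to fix first.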

Has QUIT--Anony-Mousse
  • Yeah. But for testing I'll first need a working Implementation. ;) – drame Dec 11 '11 at 14:35
  • Yeah, but not of an R-Tree. Just do it with a linear scan! Again; R-trees will only *speed up*, not solve any task you couldn't do before. – Has QUIT--Anony-Mousse Dec 11 '11 at 15:56
  • 1
    The speed up is exactly what I want. And therefore I'm searching for a generic, free, stable Implementation. Such as the native Implementations of TreeMap where a Red-Black-Tree is used in Background. – drame Dec 12 '11 at 08:41
  • 1
    Well, I have concerns that your method will not work, with or without speedups! So test it first before wasting time on adopting some external code. R-Trees *manage* rectangles, but that won't help you at all if your distance function is not helpful (which happens at high dimensionality... see "curse of dimensionality") – Has QUIT--Anony-Mousse Dec 12 '11 at 10:41
  • 1
    Note that R-Trees have gained popularity among game developers as a cache-oblivious multi-dimensional data structure. Especially when you can traverse the tree from a leaf (e.g. a player attacking his _nearby_ enemies) or rarely update the tree, it's a huge performance gain over a "linear scan" of 100k objects. That's my point of view on it. – Manuel Arwed Schmidt Aug 10 '14 at 17:06
  • I like R-trees a lot, but for games I would try quadtrees and gridfiles first. Because most of the time, I would need to scan less than 10 buckets, and these structures are just as cheap as it gets. – Has QUIT--Anony-Mousse Aug 10 '14 at 17:09
3

I have found this R*-Tree implementation in Java which seems to offer many features:

https://github.com/davidmoten/rtree

You might want to check it out!

Phil
3

I'm not entirely clear on what your exact problem is, but an R-Tree or interval tree would not work well in 20 dimensions. That's not a huge number of dimensions, but it is large enough for the curse of dimensionality to begin showing up.

To see what I mean, consider just trying to look at all of the neighbors of a box, including ones off of corners and edges. With 20 dimensions, you'll have 3^20 - 1 or 3,486,784,400 neighboring boxes. (You get that by realizing that along each axis a neighbor can be -1 unit, 0 unit, or +1 unit, but (0,0,0) is not a neighbor because it represents the original box.)
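The count above can be checked with a few lines of Java. This is only a sketch of the arithmetic; the class and method names are made up for illustration:

```java
public class NeighborCount {

    /** Number of boxes adjacent to one box in a regular grid with the given number of dimensions. */
    static long neighborCount(int dimensions) {
        long combos = 1;
        for (int i = 0; i < dimensions; i++) {
            // Each axis contributes three possible offsets: -1, 0, +1.
            combos = Math.multiplyExact(combos, 3L);
        }
        // Exclude the all-zero offset, which is the original box itself.
        return combos - 1;
    }

    public static void main(String[] args) {
        for (int d : new int[] {2, 3, 20}) {
            System.out.println(d + " dimensions: " + neighborCount(d) + " neighbors");
        }
    }
}
```

In 2 dimensions this gives the familiar 8 neighbors of a grid cell and in 3 dimensions 26; the growth by a factor of 3 per added dimension is what makes neighborhood-based searching expensive at 20.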

I'm sorry, but you either need to accept brute force searching, or else analyze your problem better and come up with a cleverer solution.

btilly
  • Yes, I'm aware of the curse of dimensionality. But I would still have tried it with an R-Tree, since 20 dimensions is more or less the worst case. Maybe I could even reduce the number of dimensions in some way. But I would like to test that and compare it to other, maybe better, solutions. – drame Dec 11 '11 at 00:53
  • 1
    It depends a lot on your data. I have successfully used R-Trees on 27+ dimensional color histograms. – Has QUIT--Anony-Mousse Dec 11 '11 at 13:22
0

Another good implementation in Java is ELKI: https://elki-project.github.io/.

user3282611
0

You can use PostgreSQL’s Generalized Search Tree indexing facility.

GiST Quick demo

  • Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline. – Dexter Bengil May 08 '18 at 16:00