8

I test boost.geometry.index.rtree (boost 1.59 www.boost.org) and superliminal.RTree (http://superliminal.com/sources/sources.htm#C_Code).

To my surprise, superliminal.RTree is more quickly than boost.geometry.index.rtree.

Environment settings

  • add same spatial index data into the superliminal.RTree and boost.geometry.index.rtree object.
  • test same spatial index query for 100 times and get the time consumed.
  • GCC version is "gcc version 4.4.6 20110731 (Red Hat 4.4.6-4) (GCC) ",use "-g -o2 -Wall -finline-functions" compiler flags.
  • use RTree < uint64_t, int, 2, int64_t>

Boost code

namespace bg = boost::geometry; 
namespace bgi = boost::geometry::index; 

typedef bg::model::point < int, 2, bg::cs::cartesian > point_t; 
typedef bg::model::box < point_t > box_t; 
typedef std::pair < box_t, uint64_t > value_t; 
typedef bgi::rtree < value_t, bgi::quadratic < 8, 4 > > rtree_t; 

The result is :

superliminal.RTree 0.029s

boost.geometry.index.rtree 0.12s.

SymbolixAU
  • 22,021
  • 4
  • 47
  • 120
zadecn
  • 129
  • 1
  • 6
  • 2
    What compiler are you using? Are optimizations (-O2) enabled? What and how much data is inserted? Is the same balancing algorithm used in both cases? – Adam Wulkiewicz Oct 08 '15 at 02:55
  • And what are the effective page sizes? – Has QUIT--Anony-Mousse Oct 08 '15 at 06:04
  • GCC version is "gcc version 4.4.6 20110731 (Red Hat 4.4.6-4) (GCC) ". GCC use "-g -o2 -Wall -finline-functions" compiler flags. use RTree and boost as below: namespace bg = boost::geometry; namespace bgi = boost::geometry::index; typedef bg::model::point point_t; typedef bg::model::box box_t; typedef std::pair value_t; typedef bgi::rtree > rtree_t; – zadecn Oct 09 '15 at 01:55
  • If you want to compare the implementations you should create the same or at least similar trees so using the same balanding algorithm if possible and the same number of elements in a tree node or more specifically the same size of a node in memory. But let's keep it simple for now. RTree by default has MAXNODES = 8 and MINNODES = MAXNODES / 2, but bgi::rtree has max 32 in your case and by default min = 0.3*max. So you should make these settings the same in both cases. – Adam Wulkiewicz Oct 09 '15 at 11:47
  • 1
    The next thing is that in the case of bgi::rtree you're creating a R*-tree. And AFAIU in the case of RTree it's the "classic" R-tree with linear or quadratic splitting algorithm (I see corresponding files in the C code but I don't know which one is used in the C++ version). You should use the same algorithm with the `bgi::rtree` so either `bgi::linear` or `bgi::quadratic`. In the readme of the RTree it's also mentioned something about a modification of the original Gutman's balancing algorithms, the bgi::rtree uses original ones, but this probably shouldn't be a problem. – Adam Wulkiewicz Oct 09 '15 at 11:54
  • Is the time above measured for the creation and querying or only the querying? For bgi::rtree are you passing the values into the constructor or rather calling `insert()` method? I'm asking because the constructor doesn't use balancing algorithm but packing algorithm. In both the tree should be created the same way, e.g. by insertion of the elements one by one. Btw, you could edit the question so the future reader wouldn't have to read the comments. Have you considered sharing the code you're using for benchmarking in the question? – Adam Wulkiewicz Oct 09 '15 at 12:01
  • And of course the same `coord_t` should be used in both cases, so AFAIU `int`. – Adam Wulkiewicz Oct 09 '15 at 12:08
  • 1
    Do you plan to compare the implementations the correct way and share the results? I could do it myself but I'd most certainly store different data than you so our results would be different. – Adam Wulkiewicz Oct 21 '15 at 19:27
  • Thanks,I will test it again and show new result. – zadecn Oct 23 '15 at 06:39

1 Answers1

23

It's hard to tell why it's slower in your case because you haven't shared the code nor told how exactly both R-tree implementations are used. You also haven't provided any info about the data you're storing. Those things are very important when you're comparing algorithms or data structures.

I can only guess that you're using Superliminar R-tree the same way as it's used in the attached test file so you're passing a callback function into the Search member function. And I guess you're using Boost.Geometry R-tree the same way as it's showed in the documentation in the Quick Start section so you're passing an object of std::back_insert_iterator into the query member function. So let's check this out.

#include "RTree.h"

#include <vector>

#include <boost/geometry.hpp>
#include <boost/geometry/index/rtree.hpp>

#include <boost/timer.hpp>

struct Rect
{
  Rect()  {}

  Rect(int a_minX, int a_minY, int a_maxX, int a_maxY)
  {
    min[0] = a_minX;
    min[1] = a_minY;

    max[0] = a_maxX;
    max[1] = a_maxY;
  }

  int min[2];
  int max[2];
};

// used with Superliminar R-tree
std::vector<uint64_t> res;
bool MySearchCallback(uint64_t id, void* arg)
{
    res.push_back(id);
    return true;
}

int main()
{
    // randomize rectangles
    std::vector<Rect> rects;
    for (size_t i = 0 ; i < 300000 ; ++i)
    {
        int min_x = rand() % 10000;
        int min_y = rand() % 10000;
        int w = 1 + rand() % 100;
        int h = 1 + rand() % 100;
        rects.push_back(Rect(min_x, min_y, min_x+w, min_y+h));
    }

    // create the rectangle passed into the query
    Rect search_rect(4000, 4000, 6000, 6000);

    // create the Superliminar R-tree
    RTree<uint64_t, int, 2, int64_t> tree;

    // create the Boost.Geometry R-tree
    namespace bg = boost::geometry;
    namespace bgi = boost::geometry::index;
    typedef bg::model::point<int, 2, bg::cs::cartesian> point_t;
    typedef bg::model::box<point_t> box_t;
    typedef std::pair<box_t, uint64_t> value_t;
    bgi::rtree<value_t, bgi::quadratic<8, 4> > bg_tree;

    // Insert values
    for(size_t i = 0; i < rects.size(); i++)
    {
        Rect const& r = rects[i];
        tree.Insert(r.min, r.max, i);

        box_t b(point_t(r.min[0], r.min[1]), point_t(r.max[0], r.max[1]));
        bg_tree.insert(value_t(b, i));
    }

    // test Rtree
    {
        int sum = 0;
        boost::timer t;
        for (size_t i = 0 ; i < 100 ; ++i)
        {
            res.clear();
            sum += tree.Search(search_rect.min, search_rect.max, MySearchCallback, NULL);
        }
        double s = t.elapsed();
        std::cout << s << " " << sum << std::endl;
    }

    // test BG Rtree
    {
        box_t search_box(
            point_t(search_rect.min[0], search_rect.min[1]),
            point_t(search_rect.max[0], search_rect.max[1]));
        size_t sum = 0;
        boost::timer t;
        for (size_t i = 0 ; i < 100 ; ++i)
        {
            std::vector<value_t> res;
            sum += bg_tree.query(bgi::intersects(search_box), std::back_inserter(res));
        }
        double s = t.elapsed();
        std::cout << s << " " << sum << std::endl;
    }
}

The results (with GCC 4.8 -O2 -finline-functions) are:

0.014s for Superliminar R-tree
0.072s for Boost.Geometry R-tree

so they're similar to yours, i.e. one is ~5x faster. Note that in both cases I create a container storing the results (IDs for Superliminar and the whole values for Boost.Geometry). The problem is that R-trees are not used the same way.

Optimization 1

Let's try to get rid of the differences in usage of both R-trees by storing the same results the same way. In order to do this remove temporary std::vector<value_t>. To implement callback replace std::back_insert_iterator with a function object called boost::function_output_iterator. In both cases store only the IDs in std::vector<uint64_t> and hope the compiler will optimize the code.

#include "RTree.h"

#include <vector>

#include <boost/function_output_iterator.hpp>
#include <boost/geometry.hpp>
#include <boost/geometry/index/rtree.hpp>

#include <boost/timer.hpp>

struct Rect
{
  Rect()  {}

  Rect(int a_minX, int a_minY, int a_maxX, int a_maxY)
  {
    min[0] = a_minX;
    min[1] = a_minY;

    max[0] = a_maxX;
    max[1] = a_maxY;
  }

  int min[2];
  int max[2];
};

std::vector<uint64_t> res;

// used with Superliminar R-tree
bool MySearchCallback(uint64_t id, void* arg)
{
    res.push_back(id);
    return true;
}

// used with Boost.Geometry R-tree
struct MySearchCallback2
{
    template <typename Value>
    void operator()(Value const& v)
    {
        res.push_back(v.second);
    }
};

int main()
{
    // randomize rectangles
    std::vector<Rect> rects;
    for (size_t i = 0 ; i < 300000 ; ++i)
    {
        int min_x = rand() % 10000;
        int min_y = rand() % 10000;
        int w = 1 + rand() % 100;
        int h = 1 + rand() % 100;
        rects.push_back(Rect(min_x, min_y, min_x+w, min_y+h));
    }

    // create the rectangle passed into the query
    Rect search_rect(4000, 4000, 6000, 6000);

    // create the Superliminar R-tree
    RTree<uint64_t, int, 2, int64_t> tree;

    // create the Boost.Geometry R-tree
    namespace bg = boost::geometry;
    namespace bgi = boost::geometry::index;
    typedef bg::model::point<int, 2, bg::cs::cartesian> point_t;
    typedef bg::model::box<point_t> box_t;
    typedef std::pair<box_t, uint64_t> value_t;
    bgi::rtree<value_t, bgi::quadratic<8, 4> > bg_tree;

    // Insert values
    for(size_t i = 0; i < rects.size(); i++)
    {
        Rect const& r = rects[i];
        tree.Insert(r.min, r.max, i);

        box_t b(point_t(r.min[0], r.min[1]), point_t(r.max[0], r.max[1]));
        bg_tree.insert(value_t(b, i));
    }

    // test Rtree
    {
        int sum = 0;
        boost::timer t;
        for (size_t i = 0 ; i < 100 ; ++i)
        {
            res.clear();
            sum += tree.Search(search_rect.min, search_rect.max, MySearchCallback, NULL);
        }
        double s = t.elapsed();
        std::cout << s << " " << sum << std::endl;
    }

    // test BG Rtree
    {
        box_t search_box(
            point_t(search_rect.min[0], search_rect.min[1]),
            point_t(search_rect.max[0], search_rect.max[1]));
        size_t sum = 0;
        MySearchCallback2 callback;
        boost::timer t;
        for (size_t i = 0 ; i < 100 ; ++i)
        {
            res.clear();
            sum += bg_tree.query(bgi::intersects(search_box), boost::make_function_output_iterator(callback));
        }
        double s = t.elapsed();
        std::cout << s << " " << sum << std::endl;
    }
}

In this case the results are:

0.014s for Superliminar R-tree
0.033s for Boost.Geometry R-tree

Optimization 2

One more thing that can be done is disabling assertions. There are some in the Boost.Geometry R-tree. After compiling the code with -DNDEBUG the results are even closer:

0.014s for Superliminar R-tree
0.015s for Boost.Geometry R-tree

Conclusion

In this synthetic test case, for randomized data, etc. the results are more or less the same. Again, for you they may be different, I don't know what you're doing exactly so I'm unable to tell you what's the issue.

This is quite simple use case, the results could be different if more complex use case was tested. In other words one should profile a real-life application to see if there are some bottlenecks.

Furthermore Boost.Geometry R-tree's code is a lot more complex. The interfaces of both R-tree implementations are different, in particular both search/query functions requires different parameters and most certainly handles them differently. The compiler may choose to optimize the code differently.

P.S.

With Boost.Geometry R-tree it's possible to use different splitting algorithms and packing algorithm so you can experiment which one is the best in your case. To use packing algorithm you have to pass a range of values into the rtree constructor. For the same data and the numbers of elements and the rtree created using packing algorithm the queries time is 0.005s for me.

Adam Wulkiewicz
  • 1,831
  • 13
  • 18
  • Yes,I use Superliminar R-tree the same way as it's used in the attached test file. Because my data and code are used in the specific products, while the test code is almost copied and therefore can not show here. I use the methods you showed here(NDEBUG and only copy ids),and The result is : superliminal.RTree 0.028s boost.geometry.index.rtree 0.097s. Thanks a lot for your answer! – zadecn Oct 26 '15 at 05:06
  • @zadecn But in the case of BG rtree do you store the values in a container and then extract the IDs from there like in the original code above or fill the container of IDs in a semi-callback function like in the **optimization 1** section? – Adam Wulkiewicz Nov 02 '15 at 13:43
  • 4
    Please note that the `Optimization 1` section suggests as if there was any problem with std::back_inserter. Actually the big change in performance must be due to std::vector moved outside the loop and this results in much less memory allocations. – tamas.kenez Nov 14 '17 at 19:50
  • @tamas.kenez Thanks for clarification. I'll edit the answer. – Adam Wulkiewicz Nov 15 '17 at 12:23