3

I'm having trouble understanding this solution to a problem on HackerRank. Please see the solution code below, apparently by Kimiyuki Onaka.

The problem is: given a list of unique numbers, and m queries of the type, "move the current ith to jth elements (l,r) to the beginning", return the final arrangement of the numbers.

Onaka suggests that a treap data structure (one that maintains both priority and binary search) can help solve it in O(m log n). Since I'm not versed in C++, I've tried but failed to conceptualize how a treap could be used. My understanding is that to solve the problem you need log time access to the current ith to jth elements and log time update of the current first element/s and overall order. But I can't see how to conceptualize it.

Ideally, I'd like an explanation in words of how it could be done. Alternatively, just an explanation of what Onaka's code is doing.

Thanks!

#include <iostream>
#include <tuple>
#include <random>
#include <memory>
#define repeat(i,n) for (int i = 0; (i) < (n); ++(i))
using namespace std;

template <typename T>
struct treap {
    typedef T value_type;
    typedef double key_type;
    value_type v;
    key_type k;
    shared_ptr<treap> l, r;
    size_t m_size;
    treap(value_type v)
            : v(v)
            , k(generate())
            , l()
            , r()
            , m_size(1) {
    }
    static shared_ptr<treap> update(shared_ptr<treap> const & t) {
        if (t) {
            t->m_size = 1 + size(t->l) + size(t->r);
        }
        return t;
    }
    static key_type generate() {
        static random_device device;
        static default_random_engine engine(device());
        static uniform_real_distribution<double> dist;
        return dist(engine);
    }
    static size_t size(shared_ptr<treap> const & t) {
        return t ? t->m_size : 0;
    }
    static shared_ptr<treap> merge(shared_ptr<treap> const & a, shared_ptr<treap> const & b) { // destructive
        if (not a) return b;
        if (not b) return a;
        if (a->k > b->k) {
            a->r = merge(a->r, b);
            return update(a);
        } else {
            b->l = merge(a, b->l);
            return update(b);
        }
    }
    static pair<shared_ptr<treap>, shared_ptr<treap> > split(shared_ptr<treap> const & t, size_t i) { // [0, i) [i, n), destructive
        if (not t) return { shared_ptr<treap>(), shared_ptr<treap>() };
        if (i <= size(t->l)) {
            shared_ptr<treap> u; tie(u, t->l) = split(t->l, i);
            return { u, update(t) };
        } else {
            shared_ptr<treap> u; tie(t->r, u) = split(t->r, i - size(t->l) - 1);
            return { update(t), u };
        }
    }
    static shared_ptr<treap> insert(shared_ptr<treap> const & t, size_t i, value_type v) { // destructive
        shared_ptr<treap> l, r; tie(l, r) = split(t, i);
        shared_ptr<treap> u = make_shared<treap>(v);
        return merge(merge(l, u), r);
    }
    static pair<shared_ptr<treap>,shared_ptr<treap> > erase(shared_ptr<treap> const & t, size_t i) { // (t \ t_i, t_t), destructive
        shared_ptr<treap> l, u, r;
        tie(l, r) = split(t, i+1);
        tie(l, u) = split(l, i);
        return { merge(l, r), u };
    }
};

typedef treap<int> T;
int main() {
    int n; cin >> n;
    shared_ptr<T> t;
    repeat (i,n) {
        int a; cin >> a;
        t = T::insert(t, i, a);
    }
    int m; cin >> m;
    while (m --) {
        int l, r; cin >> l >> r;
        -- l;
        shared_ptr<T> a, b, c;
        tie(a, c) = T::split(t, r);
        tie(a, b) = T::split(a, l);
        t = T::merge(T::merge(b, a), c);
    }
    repeat (i,n) {
        if (i) cout << ' ';
        shared_ptr<T> u;
        tie(t, u) = T::erase(t, 0);
        cout << u->v;
    }
    cout << endl;
    return 0;
}
גלעד ברקן
  • 21,095
  • 3
  • 19
  • 57
  • 1
    As far as I can say, this code looks more like a [rope data structure](https://en.wikipedia.org/wiki/Rope_(data_structure)) than a binary search tree. Tree balance is maintained (probabilistically) using added heap property over random keys (but, there is _no_ binary search property over either keys or values). I did an [equivalent approach in Haskell](https://www.hackerrank.com/contests/zalando-codesprint/challenges/give-me-the-order/submissions/code/6018300). – behzad.nouri Jun 10 '16 at 15:19
  • 1
    @behzad.nouri thanks! I read about the rope structure. that makes a lot more sense now, although I have to go over exactly how it helps the solution. Thanks for pointing me in the right direction. – גלעד ברקן Jun 10 '16 at 20:10

1 Answers1

2

Perhaps some pictures of the data structure as it processes the sample input would be helpful.

First, the six numbers "1 2 3 4 5 6" are inserted into the treap. Each one is associated with a randomly generated double, which determines if it goes above or below other nodes. The treap is always ordered so that all of a node's left children come before it, and all its right children come after.

Insert 1

Insert 2

Insert 3

Insert 4

Insert 5

Insert 6

Then we start moving intervals to the beginning. The treap is split into three parts—one with the first l-1 nodes, one with the nodes in the interval, and the last nodes. Then they are re-merged in a different order.

First, the interval [4,5] is moved: Move 4,5

Now the treap's order is 4, 5, 1, 2, 3, 6. (The root 4 comes first, because it has no left child; 3 is preceded by its left child 2, which is preceded by its own left child 5; then comes 5's right child 1; then 2, then 3, then 6.) The nodes keep track of the size of each subtree (m_size).

Given [3,4], we first call split(t,4), which should return a pair: one treap with the first 4 elements, and another one with the rest.

The root node (4) does not have 4 things under its left subtree, so it recurses with split(t->r, 3). This node (3) does have 3 things under its left subtree, so it calls split(t->l, 3). Now we are at node (2). It calls split(t->r, 0), but it does not have a right child, so this returns a pair of null pointers. Thus from node (2) we return the unchanged subtree from (2), and a nullptr. Propagating up, node (3) sets its left child to null, and returns the subtree from (2), and the subtree at (3) itself (which is now just two elements, (3) and (6).) Finally, at node (4) we set the right subchild to (2), and return the tree at (4) (which now has four elements, as required) and the two-element tree rooted at (3).

Next a call is made to split(a,2), where a is the first, four-element, tree from the last call.

Again, the root (4) has no left child, so we recurse with split(t->r, 1).

The node (2) has a left subtree with size 2, so it calls split(t->l, 1).

The node (5) has no left child, so it calls split(t->r, 0).

At the leaf (1), 0 <= size(t->l) is vacuously true: it gets a pair of null pointers from split(t->l, 0) and returns a pair(null, (1)).

Up at (5), we set the right child to null, and return a pair((5), (1)).

Up at (2), we set the left child to (1), and return a pair((5), (2)->(1)).

Finally, at (4), we set the right child to (5), and return a pair((4)->(5), (2)->(1)).

Move 1,2

Finally the interval [2,3] (consisting of the elements 2 and 4) is moved: Move 2,4

Finally the nodes are popped in order, yielding 2, 4, 1, 5, 3, 6.

Perhaps you'd like to see the tree states given different input. I put a copy of the treap code, "instrumented" to produce the pictures, on GitHub. When run, it produces a file trees.tex; then running pdflatex trees produces pictures like those above. (Or if you like, I'd be happy to make pictures for different input: that would be easier than installing a whole TeX distribution if you don't have it.)

Nick Matteo
  • 3,976
  • 20
  • 30
  • Thanks for answering. Questions: (1) at `insert 3` (same with insert 5), since 0.785 > 0.775, why is 3 not added as the right child of 2? and (2) at `move [3,4]`, how does the code know that `[3,4]` now corresponds with the elements `1 and 2`? – גלעד ברקן Jun 12 '16 at 19:16
  • @גלעדברקן: Re (1): Bigger numbers go higher in the tree. Since 0.785 is bigger, 3 gets to be the parent. When 5 is inserted, its key is smaller than 4's so 5 becomes a child. (2) It splits the tree into the first 2 elements, the 3rd through 4th elements, and the 5th through nth. The values of the nodes don't matter. – Nick Matteo Jun 12 '16 at 19:20
  • regarding (1) duh, i was thinking min heap. regarding (2) i still don't understand why the program would associate the nodes labeled 1 and 2 with the current 3rd and 4th elements - could you explain? – גלעד ברקן Jun 12 '16 at 19:25
  • @גלעדברקן: The `split` function splits the treap into two parts, where the first has the given number of elements, using the total ordering which described at the top of my post. I went into the operation in that particular case in some depth in the answer. (But don't get hung up on the labels: they're just whatever happens to be in the input file.) – Nick Matteo Jun 12 '16 at 20:33
  • Thanks a lot! Now I get it. – גלעד ברקן Jun 12 '16 at 21:16
  • @גלעדברקן Any ideas on this tree problem:- https://stackoverflow.com/questions/58256987/how-to-update-nodes-of-a-tree-multiple-times – Firex Firexo Oct 06 '19 at 13:05