7

I'm looking for a data structure (array-like) that allows fast (faster than O(N)) arbitrary insertion of values into the structure. The data structure must be able to print out its elements in the way they were inserted. This is similar to something like List.Insert() (which is too slow as it has to shift every element over), except I don't need random access or deletion. Insertion will always be within the size of the 'array'. All values are unique. No other operations are needed.

For example, suppose Insert(x, i) inserts value x at index i (0-indexed). Then:

  • Insert(1, 0) gives {1}
  • Insert(3, 1) gives {1,3}
  • Insert(2, 1) gives {1,2,3}
  • Insert(5, 0) gives {5,1,2,3}

And it'll need to be able to print out {5,1,2,3} at the end.

I am using C++.

Ved
Peter

6 Answers

9

Use a skip list. Another option is a tiered vector. A skip list performs inserts in expected O(log(n)) time and keeps the elements in order. A tiered vector supports inserts in O(sqrt(n)) and can likewise print the elements in order.

EDIT: per amit's comment, I will explain how you find the k-th element in a skip list:

Each element has a tower of links to later elements, and each link records how many elements it jumps over. To find the k-th element, start at the head of the list and go down the tower until you find a link that jumps over no more than k elements. Follow that link to the node it points to and decrease k by the number of elements you jumped over. Repeat until k = 0.
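To make the link-width idea concrete, here is a minimal sketch of an indexable skip list that supports insertion at a position. It is an illustration under assumptions, not the implementation mentioned in this answer: the names `IndexedSkipList`, `insert`, `to_vector` and `skip_list_example`, the level cap of 20, the fixed seed and the coin-flip choice of node height are all choices made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Sketch of an indexable skip list: every link stores its width, i.e. how
// many level-0 steps it covers, so a position can be located without keys.
class IndexedSkipList {
    static constexpr int kMaxLevel = 20;  // assumed cap on tower height

    struct Node {
        int value;
        std::vector<Node*> next;         // next[l]: successor on level l
        std::vector<std::size_t> width;  // width[l]: level-0 steps covered by next[l]
        Node(int v, int levels)
            : value(v), next(levels, nullptr), width(levels, 0) {}
    };

    Node* head_;               // sentinel placed before index 0
    std::size_t size_ = 0;
    std::mt19937 rng_{12345};  // fixed seed, only to keep runs reproducible

    // Height chosen by repeated coin flips (an assumption of this sketch).
    int random_level() {
        int lvl = 1;
        while (lvl < kMaxLevel && (rng_() & 1u)) ++lvl;
        return lvl;
    }

public:
    IndexedSkipList() : head_(new Node(0, kMaxLevel)) {}
    IndexedSkipList(const IndexedSkipList&) = delete;
    IndexedSkipList& operator=(const IndexedSkipList&) = delete;
    ~IndexedSkipList() {
        for (Node* n = head_; n != nullptr;) {
            Node* following = n->next[0];
            delete n;
            n = following;
        }
    }

    // Inserts value so that it ends up at index i (0 <= i <= size).
    void insert(int value, std::size_t i) {
        assert(i <= size_);
        Node* update[kMaxLevel];        // last node at or before position i, per level
        std::size_t pos_at[kMaxLevel];  // position of update[l]; the head is position 0
        Node* cur = head_;
        std::size_t pos = 0;
        for (int l = kMaxLevel - 1; l >= 0; --l) {
            // Follow links as long as they do not jump past position i.
            while (cur->next[l] != nullptr && pos + cur->width[l] <= i) {
                pos += cur->width[l];
                cur = cur->next[l];
            }
            update[l] = cur;
            pos_at[l] = pos;
        }
        const int lvl = random_level();
        Node* node = new Node(value, lvl);
        for (int l = 0; l < kMaxLevel; ++l) {
            Node* prev = update[l];
            if (l < lvl) {
                // Splice the new node in and split the old link's width.
                node->next[l] = prev->next[l];
                node->width[l] = node->next[l] ? pos_at[l] + prev->width[l] - i : 0;
                prev->next[l] = node;
                prev->width[l] = i + 1 - pos_at[l];
            } else if (prev->next[l] != nullptr) {
                ++prev->width[l];  // this link now jumps over one more element
            }
        }
        ++size_;
    }

    // Walks level 0 to list the elements in positional order.
    std::vector<int> to_vector() const {
        std::vector<int> out;
        out.reserve(size_);
        for (Node* n = head_->next[0]; n != nullptr; n = n->next[0])
            out.push_back(n->value);
        return out;
    }
};

// Replays the four inserts from the question.
std::vector<int> skip_list_example() {
    IndexedSkipList s;
    s.insert(1, 0);
    s.insert(3, 1);
    s.insert(2, 1);
    s.insert(5, 0);
    return s.to_vector();
}
```

Only the links along the search path are touched on insert, which is why no indices in the tail of the list need renumbering.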

Ivaylo Strandjev
  • 1
    I was also thinking along the lines of a skip list. Can you please elaborate on how you modify the access-linked lists [those that guarantee the `O(logn)` search] after inserting an element at an arbitrary location? Won't it require changing a lot of them? I believe it [skip-list] can be modified to fit here, but this point should be elaborated IMO – amit Apr 06 '12 at 12:39
  • No. In fact, the way I implemented a skip list a while ago, you never change the height of a node. This relies on the fact that if you insert each new node with a randomly chosen height, the heights of the elements will be close enough to the perfect ones. There was some analysis on the internet of the amortized complexity of this approach showing that it is not much worse than the best possible. – Ivaylo Strandjev Apr 06 '12 at 12:43
  • What I do not understand is how to maintain not the heights but the indices: how can you tell that an element is the k-th? If your "keys" are the indices, won't each arbitrary insertion require changing the entire tail of the linked list? [it's not the height that worries me, using a non-deterministic linked list solves this issue neatly] – amit Apr 06 '12 at 12:46
  • 3
    For each element you have a tower of links to next elements, right? For each link you know how many elements it jumps over. So, looking for the k-th element, you start at the head of the list and go down the tower until you find a link that jumps over no more than k elements. You go to the new node and decrease k by the number of elements you have jumped over. Continue doing that until k = 0. – Ivaylo Strandjev Apr 06 '12 at 12:50
  • Great, that explains it perfectly. +1. I suggest adding this explanation to the answer itself. [actually, I now feel dumb for asking, it is very similar to the idea of maintaining an index in a BST by adding a "numberOfSons" field to each node] – amit Apr 06 '12 at 12:51
1

Did you consider using std::map or std::vector?

You could use a std::map with the rank of insertion as key. And vector has a reserve member function.

Basile Starynkevitch
  • 1
    The OP wants faster than linear arbitrary insert; won't vector and map both be O(n)? – amit Apr 06 '12 at 12:38
  • Yes, `std::vector` insertion at position `i` will be O(`n`) because the elements `i` through `n` need to be shifted. With `std::map`, something similar occurs because keys have to be updated. – Fred Foo Apr 06 '12 at 12:40
  • @Yavar: But you will have to modify the indices of all the following elements after each insert. Assume you had the map=[(1,a),(2,b),(3,c)] and you want to add z in location 0; you will need to modify the map to [(1,z),(2,a),(3,b),(4,c)]. If there is a workaround, it should be elaborated. – amit Apr 06 '12 at 12:44
  • @juanchopanza: yes, but it enforces unique keys. You need to do extra work to allow multiple insertions at the same index without wiping previous elements. – Fred Foo Apr 06 '12 at 13:11
1

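You can use a std::map that maps (index, insertion-time) pairs to values, where insertion-time is an "autoincrement" integer (in SQL terms). The ordering on the pairs should be

(i, t) < (i*, t*)

iff

i < i* or (i = i* and t > t*)

In code:

#include <cstddef>
#include <map>
#include <utility>

struct lt {
    // Lexicographic order: index ascending, then insertion time descending,
    // so the most recent insert at a given index sorts first.
    bool operator()(std::pair<size_t, size_t> const &x,
                    std::pair<size_t, size_t> const &y) const
    {
        return x.first < y.first
            || (x.first == y.first && x.second > y.second);
    }
};

typedef std::map<std::pair<size_t, size_t>, int, lt> array_like;

void insert(array_like &a, int value, size_t i)
{
    // a.size() serves as the insertion-time counter.
    a[std::make_pair(i, a.size())] = value;
}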
Fred Foo
  • Suppose we insert 300 at 0, then 100 at 0, then 200 at 1. What should happen: `[]` then `[300]`, then `[100 300]`, then `[100 200 300]`. But what actually happens: `[]`, then `[((0, 1), 300)]`, then `[((0, 2), 100), ((0, 1), 300)]`, so far so good, but then `[((0, 2), 100), ((0, 1), 300), ((1, 3), 200)]`. The conclusion: without order statistics, this type of thing is usually hard to do. – Evgeni Sergeev May 03 '16 at 13:11
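The trace in the comment above can be reproduced with a short program. This is a sketch under assumptions: `IndexThenNewest`, `PairKeyed`, `insert_keyed`, `contents` and `counterexample` are hypothetical names, the comparator orders by index ascending and then by insertion time descending, and insertion times start at 0 here rather than 1.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Orders (index, insertion-time) keys by index ascending, then by
// insertion time descending; this is a valid strict weak ordering.
struct IndexThenNewest {
    bool operator()(const std::pair<std::size_t, std::size_t>& x,
                    const std::pair<std::size_t, std::size_t>& y) const {
        if (x.first != y.first) return x.first < y.first;
        return x.second > y.second;
    }
};

typedef std::map<std::pair<std::size_t, std::size_t>, int, IndexThenNewest>
    PairKeyed;

// Stores value under (requested index, insertion counter).
void insert_keyed(PairKeyed& a, int value, std::size_t i) {
    const std::size_t t = a.size();  // insertion counter, starting at 0
    a[std::make_pair(i, t)] = value;
}

// Returns the stored values in key order.
std::vector<int> contents(const PairKeyed& a) {
    std::vector<int> out;
    for (PairKeyed::const_iterator it = a.begin(); it != a.end(); ++it)
        out.push_back(it->second);
    return out;
}

// Insert 300 at 0, then 100 at 0, then 200 at 1.
// Positional insertion would give 100 200 300.
std::vector<int> counterexample() {
    PairKeyed a;
    insert_keyed(a, 300, 0);
    insert_keyed(a, 100, 0);
    insert_keyed(a, 200, 1);
    return contents(a);  // yields 100 300 200
}
```

The key (1, 2) sorts after every key whose index is 0, so 200 lands behind 300 even though position 1 lies between 100 and 300. The requested index is frozen into the key at insertion time, while the real position of earlier elements keeps shifting.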
1

Regarding your comment:

List.Insert() (which is too slow as it has to shift every element over),

Linked lists don't shift their values; they iterate over them to find the location where you want to insert. Be careful what you say, as this can be confusing to newbies like me.

nndhawan
0

A solution that's included with GCC by default is the rope data structure. Here is the documentation. Typically, ropes come to mind when working with long strings of characters. Here we have ints instead of characters, but it works the same. Just use int as the template parameter. (Could also be pairs, etc.)

Here's the description of rope on Wikipedia.

Basically, it's a binary tree that maintains how many elements are in the left and right subtrees (or equivalent information, which is what's referred to as order statistics), and these counts are updated appropriately as subtrees are rotated when elements are inserted and removed. This allows O(lg n) operations.
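As a portable illustration of the subtree-count idea, here is a minimal sketch of an implicit treap. Note that this is a different structure from the GCC rope: it is a randomized binary tree that uses split and merge instead of rotations, chosen here only because it fits in a few dozen lines of standard C++. The names `ImplicitTreap`, `insert`, `to_vector` and `treap_example` and the fixed seed are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Implicit treap: nodes store subtree sizes instead of keys, so
// "position i" is found by counting elements rather than comparing keys.
class ImplicitTreap {
    struct Node {
        int value;
        unsigned priority;     // random heap priority keeps the tree balanced
        std::size_t size = 1;  // number of nodes in this subtree
        Node* left = nullptr;
        Node* right = nullptr;
        Node(int v, unsigned p) : value(v), priority(p) {}
    };

    Node* root_ = nullptr;
    std::mt19937 rng_{2012};  // fixed seed, only to keep runs reproducible

    static std::size_t size_of(const Node* n) { return n ? n->size : 0; }
    static void update(Node* n) {
        n->size = 1 + size_of(n->left) + size_of(n->right);
    }

    // Splits t into a (its first k elements) and b (the rest).
    static void split(Node* t, std::size_t k, Node*& a, Node*& b) {
        if (!t) { a = b = nullptr; return; }
        if (size_of(t->left) >= k) {
            split(t->left, k, a, t->left);
            b = t;
        } else {
            split(t->right, k - size_of(t->left) - 1, t->right, b);
            a = t;
        }
        update(t);
    }

    // Concatenates a and b; every element of a precedes every element of b.
    static Node* merge(Node* a, Node* b) {
        if (!a) return b;
        if (!b) return a;
        if (a->priority > b->priority) {
            a->right = merge(a->right, b);
            update(a);
            return a;
        }
        b->left = merge(a, b->left);
        update(b);
        return b;
    }

    static void collect(const Node* n, std::vector<int>& out) {
        if (!n) return;
        collect(n->left, out);
        out.push_back(n->value);
        collect(n->right, out);
    }

    static void destroy(Node* n) {
        if (!n) return;
        destroy(n->left);
        destroy(n->right);
        delete n;
    }

public:
    ImplicitTreap() = default;
    ImplicitTreap(const ImplicitTreap&) = delete;
    ImplicitTreap& operator=(const ImplicitTreap&) = delete;
    ~ImplicitTreap() { destroy(root_); }

    // Inserts value so that it ends up at index i (0 <= i <= size).
    void insert(int value, std::size_t i) {
        assert(i <= size_of(root_));
        Node* a = nullptr;
        Node* b = nullptr;
        split(root_, i, a, b);
        root_ = merge(merge(a, new Node(value, rng_())), b);
    }

    // In-order traversal yields the elements in positional order.
    std::vector<int> to_vector() const {
        std::vector<int> out;
        collect(root_, out);
        return out;
    }
};

// Replays the four inserts from the question.
std::vector<int> treap_example() {
    ImplicitTreap t;
    t.insert(1, 0);
    t.insert(3, 1);
    t.insert(2, 1);
    t.insert(5, 0);
    return t.to_vector();
}
```

Each insert is one split and two merges, all of which run in expected O(lg n) time, and printing is a single in-order walk.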

Evgeni Sergeev
-1

In C++ you can just use a map of vectors, like so:

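#include <iostream>
#include <map>
#include <vector>
using namespace std;

int main() {
  // Key: requested index; value: elements inserted at that index, oldest first.
  map<int, vector<int> > data;
  data[0].push_back(1);
  data[1].push_back(3);
  data[1].push_back(2);
  data[0].push_back(5);
  map<int, vector<int> >::iterator it;
  for (it = data.begin(); it != data.end(); ++it) {
    const vector<int> &v = it->second;
    // Print each bucket newest first.
    for (int i = static_cast<int>(v.size()) - 1; i >= 0; i--) {
      cout << v[i] << ' ';
    }
  }
  cout << '\n';
}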

This prints:

5 1 2 3 

Just like you want, and inserts are O(log n).

Running Wild