
"Therefore, inserting N elements takes O(N) work total. Each insertion is O(1) on average, even though some insertions take O(N) time in the worst case." This quote is found in Cracking the Coding Interview. I mostly understand this statement, even though one little thing about it is irking me. The amortized insertion is O(1) on good days. This simply means that when the resizable array doesn't need to be resized, inserting something is simply O(1). That is clear. But on a bad day, when we run out of space, we would need O(N) to insert that extra element. However, I don't agree with the statement above when it says some insertions take O(N) in the worst case. Shouldn't it say ONE insertion takes O(N) in the worst case?

To make this more clear, here's an example of what I'm saying.

Say we have a resizable array, but that array is size 4. Now say we insert 5 elements: the first four insertions each cost O(1), but once we get to the last element, we would have to copy all the existing elements into a new array, and that insertion would cost O(N).

Can someone please clarify this for me? I don't understand why the book says some cases would take O(N), when we only need to copy all the elements into a new array one time, when we run out of space in our old array.

  • each time you need to resize your array, it will cost O(N). – Stargateur Dec 15 '16 at 02:01
  • See this answer on [Constant Amortized Time](http://stackoverflow.com/questions/200384/constant-amortized-time) – Brian Rodriguez Dec 15 '16 at 02:01
  • I don't get what's the difference with the quote and your understanding...isn't the book saying exact same thing: Insertion can be O(N) in the worst case – shole Dec 15 '16 at 02:04
  • The book is saying that inserting N elements is O(N) in the *average* case. Or in other words, each element is O(1) *on average*. To insert 5 elements you have O(1), O(1), O(1), O(1), O(6), but that's the same as if each one was O(2). And actually, no matter how many you insert, it always averages out to less than O(3) each, which is the same as O(1). (Big-O-notation doesn't work that way but it might be helpful for intuition). – user253751 Dec 15 '16 at 02:25

4 Answers


I think it's better to understand the statement this way.

At first, the array size is just 1, and we insert one element. Now the array is full! We have to resize it to twice the previous size.

Next, the array size is 2. Let this process continue. You can easily see that the sizes at which we have to resize the array are 1, 2, 4, 8, 16, 32, ..., 2^r.

I will give you two questions.

  1. How many times do we have to resize the array?
  2. What is the total cost of the first N (N >= 0) insertions?

The answer to the first question is floor(lg N) times; you can figure that out easily, I think. Once you have the first answer, calculating the total resize cost over these N insertions, which is the second answer, is pretty easy:

1 + 2 + 4 + 8 + 16 + ... + 2^(floor(lg N)) = 2^(floor(lg N) + 1) - 1 => O(N)

To get the average cost of each step, divide the total cost by N => O(1)

I think the worst case the reference mentions is when the array needs to be resized. The cost of this readjustment is proportional to the number of elements in the array, O(N).

Daniel kim

Let's split up all the insertions into "heavy" insertions that take time proportional to the number of elements and "light" insertions that only take a constant amount of time to complete. Then if you start with an empty list and keep appending and appending, you're going to have mostly light insertions, but every now and then you'll have a heavy insertion.

Let's say, for simplicity, that you double the size of the array every time you run out of space and that you start off with an array of size 4. Then the first resize will have to move four elements, the second will move eight, then sixteen, then thirty-two, then sixty-four, then 128, then 256, etc.

Notice that it's not just one single append that takes a long time. Roughly speaking, if you have n total insertions, then about log n of them will be heavy (the gap between them keeps doubling) and the remaining n - log n of them will be light.

templatetypedef

Think of the cost of N insertions into a resizable array as (I will use tilde notation here):

  • cost of N insertions = cost of new element insertions + cost of resizes

Cost of new element insertions

This is simply the cost of inserting a new element into the array, multiplied by the number of times you insert one, i.e., N:

  • cost of new element insertions = 1 * N

Cost of resizes

Imagine you end up with a 64-cell array. That means the array has been resized as follows:

  • array size = 1 -> 2 -> 4 -> 8 -> 16 -> 32 -> 64
  • #resizes done = 6

The 64-cell array has been resized 6 times, i.e., resizing happens log2(64) times. In general, we now know that for N insertions, we will perform log2(N) resize operations.

But what do we do during each resize? We copy the elements already present in the array into the new, resized array. At resize i, how many elements do we copy? 2^(i-1). With the previous example:

  • resize number 1 = 1 -> 2: copy 1 elements
  • resize number 2 = 2 -> 4: copy 2 elements
  • resize number 3 = 4 -> 8: copy 4 elements
  • ......
  • resize number 6 = 32 -> 64: copy 32 elements

So:

  • cost of resizes = sum(from i = 1 to log2(N)) 2 * 2^(i-1) = 2(N - 1), counting two array accesses per copied element (one read from the old array, one write to the new one)

Conclusion

  • cost of N insertions = cost of new element insertions + cost of resizes = N + 2(N-1) ~ 3N
igol

std::vector will keep all its elements next to each other in memory for fast iteration - that's its thing, that's what it does, that's why everyone loves std::vector. Typically it reserves a bit more space than it currently needs for the elements it contains, so when you add a new element to the end, vector can quickly slot it into that spare capacity.

However, when vector doesn't have space to expand, it can't just leave its existing elements where they are and start a new list somewhere else - all the elements MUST be next to each other in memory! So it must find a free chunk of memory that's big enough for all the elements plus your new one, then copy all the existing elements over there, then add your new element to the end.

If it takes 1 unit of time to add 1 element, it takes N units of time to move N elements, broadly speaking. If you add a new element, that's one operation. If you add a new element and 1024 existing elements need to be relocated, that's 1025 operations. So how long the reallocation takes is proportional to the size of the vector, hence O(N).

Jack Deeth