
So I'm teaching myself AVL trees and I understand the basic idea behind them, but I just want to make sure my intuition about actually implementing one is valid:

I'll walk through it using the left-heavy case (which is fixed by a right rotation):

So, the following situation is simple:

      8
     / \
    7   10
   /
  6
 /
3

When we add the 3, the tree rebalances itself to:

    8
   / \
  6   10
 / \
3   7

But is the rotation based on the addition of the 3 or the imbalance of the subtree rooted at 7? Is it even based on the imbalance of the tree rooted at 8?

The following example is where things get a bit hairy, in my opinion:

      9
     / \
    7   10
   / \
  6   8
 /
3

So, in this case, the subtree at 7 is fine when the 3 is added, so that subtree doesn't need to rotate. However, the tree at 9 is imbalanced with the addition of 3, so we base the rotation at 9. We get:

      7
     / \
    6   9
   /   / \
  3   8   10

So when I write my code, which I will do quite soon, would the following approach, which starts from small subtrees and works up to bigger ones, do the trick?

pseudocode:

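function balanceTree(Node n){

  if (n is not null){

    // rebalance both subtrees first, so we work from the bottom up
    balanceTree(n.rightchild);
    balanceTree(n.leftchild);

    // check n inside the null test, so a null node is never dereferenced
    if (abs(balanceFactor(n))>1){

      rotateAsNeeded(n);// rotate based on balance factor

    }
  }

}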

Thanks in advance!

templatetypedef
Skorpius
  • Also consider *[AA trees](http://en.wikipedia.org/wiki/AA_tree)*, which have performance similar to red-black trees, but IMO rather simpler code than RB or AVL – James Waldby - jwpat7 Jun 21 '13 at 19:45
  • (Or a splay tree, which has excellent performance and is easier than both of them. Or a treap, which is even easier to code up.) :-) – templatetypedef Jun 21 '13 at 19:47
  • (The way I remember AVL was not to have a `balanceTree` operation, but to keep the tree balanced on updates, which may well be what your `if (outOfBalance) rotateAsNeeded()` does.) – greybeard Nov 02 '15 at 08:01
  • The rotations are indeed dependent on the first node that gets unbalanced due to a new insertion. – goelakash Nov 05 '15 at 21:55
  • You'd be rebalancing the whole tree.... I think you should pass the inserted node, then work your way up the parent chain from it. – Stijn de Witt Jan 16 '17 at 11:45

1 Answer

The pseudocode that you've posted will correctly balance a tree. That said, it is too inefficient to be practical. Notice that you're recursively exploring the entire tree looking for places to rebalance, which makes every insertion and deletion take O(n) time and eats away all the efficiency gains of having a balanced tree.

The idea behind AVL trees is that globally rebalancing the tree can be done by iteratively applying local rotations. In other words, when you do an insertion or deletion and need to do tree rotations, those rotations won't appear in random spots in the tree. They'll always appear along the access path you took when inserting or deleting the node.

For example, you were curious about inserting the value 3 into this tree:

      9
     / \
    7   10
   / \
  6   8

Let's start off by writing out the balance factor associated with each node, that is, the height of its left subtree minus the height of its right subtree (it's critical that AVL tree nodes store this information, since it's what makes it possible to do insertions and deletions efficiently):

           9(+1)
         /       \
       7 (0)    10 (0)
      / \
  6(0)   8(0)

So now let's see what happens when we insert 3. This places the 3 here:

           9(+1?)
          /       \
        7 (0?)    10 (0)
       /   \
   6(0?)   8(0)
   /
 3(0)

Notice that I've marked all nodes on the access path with a ?, since we're no longer sure what their balance factors are. Since we inserted a new child for 6, this changes the balance factor for the 6 node to +1:

           9(+1?)
          /       \
        7 (0?)    10 (0)
       /   \
   6(+1)   8(0)
   /
 3(0)

Similarly, the left subtree of 7 grew in height, so its balance factor should be incremented:

           9(+1?)
          /       \
        7 (+1)    10 (0)
       /   \
   6(+1)   8(0)
   /
 3(0)

Finally, 9's left subtree grew by one, which gives this:

           9(+2!)
          /       \
        7 (+1)    10 (0)
       /   \
   6(+1)   8(0)
   /
 3(0)

And here we find that 9 has a balance factor of +2, which means that we need to do a rotation. Consulting Wikipedia's great table of all AVL tree rotations, we can see that we're in the case where we have a balance factor of +2 where the left child has a balance factor of +1. This means that we do a right rotation and pull the 7 above the 9, as shown here:

        7(0)
       /   \
   6(+1)     9(0)
   /       /   \
 3(0)    8(0)   10 (0)

Et voilà! The tree is now balanced.
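As a rough illustration of that single right rotation, here is a minimal Python sketch. The `Node` class and the `height`, `balance_factor`, and `rotate_right` helpers are names made up for this example, and the balance factor is recomputed from subtree heights for clarity only; a real AVL tree would store it in each node instead.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    # An empty tree has height 0; a single node has height 1.
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    # Same sign convention as the diagrams: positive means left-heavy.
    return height(node.left) - height(node.right)


def rotate_right(root):
    # Pull the left child above root. The pivot's old right subtree
    # becomes root's new left subtree, which preserves the BST ordering.
    pivot = root.left
    root.left = pivot.right
    pivot.right = root
    return pivot


# The tree from the walkthrough, right after inserting 3.
tree = Node(9, Node(7, Node(6, Node(3)), Node(8)), Node(10))
unbalanced_factor = balance_factor(tree)   # +2 at node 9
new_root = rotate_right(tree)              # 7 is pulled above 9
```

After the rotation, `new_root` is the 7 node, the 8 node has moved under 9, and the balance factor at the new root is 0, matching the final diagram.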

Notice that when we did this fixup procedure, we didn't have to look over the entire tree. Instead, all we needed to do was look along the access path and check each node there. Typically, when implementing an AVL tree, your insertion procedure will do the following:

  • If the tree is null:
    • Insert the node with balance factor 0.
    • Return that the tree height has increased by 1.
  • Otherwise:
    • If the value to insert matches the current node, do nothing.
    • Otherwise, recursively insert the node into the proper subtree and get the amount that the tree height has changed by.
    • Update the balance factor of this node based on the amount that the subtree height changed.
    • If this mandates a series of rotations, perform them.
    • Return the resulting change in the height of this tree.
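The steps above can be sketched in Python roughly as follows. This is one possible shape under stated assumptions, not a definitive implementation: the names `Node`, `bf`, `insert`, `fix_left_heavy`, and `fix_right_heavy` are invented for this example, `bf` stores height(left) minus height(right) to match the diagrams, and `insert` returns a flag saying whether the subtree grew by one level.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.bf = 0  # height(left) - height(right)


def rotate_right(n):
    p = n.left
    n.left, p.right = p.right, n
    return p


def rotate_left(n):
    p = n.right
    n.right, p.left = p.left, n
    return p


def fix_left_heavy(n):
    # n.bf reached +2 after an insertion into n.left.
    c = n.left
    if c.bf == 1:                     # left-left: single right rotation
        n.bf = c.bf = 0
        return rotate_right(n)
    g = c.right                       # left-right: double rotation
    n.bf = -1 if g.bf == 1 else 0
    c.bf = 1 if g.bf == -1 else 0
    g.bf = 0
    n.left = rotate_left(c)
    return rotate_right(n)


def fix_right_heavy(n):
    # Mirror image: n.bf reached -2 after an insertion into n.right.
    c = n.right
    if c.bf == -1:                    # right-right: single left rotation
        n.bf = c.bf = 0
        return rotate_left(n)
    g = c.left                        # right-left: double rotation
    n.bf = 1 if g.bf == -1 else 0
    c.bf = -1 if g.bf == 1 else 0
    g.bf = 0
    n.right = rotate_right(c)
    return rotate_left(n)


def insert(node, key):
    """Return (subtree_root, grew); grew is True if the height rose by 1."""
    if node is None:
        return Node(key), True
    if key == node.key:
        return node, False            # duplicate: do nothing
    if key < node.key:
        node.left, grew = insert(node.left, key)
        if not grew:
            return node, False
        node.bf += 1
        if node.bf == 2:
            return fix_left_heavy(node), False
    else:
        node.right, grew = insert(node.right, key)
        if not grew:
            return node, False
        node.bf -= 1
        if node.bf == -2:
            return fix_right_heavy(node), False
    # The subtree got taller only if this node now leans to one side.
    return node, node.bf != 0


# Rebuild the example: the last insertion (3) triggers the rotation at 9.
root = None
for key in [9, 7, 10, 6, 8, 3]:
    root, _ = insert(root, key)
```

Each recursive call only touches the node it was handed, so the work is confined to the access path. After the loop, `root` is the 7 node with 6 and 9 as children, as in the final diagram.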

Since all these operations are local, the total work done is based purely on the length of the access path, which in this case is O(log n) because AVL trees are always balanced.

Hope this helps!


PS: Your initial example was this tree:

      8
     / \
    7   10
   /
  6
 /
3

Note that this tree isn't actually a legal AVL tree, since the balance factor of the root node is +2. If you consistently maintain tree balance using the AVL algorithm, you will never encounter this case.
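If you want to sanity-check a tree by hand, a small helper along these lines can confirm it. This is only a sketch: `check_avl` is a made-up name, trees are written as nested `(key, left, right)` tuples for brevity, and the function checks only the height condition, not the binary search tree ordering.

```python
def check_avl(tree):
    """Return (is_height_balanced, height) for a (key, left, right) tuple or None."""
    if tree is None:
        return True, 0
    _, left, right = tree
    left_ok, left_h = check_avl(left)
    right_ok, right_h = check_avl(right)
    # Every node must have subtree heights that differ by at most 1.
    balanced = abs(left_h - right_h) <= 1
    return left_ok and right_ok and balanced, 1 + max(left_h, right_h)


# The tree from the start of the question: 8 with the chain 7 -> 6 -> 3 on its left.
initial = (8, (7, (6, (3, None, None), None), None), (10, None, None))
# The tree used in the walkthrough, before 3 is inserted.
valid = (9, (7, (6, None, None), (8, None, None)), (10, None, None))

print(check_avl(initial)[0])  # False
print(check_avl(valid)[0])    # True
```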

templatetypedef
  • Ok great- so a question from all of this: To navigate through the access paths, would it be wise to use a kind of "parent" instance variable for nodes? For example: Node "3" would have Node "6" as a parent instance variable. Or is there a better way to go about finding these paths? – Skorpius Jun 21 '13 at 22:45
  • @user1840804- That's actually not necessary. If you implement insertion recursively, then each stack frame can do its own local processing on the nodes along the access path. You can put this pointer in if you'd like, though. – templatetypedef Jun 21 '13 at 22:47