14

Ok, this is another one in the theory realm for the CS guys around.

In the 90's I did fairly well in implementing BST's. The only thing I could never get my head around was the intricacy of the algorithm to balance a Binary Tree (AVL).

Can you guys help me on this?

Gustavo Carreno
  • 8,809
  • 13
  • 42
  • 72
  • 1
    Do you want the tree *perfectly* balanced? The most common algorithms guarantee that a tree is somewhat balanced. For instance, red-black trees guarantee that the depth of deepest leaf node is no more than twice the depth of the shallowest leaf node – Adam Tegen Sep 25 '08 at 14:23
  • Also, are you looking for an algorithm takes a tree and balances it, or one that balances as part of the tree operations, like insert, delete, etc. – Adam Tegen Sep 25 '08 at 14:24
  • “perfectly” must of course be defined. However, in the context of binary trees the only meaningful definition is that of a binary tree with logarithmic height, isn't it? – Konrad Rudolph Sep 25 '08 at 14:28
  • 1
    Are you asking people to vote for their favourite balanced tree algorithm? Doesn't the answer depend on the application (e.g. splay tree gives good performance where recent searches are repeated, AVL is perfectly balanced, red-black gives the prettiest explanatory diagrams, etc.) – Steve Jessop Sep 25 '08 at 14:29
  • 1
    Sorry, typo. I meant "red-black has good worst case performance for inserts" – Steve Jessop Sep 25 '08 at 14:35
  • Just thought of something - are you looking for algorithms to keep a tree balanced as insertions and deletions occur (which is what we've all been talking about)? Or do you just mean that you have built an unbalanced tree, and you want to balance it as a one-off operation? – Steve Jessop Sep 25 '08 at 15:09
  • @onebyone: An easy typo typo to make. :-D – Konrad Rudolph Sep 25 '08 at 15:19
  • I hope this edit clarifies things. – Gustavo Carreno Sep 25 '08 at 17:38

7 Answers7

16

I don't think it's good to post complete codes for node balancing algorithms here since they get quite big. However, the Wikipedia article on red-black trees contains a complete C implementation of the algorithm and the article on AVL trees has several links to high-quality implementations as well.

These two implementations are enough for most general-purpose scenarios.

Konrad Rudolph
  • 482,603
  • 120
  • 884
  • 1,141
  • I've unaccepted your answer because it does not explain, only links. My intent was to get a descriptive answer that would help. Thanks anyway! – Gustavo Carreno Sep 26 '08 at 17:04
  • Your question asked code, not explanations. Anyway, I didn't intend my answer to be accepted anyway (because it's not really an answer) … Your new question is much better! – Konrad Rudolph Sep 26 '08 at 17:36
15

A scapegoat tree possibly has the simplest balance-determination algorithm to understand. If any insertion causes the new node to be too deep, it finds a node around which to rebalance, by looking at weight balance rather than height balance. The rule for whether to rebalance on delete is also simple. It doesn't store any arcane information in the nodes. It's trickier to prove that it's correct, but you don't need that to understand the algorithm...

However, unlike an AVL it isn't height-balanced at all times. Like red-black its unbalance is bounded, but unlike red-black it's tunable with a parameter, so for most practical purposes it looks as balanced as you need it to be. I suspect that if you tune it too tightly, though, it ends up as bad or worse than AVL for worst-case insertions.

Response to question edit

I'll provide my personal path to understanding AVL trees.

First you have to understand what a tree rotation is, so ignore everything else you've ever heard the AVL algorithms and understand that. Get straight in your head which is a right rotation and which is a left rotation, and what each does to the tree, or else the descriptions of the precise methods will just confuse you.

Next, understand that the trick for balancing AVL trees is that each node records in it the difference between the height of its left and right subtrees. The definition of 'height balanced' is that this is between -1 and 1 inclusive for every node in the tree.

Next, understand that if you have added or removed a node, you may have unbalanced the tree. But you can only have changed the balance of nodes which are ancestors of the node you added or removed. So, what you're going to do is work your way back up the tree, using rotations to balance any unbalanced nodes you find, and updating their balance score, until the tree is balanced again.

The final part of understanding it is to look up in a decent reference the specific rotations used to rebalance each node you find: this is the "technique" of it as opposed to the high concept. You only have to remember the details while modifying AVL tree code or maybe during data structures exams. It's years since I last had AVL tree code so much as open in the debugger - implementations tend to get to a point where they work and then stay working. So I really do not remember. You can sort of work it out on a table using a few poker chips, but it's hard to be sure you've really got all the cases (there aren't many). Best just to look it up.

Then there's the business of translating it all into code.

I don't think that looking at code listings helps very much with any stage except the last, so ignore them. Even in the best case, where the code is clearly written, it will look like a textbook description of the process, but without the diagrams. In a more typical case it's a mess of C struct manipulation. So just stick to the books.

Steve Jessop
  • 257,525
  • 32
  • 431
  • 672
4

I've been doing some work with AVL trees lately.

The best help I was able to find on how to balance them was through searching google.

I just coded around this pseudo code (if you understand how to do rotations it is pretty easy to implement).

IF tree is right heavy
{
  IF tree's right subtree is left heavy
  {
     Perform Double Left rotation 
  }
  ELSE
  {
     Perform Single Left rotation
  }
}
ELSE IF tree is left heavy
{
  IF tree's left subtree is right heavy
  {
     Perform Double Right rotation
  }
  ELSE
  {
     Perform Single Right rotation
  }
}
mscccc
  • 2,082
  • 6
  • 23
  • 38
  • 1
    This part of AVL is fairly straightforward - what gets a bit trickier is updating the balance factors after the rotations. –  Mar 28 '09 at 11:26
1

I agree, a complete program would not be appropriate.

While classical AVL and RB tree use a deterministic approach, I would suggest to have a look at "Randomized binary search trees" that are less costly to keep balanced and guarantee a good balancing regardless the statistical distribution of the keys.

Remo.D
  • 15,217
  • 5
  • 41
  • 70
0

there's a complete implementation of a self balancing avl tree @ http://code.google.com/p/self-balancing-avl-tree/ . it also implements logarithmic time concatenate and split operations as well as a map/multimap collections

cos
  • 87
  • 1
0

The AVL tree is inefficient because you have to do potentially many rotations per insertion/deletion.

The Red-Black tree is probably a better alternative because insertions/deletions are much more efficient. This structure guarantees the longest path to a leaf is no more than twice the shortest path. So while less balanced than an AVL tree, the worst unbalanced cases are avoided.

If your tree will be read many times, and won't be altered after it's created, it may be worth the extra overhead for a fully-balanced AVL tree. Also the Red-Black tree requires one extra bit of storage for each node, which give the node's "color".

user20493
  • 5,244
  • 7
  • 31
  • 31
  • Personally speaking, I have never found an actual *explanation* of RB trees - merely listing of the rules. No one seems to understand the *why* of RB. OTOH, AVL was for me intuitive and straightforward to comprehend; and you should only write code you comprehend. –  Mar 28 '09 at 11:25
  • 1
    The "why": RB trees are a mapping of 2-3-4 trees onto binary trees, where red edges connect the portions of a split-up 2-3-4 tree node and black edges correspond to the edges in the original 2-3-4 tree. – Jeffrey Hantin Jul 09 '10 at 22:25
  • 2
    One thing, about the "potentially many rotations per insertion/deletion". AVL insertions only need one single rotation, or double rotation, to restore the balance. But yes, Delete might have up to O(log N) rotations. – otherchirps Oct 11 '10 at 07:38
  • IME, red-black trees are not very fast. Their chief virtue is that they give a low variability in operation times. For speed, a treap is nice - but they have a higher variability. http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/ – user1277476 Aug 10 '12 at 17:58
0

For balancing an AVL Tree I just wrote a bunch of functions which I thought of sharing with everyone here. I guess this implementation is flawless. Queries/questions/criticism are of course welcome:

http://uploading.com/files/5883f1c7/AVL_Balance.cpp/

Being a novice to Stackoverflow, I tried posting my code here but was stuck with bad formatting issues. So, uploaded the file on the above link.

Cheers.

Rocky
  • 121
  • 1
  • 1
  • 3