Python AVL Trees

An AVL Tree (named after its inventors, Adelson-Velsky and Landis) is a self-balancing Binary Search Tree (BST). The main difference between a regular BST and an AVL Tree is that AVL Trees perform rotation operations to keep the tree balanced. By maintaining balance, an AVL Tree ensures a minimum possible height, which means that search, insert, and delete operations stay efficient even in the worst case.

The two trees below contain the same nodes, but the AVL tree has balanced itself, resulting in a much smaller height.

Unbalanced BST (Height: 7)

      B
       \
        G
       /
      E
       \
        K
       /
      F
       \
        P
       /
      I
       \
        M
    

Balanced AVL Tree (Height: 3)

        G
      /   \
     E     K
    / \   / \
   B   F I   P
           /
          M
    

The Balance Factor

A Binary Search Tree is considered balanced if, for every node, the heights of the left and right subtrees differ by at most 1 (so the difference can be -1, 0, or 1). This difference is called the Balance Factor.

The balance factor for a node X is calculated as:

BalanceFactor(X) = height(leftSubtree(X)) - height(rightSubtree(X))

A positive balance factor means the node is left-heavy; a negative one means it is right-heavy. (This left-minus-right convention matches the Python implementation later in this article; some texts use right minus left.) If any node's balance factor reaches 2 or -2, the tree is unbalanced at that node, and a rotation is required. Rebalancing immediately after every insertion or deletion prevents the factor from ever growing beyond that.
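As a minimal, self-contained sketch of computing a balance factor (the Node class and function names here are illustrative, not the article's final implementation; it uses the same left-minus-right convention as the implementation code below):

```python
class Node:
  def __init__(self, data, left=None, right=None):
    self.data = data
    self.left = left
    self.right = right

def height(node):
  # Height of an empty subtree counts as 0
  if node is None:
    return 0
  return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
  # Positive: left-heavy; negative: right-heavy
  return height(node.left) - height(node.right)

# A right-leaning chain: A -> B -> C
root = Node('A', right=Node('B', right=Node('C')))
print(balance_factor(root))  # -2, so this node is unbalanced
```

Recomputing heights recursively like this is O(n) per call; the real implementation below instead caches a height attribute on each node so the balance factor costs O(1).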


Left and Right Rotations

To restore balance, AVL Trees use rotations. A rotation is a simple operation that swaps nodes to decrease the tree's height while preserving the Binary Search Tree property (left child < parent < right child).

The diagram below shows the general idea of a right rotation. The Y node becomes the new root of this subtree, and the old root X becomes Y's right child. The subtree T2 moves from being Y's right child to becoming X's left child to maintain the BST property.

      X           Y
     / \         / \
    Y   T3  ->  T1  X
   / \             / \
  T1  T2          T2  T3
(Right Rotation on X)

A left rotation is the mirror image of this process.
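The rotation in the diagram can be sketched as a short, self-contained example (the Node class and function names are illustrative, not the article's final implementation). An in-order traversal before and after the rotation confirms that the sorted order, and therefore the BST property, is preserved:

```python
class Node:
  def __init__(self, data, left=None, right=None):
    self.data = data
    self.left = left
    self.right = right

def right_rotate(x):
  y = x.left
  t2 = y.right
  y.right = x   # the old root X becomes Y's right child
  x.left = t2   # T2 is re-attached as X's left child
  return y      # Y is the new root of the subtree

def in_order(node):
  if node is None:
    return []
  return in_order(node.left) + [node.data] + in_order(node.right)

# The subtree from the diagram: X with left child Y and subtrees T1, T2, T3
x = Node('X', left=Node('Y', left=Node('T1'), right=Node('T2')), right=Node('T3'))
before = in_order(x)
root = right_rotate(x)
print(root.data)                 # Y
print(in_order(root) == before)  # True: sorted order is preserved
```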


The Four "Out-of-Balance" Cases

When a tree becomes unbalanced after an insertion or deletion, it will match one of four cases. Each case requires a specific rotation to fix.

Left-Left (LL): The unbalanced node is left-heavy, and its left child is also left-heavy. Fix: a single right rotation on the unbalanced node.

Right-Right (RR): The unbalanced node is right-heavy, and its right child is also right-heavy. Fix: a single left rotation on the unbalanced node.

Left-Right (LR): The unbalanced node is left-heavy, but its left child is right-heavy. Fix: first a left rotation on the left child, then a right rotation on the unbalanced node.

Right-Left (RL): The unbalanced node is right-heavy, but its right child is left-heavy. Fix: first a right rotation on the right child, then a left rotation on the unbalanced node.

The Left-Left (LL) Case

This occurs when a node becomes left-heavy because of an insertion into its left child's left subtree. A single right rotation on the unbalanced node is enough to restore balance.

The Right-Right (RR) Case

This occurs when a node becomes right-heavy because of an insertion into its right child's right subtree. A single left rotation on the unbalanced node restores balance.

The Left-Right (LR) Case

This more complex case happens when a node is left-heavy, but its left child is right-heavy. It requires a double rotation: first a left rotation on the left child, which turns it into a Left-Left case, and then a right rotation on the original unbalanced node.

The Right-Left (RL) Case

This is the mirror of the LR case. The unbalanced node is right-heavy, but its right child is left-heavy. It requires a right rotation on the right child, followed by a left rotation on the original unbalanced node.
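The double rotation can be illustrated with a minimal sketch of the LR case (a bare-bones Node class and simplified rotations that skip the height bookkeeping of the full implementation below):

```python
class Node:
  def __init__(self, data, left=None, right=None):
    self.data = data
    self.left = left
    self.right = right

def right_rotate(x):
  y = x.left
  x.left, y.right = y.right, x
  return y

def left_rotate(x):
  y = x.right
  x.right, y.left = y.left, x
  return y

# Left-Right case: C is left-heavy, but its left child A is right-heavy
c = Node('C', left=Node('A', right=Node('B')))
c.left = left_rotate(c.left)  # step 1: turns the subtree into a Left-Left case
root = right_rotate(c)        # step 2: a single right rotation restores balance
print(root.data, root.left.data, root.right.data)  # B A C
```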


Retracing in AVL Trees

After an insertion or deletion, the algorithm must check for imbalances. This is done by retracing the path from the inserted/deleted node back up to the root. In a recursive implementation, this happens naturally as the function calls return.

At each ancestor node, the height is updated, and the balance factor is recalculated. The first unbalanced node encountered on this path is where the rotation is performed. This single rotation is often enough to restore balance to the entire tree.
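The retracing idea can be seen in a stripped-down sketch that performs a plain BST insert and only reports balance factors on the way back up (rotations are deliberately omitted here; the names are illustrative and the balance factor uses the same left-minus-right convention as the implementation below):

```python
class Node:
  def __init__(self, data):
    self.data = data
    self.left = None
    self.right = None
    self.height = 1

def height(node):
  return node.height if node else 0

def insert(node, data):
  if node is None:
    return Node(data)
  if data < node.data:
    node.left = insert(node.left, data)
  else:
    node.right = insert(node.right, data)
  # Retracing happens here, as the recursion unwinds: each ancestor
  # on the path refreshes its height and re-checks its balance factor
  node.height = 1 + max(height(node.left), height(node.right))
  balance = height(node.left) - height(node.right)
  print('retrace', node.data, 'balance', balance)
  return node

root = None
for value in [2, 1, 3, 4, 5]:
  root = insert(root, value)
# On the last insert, node 3 is the first unbalanced ancestor reported
# (balance -2); a real AVL tree would rotate there immediately.
```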


AVL Tree Implementation in Python

The implementation requires adding a height attribute to each node and functions to get height, get balance, and perform rotations. The insert and delete methods are then expanded to include the balancing logic.

Insertion

The following code builds a complete AVL tree from a list of letters. The insert function performs a standard BST insert and then retraces to update heights and perform rotations if necessary.

AVL Tree Insertion

class TreeNode:
  def __init__(self, data):
    self.data = data
    self.left = None
    self.right = None
    self.height = 1

def getHeight(node):
  if not node:
    return 0
  return node.height

def getBalance(node):
  if not node:
    return 0
  return getHeight(node.left) - getHeight(node.right)

def rightRotate(y):
  print('Rotate right on node', y.data)
  x = y.left
  T2 = x.right
  x.right = y
  y.left = T2
  y.height = 1 + max(getHeight(y.left), getHeight(y.right))
  x.height = 1 + max(getHeight(x.left), getHeight(x.right))
  return x

def leftRotate(x):
  print('Rotate left on node', x.data)
  y = x.right
  T2 = y.left
  y.left = x
  x.right = T2
  x.height = 1 + max(getHeight(x.left), getHeight(x.right))
  y.height = 1 + max(getHeight(y.left), getHeight(y.right))
  return y

def insert(node, data):
  if not node:
    return TreeNode(data)

  if data < node.data:
    node.left = insert(node.left, data)
  elif data > node.data:
    node.right = insert(node.right, data)

  # Update the height, then check the balance factor
  node.height = 1 + max(getHeight(node.left), getHeight(node.right))
  balance = getBalance(node)

  # Case 1: Left-Left
  if balance > 1 and data < node.left.data:
    return rightRotate(node)

  # Case 2: Right-Right
  if balance < -1 and data > node.right.data:
    return leftRotate(node)

  # Case 3: Left-Right
  if balance > 1 and data > node.left.data:
    node.left = leftRotate(node.left)
    return rightRotate(node)

  # Case 4: Right-Left
  if balance < -1 and data < node.right.data:
    node.right = rightRotate(node.right)
    return leftRotate(node)

  return node

def preOrder(node):
  if not node:
    return
  print(node.data, end=" ")
  preOrder(node.left)
  preOrder(node.right)

# Inserting nodes
root = None
letters = ['C', 'B', 'E', 'A', 'D', 'H', 'G', 'F']
for letter in letters:
  root = insert(root, letter)

print("Preorder traversal of the constructed AVL tree is:")
preOrder(root)

# Output:
# Rotate right on node H
# Preorder traversal of the constructed AVL tree is:
# C B A E D G F H

Deletion

Deletion also requires rebalancing. The logic is similar to insertion: after deleting the node (using standard BST deletion logic), the code retraces up the tree, updating heights and performing rotations on any unbalanced nodes.

AVL Tree Deletion

# (Assumes the TreeNode class and the helper functions from the insertion example above)

def minValueNode(node):
  # Find the in-order successor: the leftmost node of this subtree
  current = node
  while current.left is not None:
    current = current.left
  return current

def delete(root, key):
  # Step 1: Perform a standard BST delete
  if not root:
    return root
  elif key < root.data:
    root.left = delete(root.left, key)
  elif key > root.data:
    root.right = delete(root.right, key)
  else:
    if root.left is None:
      temp = root.right
      root = None
      return temp
    elif root.right is None:
      temp = root.left
      root = None
      return temp
    temp = minValueNode(root.right)
    root.data = temp.data
    root.right = delete(root.right, temp.data)

  # If the subtree became empty, there is nothing to rebalance
  if root is None:
    return root

  # Step 2: Update the height of the current node
  root.height = 1 + max(getHeight(root.left), getHeight(root.right))

  # Step 3: Get the balance factor
  balance = getBalance(root)

  # Step 4: Rebalance if needed
  # Left-Left
  if balance > 1 and getBalance(root.left) >= 0:
    return rightRotate(root)

  # Left-Right
  if balance > 1 and getBalance(root.left) < 0:
    root.left = leftRotate(root.left)
    return rightRotate(root)

  # Right-Right
  if balance < -1 and getBalance(root.right) <= 0:
    return leftRotate(root)

  # Right-Left
  if balance < -1 and getBalance(root.right) > 0:
    root.right = rightRotate(root.right)
    return leftRotate(root)

  return root


Time Complexity for AVL Trees

By keeping the tree's height to a minimum, AVL trees guarantee that search, insert, and delete all run in O(log n) time, even in the worst case.

This is a significant improvement over the standard BST, which can have a worst-case time complexity of O(n) if the tree becomes unbalanced.

Deriving O(log n)

The height h of an AVL tree with n nodes is always logarithmic with respect to n. The relationship can be expressed as h ≈ c * log₂(n), where c is a constant (approximately 1.44).

For Big O notation, we ignore constants, so the height h is in O(log n). Since the time complexity of the main operations is proportional to the height of the tree, the time complexity is O(h), which simplifies to O(log n). This logarithmic performance is what makes AVL trees and other self-balancing trees so powerful for managing large, dynamic datasets.
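This can be sketched numerically. The smallest AVL tree of height h has one subtree of height h-1 and one of height h-2 (the most lopsided shape the balance rule allows), giving a Fibonacci-like recurrence. Because the minimum node count grows exponentially with height, the height can only grow logarithmically with the node count (min_nodes is a hypothetical helper written for this sketch):

```python
def min_nodes(h):
  # Minimum number of nodes in an AVL tree of height h (counting edges):
  # the worst case is one subtree of height h-1 and one of height h-2
  if h == 0:
    return 1  # a single node
  if h == 1:
    return 2
  return min_nodes(h - 1) + min_nodes(h - 2) + 1

# Node count grows exponentially with height, so height grows
# only logarithmically with node count
for h in range(8):
  print(h, min_nodes(h))  # 1, 2, 4, 7, 12, 20, 33, 54
```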


Exercise

What is the guaranteed worst-case time complexity for search, insert, and delete operations in an AVL Tree?