An AVL tree is a self-balancing binary search tree in which the heights of the left and right subtrees of any node differ by at most 1.
AVL trees guarantee O(log n) for search, insert, and delete.
Named after its inventors Adelson-Velsky and Landis (1962).
Balance Factor
The balance factor of a node is defined as the height of its left subtree minus the height of its right subtree.
AVL property: |BF(node)| ≤ 1 for every node.
If this property is violated after an insertion or deletion, rotations are performed to restore balance.
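As a minimal sketch, independent of the fuller implementation later in this section: the balance factor can be computed recursively from subtree heights, with an empty subtree counted as height −1. The `Node` class here is hypothetical and exists only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:  # hypothetical minimal node, for illustration only
    key: int
    left: "Node | None" = None
    right: "Node | None" = None


def height(node: Node | None) -> int:
    # Height of an empty subtree is -1, of a leaf is 0.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node: Node) -> int:
    return height(node.left) - height(node.right)


# A left-leaning chain violates the AVL property at its root:
chain = Node(3, left=Node(2, left=Node(1)))
assert balance_factor(chain) == 2  # |BF| > 1 -> needs rebalancing
```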
Insertion
Standard BST insertion
Update heights of ancestors bottom-up
Find the first unbalanced node (if any) on the path from the inserted node to root
Rebalance using rotation(s)/cut-and-link algorithm
After fixing the lowest unbalanced node, the tree is guaranteed to be balanced
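For a concrete feel of the procedure, here is a hedged usage sketch of the `AVLTree` class implemented at the end of this section (it assumes the `BinarySearchTree` base class and parent-tracking `TreeNode` that class builds on): inserting keys in ascending order repeatedly triggers rotations, yet the tree stays shallow.

```python
# Sketch: exercises AVLTree.insert from the implementation below; assumes the
# BinarySearchTree / TreeNode types it relies on are defined elsewhere.
tree = AVLTree()
for key in range(7):             # 0, 1, ..., 6 inserted in sorted order
    tree.insert(key, value=None)

assert tree.is_valid()           # BST order and AVL balance both hold
assert tree.height() == 2        # 7 nodes end up in a perfectly balanced tree
```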
Deletion
Perform standard BST deletion
Update heights of ancestors bottom-up
Find the first unbalanced node (if any) on the path from the deleted node to root
Rebalance using rotation(s)/cut-and-link algorithm
May need to rebalance ancestors of the unbalanced node as well; continue up to the root
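Again as a hedged sketch against the `AVLTree` implementation below (with the same assumptions about its `BinarySearchTree` base class): deleting leaves from one side eventually makes the root right-heavy and forces a rotation on the way back up.

```python
# Sketch: continues the insertion example; removing the smallest keys leaves
# the root right-heavy, and the upward rebalancing pass restores the AVL property.
tree = AVLTree()
for key in range(7):
    tree.insert(key, value=None)

for key in (0, 2, 1):            # strip out the left side, leaf by leaf
    tree.remove(key)

assert tree.is_valid()           # rebalancing kept the tree a valid AVL tree
assert tree.height() == 2        # four remaining nodes: 3, 4, 5, 6
```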
Rebalancing Intuition
An imbalance always involves three nodes (x, y, z) and four subtrees (T0, T1, T2, T3). Example (“left-left” case):
```
        z  (BF = +2)
       / \
      y   T3
     / \
    x   T2
   / \
  T0  T1
```
The rebalancing algorithm rearranges these seven parts into the canonical balanced form:
```
      b
    /   \
   a     c
  / \   / \
 T0 T1 T2 T3
```
where (a, b, c) is the in-order sequence of (x, y, z), and the subtrees are placed in sorted order.
Rebalancing
Given three nodes: z (the grandparent, i.e. the unbalanced node with BF = ±2), y (the taller child of z), and x (the taller child of y; on a tie, the child on the same side as y).
Let (a, b, c) be the in-order sequence of the nodes x, y, z.
Let (T0, T1, T2, T3) be the in-order sequence of their four subtrees.
Rotation algorithm:
Replace subtree rooted at z with subtree rooted at b
Make a the left child of b with subtrees T0 (left) and T1 (right)
Make c the right child of b with subtrees T2 (left) and T3 (right)
The same algorithm works for all four possible imbalances; only the mapping to (a, b, c) differs:
Unbalanced node z with BF = +2 (left-heavy):
- Child y has BF ≥ 0 (straight line) → LL case: Single rotation
- Child y has BF < 0 (zigzag) → LR case: Double rotation
Unbalanced node z with BF = -2 (right-heavy):
- Child y has BF ≤ 0 (straight line) → RR case: Single rotation
- Child y has BF > 0 (zigzag) → RL case: Double rotation
[Figures: all possible imbalances; case distinctions visualized]
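The case analysis can be made concrete with a tiny helper. This is only an illustration of the mapping above, not part of the implementation that follows:

```python
def classify(z_bf: int, y_bf: int) -> str:
    # z_bf is ±2 at the unbalanced node z; y_bf is the balance factor of its
    # taller child y. Ties (y_bf == 0, possible after a deletion) count as the
    # straight-line single-rotation cases.
    if z_bf > 0:
        return "LL" if y_bf >= 0 else "LR"
    return "RR" if y_bf <= 0 else "RL"


assert classify(+2, +1) == "LL" and classify(+2, -1) == "LR"
assert classify(-2, -1) == "RR" and classify(-2, +1) == "RL"
```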
```python
# Builds on TreeNode (key, value, height, parent, left, right) and
# BinarySearchTree (insert, find, _remove_node, _replace_node, is_valid)
# defined elsewhere; assigning to .left / .right is assumed to keep the
# child's .parent pointer in sync.


def _is_balanced(node: TreeNode | None, recursive: bool = True) -> bool:
    if node is None:
        return True
    if abs(_balance_factor(node)) > 1:
        return False
    if not recursive:
        return True
    return _is_balanced(node.left) and _is_balanced(node.right)


def _balance_factor(node: TreeNode) -> int:
    left_height, right_height = _children_heights(node)
    return left_height - right_height


def _children_heights(node: TreeNode) -> tuple[int, int]:
    left_height = node.left.height if node.left else -1
    right_height = node.right.height if node.right else -1
    return left_height, right_height


def _recompute_ancestry_heights(node: TreeNode) -> None:
    _recompute_height(node)
    if node.parent is None:
        return
    _recompute_ancestry_heights(node.parent)


def _recompute_height(node: TreeNode) -> None:
    left_height, right_height = _children_heights(node)
    node.height = 1 + max(left_height, right_height)


def _find_first_unbalanced(node: TreeNode | None) -> TreeNode | None:
    if node is None:
        return None
    if not _is_balanced(node, recursive=False):
        return node
    return _find_first_unbalanced(node.parent)


class AVLTree(BinarySearchTree):
    def height(self) -> int:
        return self.root.height if self.root else -1

    def insert(self, key: int, value) -> TreeNode:
        new_node = super().insert(key, value)
        _recompute_ancestry_heights(new_node)
        unbalanced_node = _find_first_unbalanced(new_node)
        if unbalanced_node is None:
            return new_node
        b = self._rebalance(unbalanced_node)
        if b.parent:
            _recompute_ancestry_heights(b.parent)
        return new_node

    def remove(self, key: int):
        dying_node = super().find(key)
        maybe_unbalanced = super()._remove_node(dying_node)
        current = maybe_unbalanced
        while current is not None:
            _recompute_height(current)
            if not _is_balanced(current, recursive=False):
                current = self._rebalance(current)
            current = current.parent

    def is_valid(self) -> bool:
        return super().is_valid() and _is_balanced(self.root)

    def _rebalance(self, node: TreeNode) -> TreeNode:
        z = node
        y = z.left if _balance_factor(z) > 0 else z.right  # taller child of z
        assert y is not None
        ybf = _balance_factor(y)
        if _balance_factor(z) > 0:  # left heavy
            x = y.left if ybf >= 0 else y.right  # LL / LR
        else:  # right heavy
            x = y.right if ybf <= 0 else y.left  # RR / RL
        assert x is not None
        # In-order sequence of the three nodes and of their four hanging subtrees.
        a, b, c = sorted((x, y, z), key=lambda n: n.key)
        t0 = t1 = t2 = t3 = None
        for n in (x, y, z):
            for t in (n.left, n.right):
                if t is None or t in (a, b, c):
                    continue
                k = t.key
                if k < a.key:
                    t0 = t
                elif k < b.key:
                    t1 = t
                elif k < c.key:
                    t2 = t
                else:
                    t3 = t
        # Cut-and-link into the canonical balanced shape.
        self._replace_node(z, b)  # connects b to z's parent
        b.left, b.right = a, c
        a.left, a.right = t0, t1
        c.left, c.right = t2, t3
        for n in (a, c, b):
            _recompute_height(n)
        return b
```
Logarithmic height bound
An AVL tree of height h contains a number of nodes that is at least exponential in h, which implies logarithmic height for n nodes.
The recurrence
Let n(h) = minimum nodes in an AVL tree of height h.
A minimal tree of height h has:
Root node
Minimal subtree of height h−1 (taller child)
Minimal subtree of height h−2 (shorter child, forced by balance constraint)
n(h) = 1 + n(h−1) + n(h−2)
with n(1) = 1, n(2) = 2.
This is the Fibonacci recurrence shifted by 1, giving n(3) = 4, n(4) = 7, n(5) = 12, n(6) = 20, … compared to the Fibonacci numbers F_n: 1, 1, 2, 3, 5, 8, 13, 21, …. Specifically, n(h) = F_{h+2} − 1.
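A quick numeric check of this identity, under the convention used here that a single node has height 1 (this snippet is self-contained and separate from the AVL implementation above):

```python
def min_nodes(h: int) -> int:
    # n(h): fewest nodes in an AVL tree of height h, per the recurrence above.
    if h <= 0:
        return 0
    if h == 1:
        return 1
    return 1 + min_nodes(h - 1) + min_nodes(h - 2)


def fib(k: int) -> int:  # F_1 = F_2 = 1
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


for h in range(1, 15):
    assert min_nodes(h) == fib(h + 2) - 1  # n(h) = F_{h+2} - 1
```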
Exponential lower bound
From the recurrence, n(h) = 1 + n(h−1) + n(h−2) > 2⋅n(h−2) for h ≥ 3.
Applying this repeatedly:
n(h) > 2⋅n(h−2) > 4⋅n(h−4) > 8⋅n(h−6) > … > 2^i ⋅ n(h−2i)
Choosing i = ⌊h/2⌋ − 1 gives n(h) > 2^(h/2 − 1).
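The bound can be checked numerically against the recurrence (a self-contained sketch, using an iterative table of n(h)):

```python
# Numeric check of n(h) > 2**(h/2 - 1), with n(h) = 1 + n(h-1) + n(h-2).
n = {0: 0, 1: 1}
for h in range(2, 40):
    n[h] = 1 + n[h - 1] + n[h - 2]

for h in range(3, 40):
    assert n[h] > 2 ** (h / 2 - 1)
```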
Height bound
For a tree with n nodes and height h: n ≥ n(h) > 2^(h/2 − 1)
Taking logarithms: log₂ n > h/2 − 1
Therefore: h < 2 log₂ n + 2
The constant factor can be tightened: a finer analysis via the Fibonacci relation above gives an AVL height of at most ≈ 1.44 log₂ n, compared to log₂ n for a perfectly balanced tree. In practice, a tree with 2^30 ≈ 1 billion nodes has height at most 45.
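Both bounds can be sanity-checked numerically: the maximum AVL height for n nodes is the largest h with n(h) ≤ n. The sketch below is self-contained; the 1.4405 constant is the known Fibonacci-based factor and is used here only as an approximate upper bound.

```python
import math

# n(h): fewest nodes at height h (single node = height 1), from the recurrence above.
min_nodes = {0: 0, 1: 1}
for h in range(2, 60):
    min_nodes[h] = 1 + min_nodes[h - 1] + min_nodes[h - 2]


def max_height(n: int) -> int:
    # Largest h such that some AVL tree of height h has no more than n nodes.
    return max(h for h, count in min_nodes.items() if count <= n)


for n in (1, 10, 100, 10_000, 1_000_000, 2**30):
    h = max_height(n)
    assert h < 2 * math.log2(n) + 2        # bound derived above
    assert h <= 1.4405 * math.log2(n + 2)  # tighter Fibonacci-based bound

assert max_height(2**30) <= 45             # ~1 billion nodes stay within height 45
```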