Skip to content

In hash table, find the minimal value or successor of a value takes \(O(n)\) time. Can we do better?

Binary Search Tree

A data structure: For each node, all values in the left subtree are less than the value of this node; all values in the right subtree are greater than the value of this node.

Each node: value (or key-value pair), pointer to the left subtree, pointer to the right subtree.

Operations

Find(x): Start from the root, do

  • If x<cur, Find(x.left)
  • If x>cur, Find(x.right)
  • If x==cur, return cur. (If reach the leaf but not equal, return Not Found)

Find the minimal value: Always go left. The leftmost leaf.

Find the maximal value: Always go right, The rightmost leaf.

Find the successor of value \(x\):

  • If \(x\) has the right subtree, find the minimal value in the right subtree.
  • If \(x\) doesn't have the right subtree, go up and find the first ancestor on the right.

Insert a value \(x\): Similar to the find operation.

Problem of Binary Search Tree

All these operation calls for \(O(\text{height of the tree})\). In the worst cases the height is \(n\), i.e., the binary search tree becomes a linked list.

Can we insert/delete the element s.t. the height of the tree is always \(O(\log n)\)? Yes

Balanced Binary Search Tree

For example Red-black Tree.

  1. It is a binary search tree, nodes are either black or red.
  2. The root is black.
  3. Leaves are black and nil.
  4. Red nodes have \(2\) black childs.
  5. For every node \(v\), all path from \(v\) to leaves have the same number of black nodes.

Definition

The height of node \(v\): The maximum distance from \(v\) to a leaf.

Black height of node \(v\): The maximum number of back node on the path from \(v\) to the leaf. (Not count itself)

Proposition: \(\text{height}(v)\leq2\cdot BH(v)\).

Lower bound for the number of descendants (not count leafs) of a node as a function of its black height:

  • \(1+\) size of left subtree \(v\) + size of right subtree of \(v\geq 2^{BH(v)}-1\).

Proof by induction, we get the height of the tree is \(O(\log n)\).

Operations

Find in red-black tree: \(O(\log n)\)

Insert: \(O(\log n)\) We need to keep all properties satisfied

  1. Do a normal binary search tree insert, set the new node red.
  2. Rotate and recolor. This step takes \(O(1)\) time.

Delete: Similar to insert, but consider more cases. \(O(\log n)\).

Summary

Red-black tree implements dictionary. Find, insert, delete, find minimal, find maximal, find successor all takes \(O(\log n)\) time

Red-black Tree VS. Hash Table

  1. Do the benchmark.
  2. If we need to find minimal/maximal/successor, use red-black tree.
  3. If the whole dictionary is a key: use red-black tree.
  4. Real-time request? red-black tree.

Otherwise, use hash table.

Other Balanced Tree

AVL tree, B-tree...