CSS 343: Notes from Lecture 5 (DRAFT)
Administrivia
-
next hackathon: Sunday PM
-
2-3-tree code posted to course web site with minor changes
-
eliminated skeleton/impl concept (was artifact of prior
live-coding session)
-
added comments
-
renamed template parameter from T to Thing
-
added monolithic version of node insert function
-
makes your brain hurt?
-
example of what not to do
-
insert calling insert_{leaf,left,middle,right} is a low-level
example of an abstraction layer
-
makes it easier to conceptualize complex function
-
additional B-tree variants: B*, B+
-
read up on them if you're interested
-
delete is similar idea
-
not required for this course
-
required for a practical system
-
demo 2-3 tree code uses templates
-
artifact of previous incarnation of course
-
hoping to get to discussion of templates in 2nd half
(post-midterm) of course
Tree Rotations
-
single rotation (left or right):
http://en.wikipedia.org/wiki/Tree_rotation
-
left and right rotations are inverse operations
-
in-order traversal is invariant under tree rotation
-
right rotation decreases the depth of α by one, leaves the depth of β
unchanged, and increases the depth of γ by one (left rotation does the reverse)
-
double rotation:
-
right-right: decreases depth of α and β by one, leaves depth of
γ unchanged, and increases depth of δ by one
-
left-right: leaves depth of α unchanged, decreases
depth of β and γ by one, and increases depth of
δ by one
-
left-left: mirror of right-right
-
right-left: mirror of left-right
-
a single rotation can be performed in O(1) time
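The rotation properties above can be sketched in a few lines of C++. This is a minimal illustration, not the course's posted code: the `Node` type and the names `rotate_right`, `rotate_left`, and `in_order` are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal node type (not the course's 2-3 tree code).
struct Node {
    int key;
    Node* left;
    Node* right;
    explicit Node(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Right rotation: the left child p of q becomes the subtree root.
//
//       q            p
//      / \          / \
//     p   c   =>   a   q
//    / \              / \
//   a   b            b   c
//
// Only two pointers change, so the cost is O(1).
Node* rotate_right(Node* q) {
    assert(q != nullptr && q->left != nullptr);
    Node* p = q->left;
    q->left = p->right;  // subtree b moves from p's right to q's left
    p->right = q;
    return p;            // caller re-attaches p where q used to hang
}

// Left rotation: the exact inverse of rotate_right.
Node* rotate_left(Node* p) {
    assert(p != nullptr && p->right != nullptr);
    Node* q = p->right;
    p->right = q->left;  // subtree b moves back from q's left to p's right
    q->left = p;
    return q;
}

// Appends the keys of the subtree rooted at n to out, in sorted order.
void in_order(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    in_order(n->left, out);
    out.push_back(n->key);
    in_order(n->right, out);
}
```

Calling `rotate_left` on the result of `rotate_right` restores the original shape, and `in_order` produces the same sequence before and after either rotation.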
AVL Tree
-
the first self-balancing binary search tree
-
discovered/invented in 1962 by G. M. Adelson-Velski &
E. M. Landis
-
insert/lookup/delete: O(log N) for best, worst, & average cases
-
each node keeps track of the difference between the heights of its left and
right subtrees (its balance factor); during insert/delete, whenever that
difference falls outside the range -1 to +1, a rebalancing operation is performed
-
an insert needs at most one single or double rotation; a delete may need up to O(log N) rotations
-
red-black trees have same asymptotic complexity but are less
rigidly balanced
-
AVL insert/delete is more expensive than RB
-
AVL lookup is cheaper than RB
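The rebalancing rule can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: it stores each subtree's height in the node and derives the balance factor from it (rather than storing the difference directly), it ignores duplicate keys, and all names (`AvlNode`, `height_of`, `balance_of`, `rebalance`, `insert`, `destroy`) are made up for the example.

```cpp
#include <algorithm>
#include <cassert>

struct AvlNode {
    int key;
    int height;  // height of the subtree rooted here; an empty subtree counts as 0
    AvlNode* left;
    AvlNode* right;
    explicit AvlNode(int k) : key(k), height(1), left(nullptr), right(nullptr) {}
};

int height_of(const AvlNode* n) { return n ? n->height : 0; }

// Balance factor: left height minus right height.
int balance_of(const AvlNode* n) {
    return n ? height_of(n->left) - height_of(n->right) : 0;
}

void update(AvlNode* n) {
    n->height = 1 + std::max(height_of(n->left), height_of(n->right));
}

AvlNode* rotate_right(AvlNode* q) {
    assert(q != nullptr && q->left != nullptr);
    AvlNode* p = q->left;
    q->left = p->right;
    p->right = q;
    update(q);  // q is now below p, so recompute q first
    update(p);
    return p;
}

AvlNode* rotate_left(AvlNode* p) {
    assert(p != nullptr && p->right != nullptr);
    AvlNode* q = p->right;
    p->right = q->left;
    q->left = p;
    update(p);
    update(q);
    return q;
}

// Restores the AVL condition at n, assuming both subtrees already satisfy it.
AvlNode* rebalance(AvlNode* n) {
    update(n);
    int b = balance_of(n);
    if (b > 1) {  // left-heavy
        if (balance_of(n->left) < 0) n->left = rotate_left(n->left);  // left-right case
        return rotate_right(n);
    }
    if (b < -1) {  // right-heavy
        if (balance_of(n->right) > 0) n->right = rotate_right(n->right);  // right-left case
        return rotate_left(n);
    }
    return n;
}

// Ordinary BST insert, rebalancing each node on the way back up.
AvlNode* insert(AvlNode* n, int key) {
    if (n == nullptr) return new AvlNode(key);
    if (key < n->key) n->left = insert(n->left, key);
    else if (key > n->key) n->right = insert(n->right, key);
    else return n;  // duplicate keys are ignored (an assumption of this sketch)
    return rebalance(n);
}

void destroy(AvlNode* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```

Inserting the keys 1 through 7 in ascending order, which would degrade a plain BST into a linked list, yields a tree of height 3 with 4 at the root.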
BST Deletion
3 cases:
-
leaf node: delete
-
single-child node: replace with child
-
two-child node: replace node with immediate predecessor or
successor, then delete original predecessor/successor
-
predecessor is the rightmost node of the left branch and so has no right
child; successor is the leftmost node of the right branch and so has no left child
-
therefore, deleting predecessor/successor reduces to case 1
or 2
Additional operations may be performed to maintain balance
(e.g. AVL & red-black).
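The three cases can be sketched for a plain (unbalanced) BST as below. The sketch makes some choices the notes leave open: it uses the successor rather than the predecessor, copies the successor's key into the node instead of relinking nodes, and uses hypothetical names (`Node`, `insert_key`, `remove_key`, `min_node`, `in_order`).

```cpp
#include <cassert>
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
    explicit Node(int k) : key(k), left(nullptr), right(nullptr) {}
};

Node* insert_key(Node* n, int key) {
    if (n == nullptr) return new Node(key);
    if (key < n->key) n->left = insert_key(n->left, key);
    else if (key > n->key) n->right = insert_key(n->right, key);
    return n;  // duplicates are ignored
}

// Leftmost node of a non-empty subtree; by construction it has no left child.
Node* min_node(Node* n) {
    assert(n != nullptr);
    while (n->left != nullptr) n = n->left;
    return n;
}

// Removes key from the subtree rooted at n; returns the new subtree root.
Node* remove_key(Node* n, int key) {
    if (n == nullptr) return nullptr;  // key not present
    if (key < n->key) { n->left = remove_key(n->left, key); return n; }
    if (key > n->key) { n->right = remove_key(n->right, key); return n; }

    // Cases 1 and 2: leaf or single child. Replace the node with its
    // only child (nullptr for a leaf).
    if (n->left == nullptr || n->right == nullptr) {
        Node* child = n->left ? n->left : n->right;
        delete n;
        return child;
    }

    // Case 3: two children. Copy the successor's key into this node, then
    // delete the successor from the right subtree. The successor has no
    // left child, so that recursive delete ends in case 1 or 2.
    Node* succ = min_node(n->right);
    n->key = succ->key;
    n->right = remove_key(n->right, succ->key);
    return n;
}

void in_order(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    in_order(n->left, out);
    out.push_back(n->key);
    in_order(n->right, out);
}
```

A balanced tree (AVL, red-black) would follow the same steps and then rebalance along the path from the removed node back to the root.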
Red-Black Tree
Example: red-black tree construction (image; a frame-by-frame zip is also
available for download). Also see the Wikipedia article.
A red-black tree is a Binary Search Tree in which each node is
labeled by a color, red or black (it could be called up/down,
true/false, 0/1, but red/black was chosen). The tree has the
following rules (invariants) that must be maintained:
-
the root is black
-
synthetic black leaves are added to fill in every null child of
the original tree.
-
a single shared object may be used to save memory
-
all simple paths from a node to its leaf descendants have the
same number of black nodes (black height)
-
the children of any red node must be black
Implication:
-
The longest path from the root to a leaf is no more than twice the length
of the shortest such path, giving O(log N) insert/lookup/delete, provided we
can rebalance in O(log N) time after insert/delete.
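The height bound behind this implication can be written out as a short derivation. It is a sketch under the usual textbook conventions, which are assumptions here: h is the height of the tree, b is the black height of the root (black nodes below the root on any path, counting the synthetic leaf), and N is the number of real (non-synthetic) nodes.

```latex
% No red node has a red child, so at least half of the nodes
% below the root on any root-to-leaf path are black:
h \le 2b
% A subtree whose root has black height b contains at least
% 2^b - 1 real nodes (by induction on the height of the subtree):
N \ge 2^{b} - 1 \ge 2^{h/2} - 1
% Solving for h:
h \le 2 \log_2 (N + 1) = O(\log N)
```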
A parent pointer is not required, but makes coding so very much
easier.
Trying to hand-label some arbitrary binary tree is hard and
confusing
-
not every binary tree can be labeled as a red-black tree
-
rotations may be applied to a binary tree to
transform it into one that can be labeled as a red-black tree
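One way to check a hand-labeled tree is to test the rules mechanically. The sketch below is illustrative only: null pointers stand in for the synthetic black leaves, the BST ordering of the keys is not checked, and the names `RbNode`, `black_height`, and `is_red_black` are assumptions made for the example.

```cpp
#include <cassert>

enum Color { RED, BLACK };

struct RbNode {
    int key;
    Color color;
    RbNode* left;
    RbNode* right;
    RbNode(int k, Color c, RbNode* l = nullptr, RbNode* r = nullptr)
        : key(k), color(c), left(l), right(r) {}
};

// Returns the number of black nodes on every path from n down to a leaf
// (counting n itself and the synthetic leaf), or -1 if the subtree breaks
// the "red node has black children" rule or the "equal black height" rule.
int black_height(const RbNode* n) {
    if (n == nullptr) return 1;  // null plays the role of a synthetic black leaf
    if (n->color == RED) {
        bool red_left = n->left != nullptr && n->left->color == RED;
        bool red_right = n->right != nullptr && n->right->color == RED;
        if (red_left || red_right) return -1;  // red node with a red child
    }
    int lh = black_height(n->left);
    int rh = black_height(n->right);
    if (lh < 0 || rh < 0 || lh != rh) return -1;  // violation below, or unequal counts
    return lh + (n->color == BLACK ? 1 : 0);
}

// Checks the coloring rules only; it does not verify key ordering.
bool is_red_black(const RbNode* root) {
    if (root == nullptr) return true;          // empty tree is trivially valid
    if (root->color != BLACK) return false;    // the root must be black
    return black_height(root) >= 0;
}
```

A three-node chain (each node having only a right child) fails this check even when every node is black, which is a small instance of a binary tree that cannot be labeled without first rotating it.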
Insertion
Considering only insertion; deletion is similar.
Insert a new entry in the tree in the location where it would
normally be inserted (ignoring the synthetic black leaves). Label
the new node red. It will have two black (synthetic) children.
Rebalance the tree to make sure the red-black conditions
(invariants) are maintained.
-
insertion may require O(log N) color changes.
-
insertion requires only O(1) tree rotations (at most 2 rotations for
insert, 3 for delete)
In keeping with the genealogy theme, the parent node of a parent
node is a grandparent and the sibling of a parent node is an
uncle.
Case-by-case analysis:
-
node is root: relabel node black and return
-
parent is black: return
-
parent and uncle are both red: relabel parent and uncle black
and relabel grandparent red. Repeat with the grandparent as the node
-
if parent was red, grandparent must have been black
-
this is the only recursive step, which may propagate color
changes up to the root, giving O(log N) color changes.
-
parent is red but uncle is black, and node is left child of parent but
parent is right child of grandparent, or node is right child but parent is
left child (left-right or right-left cases):
rotate so node becomes the parent of the original parent and the original
parent becomes a child of node, then proceed to case 5 (the last case below)
with node set to the original parent.
-
the grandparent must be black since it has a red child
-
the sibling of the node (which may be a synthetic leaf) must
be black since it's the child of a red node.
-
node is left child of parent and parent is left child of
grandparent or node is right child and parent is right child
(left-left or right-right cases):
relabel parent black and grandparent red, then rotate parent to
grandparent position and return.
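The five cases can be put together in a short C++ sketch. It is one possible arrangement, not the course's code: it uses parent pointers, treats null pointers as the synthetic black leaves, ignores duplicate keys, never frees nodes, and uses hypothetical names (`Node`, `Tree`, `rotate_up`, `fix_insert`, `insert`).

```cpp
#include <cassert>

enum Color { RED, BLACK };

struct Node {
    int key;
    Color color;
    Node* left;
    Node* right;
    Node* parent;
    explicit Node(int k)  // new nodes start out red
        : key(k), color(RED), left(nullptr), right(nullptr), parent(nullptr) {}
};

struct Tree {
    Node* root;
    Tree() : root(nullptr) {}
};

bool is_red(const Node* n) { return n != nullptr && n->color == RED; }

// Rotates c into its parent's position: a right rotation if c is a left
// child, a left rotation if c is a right child. Fixes all parent links.
void rotate_up(Tree& t, Node* c) {
    assert(c != nullptr && c->parent != nullptr);
    Node* p = c->parent;
    Node* g = p->parent;
    if (c == p->left) {
        p->left = c->right;
        if (c->right != nullptr) c->right->parent = p;
        c->right = p;
    } else {
        p->right = c->left;
        if (c->left != nullptr) c->left->parent = p;
        c->left = p;
    }
    p->parent = c;
    c->parent = g;
    if (g == nullptr) t.root = c;
    else if (g->left == p) g->left = c;
    else g->right = c;
}

void fix_insert(Tree& t, Node* n) {
    for (;;) {
        Node* p = n->parent;
        if (p == nullptr) { n->color = BLACK; return; }  // case 1: node is root
        if (p->color == BLACK) return;                   // case 2: parent is black
        Node* g = p->parent;  // exists: a red parent cannot be the root
        Node* u = (g->left == p) ? g->right : g->left;
        if (is_red(u)) {                                 // case 3: red uncle
            p->color = BLACK;
            u->color = BLACK;
            g->color = RED;
            n = g;  // repeat with the grandparent as the node
            continue;
        }
        // case 4: left-right or right-left; rotate node above its parent
        if ((n == p->right) != (p == g->right)) {
            rotate_up(t, n);
            Node* old_parent = p;
            p = n;           // node now plays the parent role
            n = old_parent;  // original parent becomes the node
        }
        // case 5: left-left or right-right
        p->color = BLACK;
        g->color = RED;
        rotate_up(t, p);  // parent moves into the grandparent's position
        return;
    }
}

void insert(Tree& t, int key) {
    Node* parent = nullptr;
    Node* cur = t.root;
    while (cur != nullptr) {  // ordinary BST descent
        if (key == cur->key) return;  // duplicates ignored (an assumption)
        parent = cur;
        cur = (key < cur->key) ? cur->left : cur->right;
    }
    Node* n = new Node(key);
    n->parent = parent;
    if (parent == nullptr) t.root = n;
    else if (key < parent->key) parent->left = n;
    else parent->right = n;
    fix_insert(t, n);
}
```

Only case 3 loops, so the color changes can climb to the root while cases 4 and 5 together perform at most two rotations before returning.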