Linear Heap algorithm, (buildheap) (Carol Zander, CSS 343)
---------------------
Consider the complexity of building a heap using the algorithm which puts all
values into the array, then uses heapify to turn each subtree starting at the
bottom level into a heap and works its way up level by level, until the entire
tree is a heap. This algorithm is linear. Since this algorithm does a swap at
each node for exactly the height of the node, proving the following theorem,
proves buildheap is O(N) where N = 2^(h+1) - 1 (2 to the power of (h+1),
then -1). N is the number of nodes in a complete binary tree.
(Assume h=0 for a tree with just a root.)
Theorem: For a complete binary tree of height h containing 2^(h+1) - 1 nodes,
the sum of the heights is 2^(h+1) -1 -(h+1).
(Note that since N = 2^(h+1) - 1 , 2^(h+1) -1 -(h+1) is O(N) .)
Reasoning: Each time we heapify (move a value up the tree), that count is
the height of the tree. That's why we sum them. When we get the sum
in the theorem, 2^(h+1) - 1 - (h+1), it is O(N) as the -1 -(h+1) is
insignificant relative to 2^(h+1) (formally by definition of big-oh).
Proof: Sum the height of the tree. Here we use the definition that says an
empty tree has height -1; a tree with just a root is height zero.
For example, when h is 3, the sum of the heights is:
1(3) + 2(2) + 4(1) + 8(0)
Now generalize and write as a formula:
1(3) + 2(2) + 4(1) + 8(0)
= 2^0(3-0) + 2^1(3-1) + 2^2(3-2) + 2^3(3-3)
= 2^0(h-0) + 2^1(h-1) + 2^2(h-2) + 2^3(h-3)
// \ / \ / \
// \ / \ // \ // \ / \ / \
// \ /\ /\ /\ // \ /\ // \ /\ //\\//\\ //\\//\\
1 branch, 2 branches, 4 branches,
of length 3 of length 2 of length 1
The double lines show the height count.
_h_
So, the sum S = \
/__ 2^i (h-i)
i=0
Multiply S times 2, then subtract S, and you're left with S:
2S= 2^1(h) + 2^2(h-1) + 2^3(h-2) +...+ 2^(h-2)(3) + 2^(h-1)(2) + 2^h
-S= -2^0(h) - 2^1(h-1) - 2^2(h-2) - 2^3(h-3) -...- 2^(h-2)(2) - 2^(h-1)(1) - 0
--------------------------------------------------------------------------------
S= -h + 2^1 + 2^2 * 2 + 2^3 * 3 +...+ 2^(h-2)(3-2) + 2^(2-1) + 2^h
- 2^2 * 1 - 2^3 * 2
\_______/ \_______/
= 2^2 = 2^3 ...
Subtracting term by term, the terms with 'h' drop out and a power of 2 is left.
_h_
= -h + \
/__ 2^j
j=1
_h_
= -h - 1 + 1 + \ we're just adding and subtracting one, trick
/__ 2^j (which we do so the summation can start at zero)
j=1
_h_
= -h - 1 + \ since 1=2^0, we can start the sum at zero
/__ 2^j
j=0
= -h - 1 + 2^(h+1) - 1
= 2^(h+1) - 1 - (h+1)
So, we have proven that buildheap is O(2^(h+1)-1-(h+1)) = O(N) .