Linear Heap algorithm, (buildheap) (Carol Zander, CSS 343) --------------------- Consider the complexity of building a heap using the algorithm which puts all values into the array, then uses heapify to turn each subtree starting at the bottom level into a heap and works its way up level by level, until the entire tree is a heap. This algorithm is linear. Since this algorithm does a swap at each node for exactly the height of the node, proving the following theorem, proves buildheap is O(N) where N = 2^(h+1) - 1 (2 to the power of (h+1), then -1). N is the number of nodes in a complete binary tree. (Assume h=0 for a tree with just a root, i.e., count edges to compute height.) Theorem: For a complete binary tree of height h containing 2^(h+1) - 1 nodes, the sum of the heights is 2^(h+1) -1 -(h+1). (Note that since N = 2^(h+1) - 1 , 2^(h+1) -1 -(h+1) is O(N) .) Reasoning: Each time we heapify (move a value up the tree), that count is the height of the tree. That's why we sum them. When we get the sum in the theorem, 2^(h+1) - 1 - (h+1), it is O(N) as the -1 -(h+1) is insignificant relative to 2^(h+1) (formally by definition of big-oh). Proof: Sum the height of the tree. Here we use the definition that says an empty tree has height -1; a tree with just a root is height zero. For example, when h is 3, the sum of the heights is: 1(3) + 2(2) + 4(1) + 8(0) Now generalize and write as a formula: 1(3) + 2(2) + 4(1) + 8(0) = 2^0(3-0) + 2^1(3-1) + 2^2(3-2) + 2^3(3-3) = 2^0(h-0) + 2^1(h-1) + 2^2(h-2) + 2^3(h-3) // \ / \ / \ // \ / \ // \ // \ / \ / \ // \ /\ /\ /\ // \ /\ // \ /\ //\\//\\ //\\//\\ 1 branch, 2 branches, 4 branches, of length 3 of length 2 of length 1 The double lines show the height count. _h_ So, the sum S = \ /__ 2^i (h-i) i=0 Multiply S times 2, then subtract S, and you're left with S: 2S= 2^1(h) + 2^2(h-1) + 2^3(h-2) +...+ 2^(h-2)(3) + 2^(h-1)(2) + 2^h -S= -2^0(h) - 2^1(h-1) - 2^2(h-2) - 2^3(h-3) -...- 2^(h-2)(2) - 2^(h-1)(1) - 0 -------------------------------------------------------------------------------- S= -h + 2^1 + 2^2 * 2 + 2^3 * 3 +...+ 2^(h-2)(3-2) + 2^(2-1) + 2^h - 2^2 * 1 - 2^3 * 2 \_______/ \_______/ = 2^2 = 2^3 ... Subtracting term by term, the terms with 'h' drop out and a power of 2 is left. _h_ = -h + \ /__ 2^j j=1 _h_ = -h - 1 + 1 + \ we're just adding and subtracting one, trick /__ 2^j (which we do so the summation can start at zero) j=1 _h_ = -h - 1 + \ since 1=2^0, we can start the sum at zero /__ 2^j j=0 = -h - 1 + 2^(h+1) - 1 = 2^(h+1) - 1 - (h+1) So, we have proven that buildheap is O(2^(h+1)-1-(h+1)) = O(N) .