# Binomial Heaps

For simplicity, we'll continue to work with heaps of `Int`s. Our implementation (`BinomialHeaps.elm`) exports the same type signatures as the previous implementations of min-heaps.

## Binomial Trees: Representation and Invariants

```elm
type alias Rank = Int
type Tree = Node Rank Int (List Tree)

rank (Node r _ _) = r
root (Node _ x _) = x
```

Binomial trees of rank `r` are defined inductively as follows.

• `Node 0 n []` is a binomial tree of rank `0`.
• `Node r n ts` is a binomial tree of rank `r > 0` if `ts` is a list of `r` binomial trees with ranks `r-1` through `0`, respectively.

A binomial tree of rank `r` has 2^r nodes.

A binomial tree of rank `r + 1` is formed by linking together two trees of rank `r`, making one the leftmost child of the other.

The `link` function below links two binomial trees, choosing to keep the smaller of the two elements at the root. Therefore, if `t1` and `t2` are both heap-ordered, then so is the result.

```elm
link : Tree -> Tree -> Tree
link t1 t2 =
  let
    (Node r x1 c1) = t1
    (Node _ x2 c2) = t2
  in
    if x1 <= x2
      then Node (1+r) x1 (t2::c1)
      else Node (1+r) x2 (t1::c2)
```
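For instance, linking two rank-0 (singleton) trees keeps the smaller element at the root and makes the other tree its only child. The following trace just restates the definition above:

```elm
-- Linking two rank-0 trees produces a rank-1 tree:
--
--   link (Node 0 1 []) (Node 0 2 [])
--     ==> Node 1 1 [Node 0 2 []]
```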

## Binomial Heaps: Representation and Invariants

```elm
type alias InternalHeap = List Tree

type Heap = WrapHeap InternalHeap
```

A binomial heap is a list of heap-ordered binomial trees, kept in strictly-increasing order of rank. A binomial heap containing n elements is represented using at most O(log n) binomial trees, analogous to the binary representation of n.
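To make the analogy concrete, here is a hypothetical helper (not part of the exported interface) that lists the tree ranks of a heap; the ranks correspond to the positions of the 1-bits in the binary representation of n:

```elm
-- Hypothetical debugging helper: the ranks of the trees in a heap.
-- A heap of n = 5 elements (binary 101) would yield ranks [0, 2],
-- since 5 = 2^0 + 2^2.
ranks : Heap -> List Rank
ranks (WrapHeap ts) = List.map rank ts
```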

## Heap Interface

```elm
empty : Heap
empty = WrapHeap []

isEmpty : Heap -> Bool
isEmpty h = h == empty
```

The `findMin` function searches for the smallest root among all of the binomial trees, taking O(log n) time.

```elm
findMin : Heap -> Maybe Int
findMin (WrapHeap ts) =
  case List.map root ts of
    []    -> Nothing
    n::ns -> Just (List.foldl min n ns)
```

See Homework 3 for a way to implement `findMin` so that it runs in O(1) time, as for other heap implementations.

## Insertion

Inserting into a binomial heap requires pairwise `link`ing of `Tree`s with equal rank, analogous to the "sum" and "carry" bits of binary addition.

```elm
insert : Int -> Heap -> Heap
insert x (WrapHeap ts) = WrapHeap (insertTree (Node 0 x []) ts)

insertTree : Tree -> InternalHeap -> InternalHeap
insertTree t ts = case ts of
  []      -> [t]
  t'::ts' ->
    if rank t == rank t' then insertTree (link t t') ts'
    else if rank t < rank t' then t :: ts
    else Debug.crash "insertTree: impossible"
```

There are O(m) recursive calls to `insertTree` (where m is the length of `ts`), each of which performs O(1) work (to `link` two trees). Thus, `insert` runs in O(m) = O(log n) time.
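For example, inserting four elements in sequence triggers a cascade of carries, just as adding 1 four times does in binary. This trace assumes the definitions above:

```elm
-- Ranks of the trees in the heap after each insertion:
--   insert 4 empty  ==> ranks [0]      (binary   1)
--   insert 3 ...    ==> ranks [1]      (binary  10)
--   insert 2 ...    ==> ranks [0, 1]   (binary  11)
--   insert 1 ...    ==> ranks [2]      (binary 100)
```

The final insertion links two rank-0 trees, then links the resulting rank-1 tree with the existing one, exactly like a carry rippling through two bit positions.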

## Merging

```elm
merge : Heap -> Heap -> Heap
merge (WrapHeap ts1) (WrapHeap ts2) = WrapHeap (merge_ ts1 ts2)

merge_ : InternalHeap -> InternalHeap -> InternalHeap
merge_ ts1 ts2 = case (ts1, ts2) of
  ([], _) -> ts2
  (_, []) -> ts1
  (t1::ts1', t2::ts2') ->
    if rank t1 < rank t2 then t1 :: merge_ ts1' ts2
    else if rank t2 < rank t1 then t2 :: merge_ ts1 ts2'
    else insertTree (link t1 t2) (merge_ ts1' ts2')
```

To analyze the running time of `merge_`, let m be the total number of trees in both `ts1` and `ts2`. The first two cases run in O(1) time. Each recursive call to `merge_` decreases m by one (in the third and fourth cases) or two (in the fifth case). The cons operations in the third and fourth cases require O(1) work. In the fifth case, `link` requires O(1) time, `insertTree` requires O(m) time, and the recursive call to `merge_` requires T(m-2) time. There are O(m) recursive calls, each of which requires at most O(m) time. The result is an O(m²) running time, which is O(log²(n)) where n is the total number of elements in the two heaps being merged.

A more subtle analysis, however, can be used to argue that the implementation of `merge_` runs in O(log n) time. The argument requires a more careful accounting of how many times `link` is called (which is the crux of both `insertTree` and `merge_`) based on the analogy between merging lists and adding two numbers in binary representation.

For our purposes, we will consider an alternative, one-pass definition (also drawn from this post) that is slightly easier to analyze. (UPDATE 2/9: New `r == r1 == r2` case for `merge_wc`.)

```elm
merge' : InternalHeap -> InternalHeap -> InternalHeap
merge' ts1 ts2 = case (ts1, ts2) of
  ([], _) -> ts2
  (_, []) -> ts1
  (t1::ts1', t2::ts2') ->
    if rank t1 < rank t2 then t1 :: merge' ts1' ts2
    else if rank t2 < rank t1 then t2 :: merge' ts1 ts2'
    else merge_wc (link t1 t2) ts1' ts2'

merge_wc : Tree -> InternalHeap -> InternalHeap -> InternalHeap
merge_wc t ts1 ts2 = case (ts1, ts2) of
  ([], _) -> insertTree t ts2
  (_, []) -> insertTree t ts1
  (t1::ts1', t2::ts2') ->
    let (r,r1,r2) = (rank t, rank t1, rank t2) in
    if      r <  r1 && r <  r2 then t :: merge' ts1 ts2
    else if r <  r1 && r == r2 then merge_wc (link t t2) ts1 ts2'
    else if r == r1 && r <  r2 then merge_wc (link t t1) ts1' ts2
    else if r == r1 && r == r2 then t :: merge_wc (link t1 t2) ts1' ts2'
    else Debug.crash "merge_wc: impossible"
```

Let T(m) and S(m) be the running times of `merge'` and `merge_wc`, respectively, where m is an upper bound on the number of trees in both input lists combined. Consider each of the five cases of `merge'`:

• Cases 1 and 2: T(m) = O(1)
• Cases 3 and 4: T(m) = O(1) + T(m-1)
• Case 5: T(m) = O(1) + S(m-2)

Consider each of the six cases of `merge_wc`:

• Cases 1 and 2: S(m) = O(m)
• Case 3: S(m) = O(1) + T(m)
• Cases 4, 5, and 6: S(m) = O(1) + S(m-1)

There are at most O(m) mutually recursive calls between the two functions. The last call to `merge_wc` may take O(m) time, but all other calls take O(1) time. Thus, the worst-case running time for each of these two functions is O(m)+O(m) = O(m) time. That is, O(log n) time.

## Deletion

```elm
removeMinTree : InternalHeap -> (Tree, InternalHeap)
removeMinTree ts = case ts of
  []     -> Debug.crash "removeMinTree: empty heap"
  [t]    -> (t, [])
  t::ts' ->
    let (t',ts'') = removeMinTree ts' in
    if root t < root t'
      then (t, ts')
      else (t', t::ts'')

deleteMin : Heap -> Maybe Heap
deleteMin (WrapHeap ts) = case ts of
  [] -> Nothing
  _  -> let (Node _ x ts1, ts2) = removeMinTree ts in
        Just (WrapHeap (merge_ (List.reverse ts1) ts2))
```
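`deleteMin` removes the tree with the smallest root and then merges its children back into the remaining trees; the children are stored in decreasing order of rank, so they must be reversed to form a valid binomial heap first. A small trace, assuming the definitions above:

```elm
-- Deleting the minimum from the heap built by inserting 4, 3, 2, 1
-- (a single rank-2 tree with root 1):
--
--   deleteMin (insert 1 (insert 2 (insert 3 (insert 4 empty))))
--
-- removes the rank-2 tree, reverses its children (ranks [1,0] become
-- [0,1]), and merges them with the empty remainder, leaving a heap
-- whose trees have ranks [0, 1].
```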

## Finding (Revisited)

We can reuse the `removeMinTree` helper function to reimplement `findMin`.

```elm
findMin : Heap -> Maybe Int
findMin (WrapHeap ts) =
  case ts of
    [] -> Nothing
    _  -> Just (root (fst (removeMinTree ts)))
```

## Pre- and Post-Conditions

Let's revisit the `merge` function above and think a bit more carefully about why the output heap satisfies the intended invariants.

First, assume we have implemented a couple of predicates that identify valid binomial trees and binomial heaps, respectively:

```elm
binTree : Tree -> Bool
binHeap : InternalHeap -> Bool
```
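One possible way to implement these predicates is sketched below. This is an assumption-laden sketch, not part of the exported interface: `binTree` follows the inductive definition of ranks, and `binHeap` additionally checks heap order at every node and strictly-increasing ranks (the `[0..r-1]` range syntax assumes an older version of Elm).

```elm
-- Sketch of the validity predicates; not optimized.
binTree : Tree -> Bool
binTree (Node r _ ts) =
  -- children must have ranks r-1, r-2, ..., 0, each a valid binomial tree
  List.map rank ts == List.reverse [0 .. r-1]
    && List.all binTree ts

binHeap : InternalHeap -> Bool
binHeap ts =
  List.all binTree ts
    && List.all heapOrdered ts
    && increasingRanks (List.map rank ts)

heapOrdered : Tree -> Bool
heapOrdered (Node _ x ts) =
  List.all (\t -> x <= root t && heapOrdered t) ts

increasingRanks : List Rank -> Bool
increasingRanks rs = case rs of
  r1::r2::rest -> r1 < r2 && increasingRanks (r2::rest)
  _            -> True
```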

Next, we can define the following function to compute the rank of the first binomial tree in a binomial heap, if any:

```elm
rankFst : InternalHeap -> Int
rankFst ts =
  case ts of
    []   -> Random.maxInt   -- pretend this is "infinity"
    t::_ -> rank t
```

Now let's reason about the functions on which `merge` depends, namely, `link` and `insertTree`.

In comments, we'll write some pre-conditions on arguments that we expect all callers to satisfy and some post-conditions on the return value that our implementation intends to satisfy.

```elm
   link :     Tree  ->     Tree  ->      Tree
-- link : (t1:Tree) -> (t2:Tree) -> (ret:Tree)
--
--   @pre: binTree t1 && binTree t2 && rank t1 == rank t2
--  @post: binTree ret && rank ret == 1 + rank t1
```

A first attempt for `insertTree`:

```elm
   insertTree :    Tree ->      InternalHeap ->       InternalHeap
-- insertTree : (t:Tree) -> (ts:InternalHeap) -> (ret:InternalHeap)
--
--   @pre: binTree t && binHeap ts && rank t <= rankFst ts
--  @post: binHeap ret
```

And a first attempt for `merge_`:

```elm
   merge_ : InternalHeap -> InternalHeap -> InternalHeap
-- merge_ : (ts1:InternalHeap) -> (ts2:InternalHeap) -> (ret:InternalHeap)
--
--   @pre: binHeap ts1 && binHeap ts2
--  @post: binHeap ret
```

In the case where `t1` is cons-ed onto the recursive call to `merge_`, the `binHeap ret` predicate follows from (i) the post-condition established by the recursive call, (ii) the conditional check, and (iii) the relationship between `t1` and `ts1'` established by the assumption `binHeap ts1`. The case where `t2` is cons-ed onto the recursive call is similar.

Now consider the case when `insertTree` is called with the return value from the recursive call to `merge_`. The post-condition of `merge_` above is not sufficient to satisfy the pre-conditions required by `insertTree`. So, we need to establish a stronger post-condition:

```elm
   merge_ : InternalHeap -> InternalHeap -> InternalHeap
-- merge_ : (ts1:InternalHeap) -> (ts2:InternalHeap) -> (ret:InternalHeap)
--
--   @pre: binHeap ts1 && binHeap ts2
--  @post: binHeap ret &&
--         rankFst ret >= min (rankFst ts1) (rankFst ts2)    (added)
```

This will satisfy the pre-conditions on `insertTree`, but now the post-condition of `insertTree` is not sufficient to satisfy the new, stronger post-condition of `merge_`. So, we need to strengthen the post-condition of `insertTree`:

```elm
   insertTree :    Tree ->      InternalHeap ->       InternalHeap
-- insertTree : (t:Tree) -> (ts:InternalHeap) -> (ret:InternalHeap)
--
--   @pre: binTree t && binHeap ts && rank t <= rankFst ts
--  @post: binHeap ret &&
--         rankFst ret >= rank t                (added)
```

This is the kind of informal reasoning we need to do on paper to argue why `merge` does indeed produce a valid binomial heap.

To gain more confidence, we could add defensive, dynamic checks at run-time to make sure that we haven't missed something. Or we could use powerful theorem provers to make certain that we've written the code correctly. Both of these are topics for another day.
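As a taste of the defensive approach, and assuming the `binHeap` predicate has actually been implemented, one could wrap `merge` so that a violated post-condition fails loudly during testing. This wrapper is a hypothetical sketch, not part of the exported interface:

```elm
-- Hypothetical defensive wrapper: crash if merge's post-condition fails.
checkedMerge : Heap -> Heap -> Heap
checkedMerge h1 h2 =
  let (WrapHeap ts) = merge h1 h2 in
  if binHeap ts
    then WrapHeap ts
    else Debug.crash "merge: post-condition violated"
```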