# Heaps

The min-heap abstraction below is defined for any element type that is ordered. For simplicity, however, we will work specifically with heaps of `Int`s, described by the type `Heap`; we will discuss more general, polymorphic `Heap` types below. The following type signatures define our `Heap` interface:

``````empty     : Heap
isEmpty   : Heap -> Bool
findMin   : Heap -> Maybe Int
insert    : Int -> Heap -> Heap
deleteMin : Heap -> Maybe Heap
merge     : Heap -> Heap -> Heap``````

Max-heaps are defined similarly.

## Heaps as Arrays

One way to represent a min-heap is as a complete binary tree that satisfies the min-heap property, which requires that every node be no bigger than its children. Complete binary trees can be implemented using arrays.

Recall the scheme (from Homework 2) for ordering nodes in a complete binary tree using breadth-first search:

``````    Level 1              0

Level 2        1           2

Level 3     3     4     5     6

. .   . .   . .   . .``````

The nodes of a complete binary tree can be stored in breadth-first order in an array `a`...

``````      i:     0   1   2   3   4   5   6   ...
----------------------------------
a[i]:   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | ...
----------------------------------``````

And then navigating the parent and child edges for a node stored in `a[i]` is easy:

• the parent is stored in `a[(i-1)//2]`,
• the left child (if any) is stored in `a[(2*i)+1]`, and
• the right child (if any) is stored in `a[(2*i)+2]`.

In imperative languages, retrieving an arbitrary element in an array takes O(1) time. In functional languages, purely functional arrays are often implemented in terms of balanced search trees, which typically provide O(log n) access to arbitrary elements.

Although we have not yet talked about how to implement balanced search trees, we will take as given an implementation of balanced search trees. In Elm, the `Array` library is implemented with a data structure called Relaxed Radix Trees, which provides near constant-time performance in practice for getting and setting arbitrary indices.

## Internal Representation vs. External Interface

We will make use of the `Array` library (which we will qualify with an `A`) to implement the heap abstraction. If we choose to define and export

``type alias Heap = A.Array Int``

then clients of our module will be able to see that we have implemented `Heap`s using `Array`s and will have access to the `Array` values that we will use to represent `Heap`s. If we wish to hide the representation from clients, we can instead define a new type

``type Heap = WrapHeap (A.Array Int)``

and export only those functions that we want clients to use:

``````module Heaps
(Heap, empty, isEmpty, findMin, insert, deleteMin, merge) where
...``````

Because none of these API functions "reveal" the underlying representation of the `Heap` type to the client (there are no functions of type `Heap -> A.Array Int`), clients will not be able to manipulate `Heap`s directly with `Array` operators.

Creating abstraction boundaries like this facilitate the software engineering process, by preventing clients from (intentionally or accidentally) violating invariants that the correctness of the module implementation depends on, as well as facilitating changes to the implementation by limiting and making explicit the boundaries between modules.

To implement the `Heap` abstraction, we will use the following `Array` operators:

``````A.empty  : Array a
A.length : Array a -> Int
A.push   : a -> Array a -> Array a
A.get    : Int -> Array a -> Maybe a
A.set    : Int -> a -> Array a -> Array a
A.slice  : Int -> Int -> Array a -> Array a``````

The `A.push` function creates a new array that is the same as in the input array except that it contains an additional element at the end. For convenience, we will define an analogous function called `pop` that creates a new array with the last element removed.

``````pop : A.Array a -> A.Array a
pop a = A.slice 0 (A.length a - 1) a ``````

``````empty : Heap
empty = WrapHeap A.empty

isEmpty : Heap -> Bool
isEmpty (WrapHeap a) = A.length a == 0``````

Notice that the definition of `isEmpty` uses pattern matching to, at once, deconstruct the argument value and bind the underlying array value to the variable `a`. This definition is equivalent to all of the following:

``````isEmpty h = case h of WrapHeap a -> A.length a == 0

isEmpty = \h -> case h of WrapHeap a -> A.length a == 0

isEmpty = \h -> let (WrapHeap a) = h in A.length a == 0

isEmpty = \(WrapHeap a) -> A.length a == 0``````

Because values of type `Heap` can only be constructed by the `WrapHeap` data constructor, we can use patterns in bindings (rather than `case` expressions) and be sure that we have covered all patterns.

The `findMin` implementation is also straightforward.

``````findMin : Heap -> Maybe Int
findMin (WrapHeap a) = A.get 0 a``````

## Insertion

Let's now look at the first non-trivial operator, inserting an `Int` into a `Heap`. The idea is to add the element to the next position in the complete binary tree (conveniently represented as the last element in the array) and then "bubble" or "percolate" the element up the tree until it is no longer bigger than its parent.

``````insert : Int -> Heap -> Heap
insert x (WrapHeap a) =
let
n  = A.length a
a' = A.push x a
in
WrapHeap (bubbleUp n a')``````

As we manipulate the underlying `Array` representations, we will make sure to always access elements within bounds. So we define an "unsafe" version of `get`:

``````fromJust : Maybe a -> a
fromJust mx =
case mx of
Just x  -> x
Nothing -> Debug.crash "fromJust"

justGet : Int -> A.Array a -> a
justGet i a = fromJust (A.get i a)``````

We also define a helper function to swap two elements in an `Array`:

``````swap : Int -> Int -> A.Array a -> A.Array a
swap i j a =
a |> A.set i (justGet j a) |> A.set j (justGet i a)``````

The `bubbleUp` function is defined to swap node `i` with its parent node `(i-1)//2` if the parent is larger and, if so, recursively traverses up the tree. We use the type alias `InternalHeap` to refer to `A.Array Int` within our implementation.

``````bubbleUp : Int -> InternalHeap -> InternalHeap
bubbleUp i a =
if i == 0 then a
else
let
j   = (i-1) // 2
ch  = justGet i a
par = justGet j a
in
if par <= ch
then a
else a |> swap i j |> bubbleUp j``````

Let's now consider the worst-case time cost of the insertion algorithm. The `insert` function computes the length of the `Array` representation (which runs in O(1) time, by looking at the documentation and implementation), `push`es an element on the end of it (which runs in worst-case O(log n) time) and calls `bubbleUp`.

The `bubbleUp` function makes use of `justGet` and `swap`. Because `A.get` runs in O(log n) time, so does the wrapper function `justGet`. The `swap` function makes several calls to `A.set` and `justGet`, each of which takes O(log n) time. Thus, `swap` takes O(log n) time. The `bubbleUp` function visits at most O(log n) elements because the index `i` is divided in half before each recursive call. Therefore, there are O(log n) calls to `bubbleUp`, each of which performs O(log n) work. So the running time of `bubbleUp`, and hence `insert`, is O(log2n). In an imperative language, where array operations take worst-case O(1) time, the insertion algorithm runs in worst-case O(log n) time.

## Deletion

To delete the minimum element, which is stored at index `0`, we first overwrite the root with the value currently stored in the last position (conveniently stored in the last element of the `Array`). We pop the last element because its value is now stored in the root and then "bubble" or "percolate" this value down as long as necessary. When recursively bubbling down, we choose the child tree whose root is smaller in order to maintain the heap order property.

``````deleteMin : Heap -> Maybe Heap
deleteMin (WrapHeap a) =
let n = A.length a in
if n == 0 then Nothing
else
let x = justGet (n-1) a in
a |> pop
|> A.set 0 x
|> bubbleDown 0
|> WrapHeap
|> Just``````

For a given index `i`, the index of the left child is `j = 2*i + 1` and of the right child as `k = 2*i + 2`. If the value at index `i` is smaller than both children, the output array is unchanged. Otherwise, the value at index `i` is swapped with the smaller root among the two subtrees and `bubbleDown` recurses on that subtree.

``````bubbleDown : Int -> InternalHeap -> InternalHeap
bubbleDown i a =
let n = A.length a in
if i >= n then a
else
let (j, k) = (2*i + 1, 2*i + 2) in
let i'  = if j < n && justGet j a < justGet i  a then j else i  in
let i'' = if k < n && justGet k a < justGet i' a then k else i' in
if i == i''
then a
else a |> swap i i'' |> bubbleDown i''``````

The analysis of `bubbleDown` and `deleteMin` is similar to the insertion algorithm, resulting in a O(log2n) worst-case time cost.

## Merging

We will not go through the algorithm for merging heaps that are as arrays in this course. But you may be curious to go read about it if you have not seen it before.

In our Elm implementation, we will pretend that we implement the `merge` function faithfully, but instead we will always trigger a run-time error.

``merge _ _ = Debug.crash "merge not implemented"``

The `Heaps.elm` file contains the implementation above.

# Polymorphic Types for Heaps

We defined the `Heap` type to store `Int`s, but the same abstraction exists for any kind of comparable values. And, of course, we don't want to duplicate the implementation for every different type of heap that we may want to work with.

Recall that Elm provides the special polymorphic type variable `comparable` to describe such types. So we can generalize the heap abstraction as follows.

``````type Heap a = WrapHeap (A.Array a)

empty     : Heap comparable
isEmpty   : Heap comparable -> Bool
findMin   : Heap comparable -> Maybe comparable
insert    : comparable -> Heap comparable -> Heap comparable
deleteMin : Heap comparable -> Maybe (Heap comparable)
merge     : Heap comparable -> Heap comparable -> Heap comparable``````

The `PolyHeaps.elm` file contains the modifications required to implement this more general interface. It is worth noting that compared to `Heaps.elm`, type signatures are changed but no value definitions are changed.

We may want to define interfaces in terms of types that come with operators besides those specified in the `comparable` "type class." But Elm does not provide a way for the programmer to define an interface that describes a set of types.

In Haskell, user-defined type classes can be used to specify the heap abstraction as follows.

``````class Ord a where
(<)  :: a -> a -> Bool
(>=) :: a -> a -> Bool
(>)  :: a -> a -> Bool
(<=) :: a -> a -> Bool
max  :: a -> a -> a
min  :: a -> a -> a

class Ord a => Heap a where
empty     : Heap a
isEmpty   : Heap a -> Bool
findMin   : Heap a -> Maybe a
insert    : a -> Heap a -> Heap a
deleteMin : Heap a -> Maybe (Heap a)
merge     : Heap a -> Heap a -> Heap a``````

The second class definition says that a type `a` "is a" `Heap` if (1) `a` "is a" `Ord` and (2) the six functions specified can be defined for that type. Particular implementations are then defined to "implement" the `Heap` "interface".

``````type MyHeap a = WrapHeap (Array a)

instance Heap (MyHeap a) where
insert = ...
...``````

For simplicity, the definition above does not mention the Haskell `Eq` type class.

## ML-style Modules

In Standard ML and OCaml, signatures are used to describe modules, which are structures that contain values and types.

``````signature ORDERED =
sig
type T

val eq  : T -> T -> bool
val lt  : T -> T -> bool
val leq : T -> T -> bool
end

signature HEAP =
sig
structure Elem : ORDERED

type H

val empty     : H
val isEmpty   : H -> bool
val insert    : Elem.T -> H -> H
val merge     : H -> H -> H
val findMin   : H -> Elem.t option
val deleteMin : H -> H option
end``````

An implementation satisfies the `Heap` signature by defining a functor that takes a module as an argument and returns another module. Note that these ML functors are the not same thing as Haskell `Functor`s.

``````functor MyHeap (Element : ORDERED) : HEAP =
struct
structure Elem = Element

datatype H = ...

fun insert = ...
...
end``````