# Trees (Redux)

Let's revisit the trees as defined before.

``type Tree a = Empty | Node a (Tree a) (Tree a)``

In Homework 2, height was implicitly defined as follows:

``````height t =
case t of
Empty        -> 0
Node _ t1 t2 -> 1 + max (height t1) (height t2)``````

For example:

``````height Empty = 0
height (Node 0 Empty Empty) = 1
height (Node 0 (Node 1 Empty Empty) Empty) = 2``````

With this terminology, the height of a `Tree` `t` is the number of `Node`s along the longest path from the root (i.e. `t`) to an `Empty` tree. Equivalently, the height is the number of edges along the longest path from the root to an `Empty` tree.

Instead, we could have defined height to be the number of edges along the longest path from the root to a "leaf" node (in our case, a `Node` with two `Empty` children). Perhaps this definition is more common and intuitive. In a functional language, however, both definitions are, arguably, reasonable since empty nodes are represented explicitly (i.e. `Empty`) instead of implicitly as a `null` pointer to a tree. (In a language like Java, a value of a reference type `T` is either `null` or a reference to a value of type `T` — akin to `Maybe (Ref T)`).

In any case, let's stick with the definition we used before. Furthermore, let's refer to the depth of a `Tree` `t` (with respect to a root `Tree`) as the number of edges from the root to `t`. With this terminology, depth in a `Tree` of height `h` ranges from `0` to `h`.

`MoreTrees.elm` contains the code below.

## Full and Almost Complete Trees

Let's write a function that determines whether a `Tree` is full.

``isFull : Tree a -> Bool``

There are many ways to implement this. Let's start with the following version:

``````isFull t =
let h = height t in
let foo depth tree =
if depth == h then True
else {- depth < h -}
case tree of
Empty -> False
Node _ t1 t2 ->
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
in
h == 0 || foo 0 t``````

Here, we first compute the height `h` and then use a helper function `foo` to recursively walk the tree, keeping track of the `depth` of the `tree` under consideration. `Empty` is allowed only at the bottom-most level; the `then` branch returns `True` without discriminating `tree`, because it must be `Empty` based on how `height` has computed `h`. At all other levels (handled by the `else` branch), the `tree` is required to be a `Node` and, if so, its children are recursively traversed.

Now let's write a function that determines whether a `Tree` is almost complete (according to the definition in Homework 2). Again, there are many ways to implement `isAlmostComplete`, but here's an approach that is almost identical to the one above:

``````isAlmostComplete : Tree a -> Bool
isAlmostComplete t =
let h = height t in
let foo depth tree =
if depth == h-1 then True       -- only difference is guard
else {- depth < h-1 -}
case tree of
Empty -> False
Node _ t1 t2 ->
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
in
h == 0 || foo 0 t``````

The approach is analogous to `isFull`, except that `Empty` is allowed only at depths `h-1` and `h`. Notice that the `then` branch returns `True` without discriminating `tree` and traversing its children if it is a `Node`. Why does this work? Consider three cases for the entire row of `Tree`s at level `h-1`: (i) all `Empty`s, (ii) at least one `Empty`, and (iii) all `Node`s. The first case is impossible, since otherwise the height of the overall tree `t` would be `h-1`, not `h`. The latter two cases do not need to be distinguished by discriminating `tree`, hence, the `then` branch simply returns `True`.

### Refactoring Common Code

These versions of `isFull` and `isAlmostComplete` share a lot in common. So it is tempting to factor out all the common code and just add a way to choose between checking `depth == h` and `depth == h-1` in the conditional. (Go ahead and do this for a bit of practice.)

There's also another, more roundabout way of factoring the differences. We could define the `then` branch in `isFull` to have the same guard as in `isAlmostComplete` (in particular, `depth == h-1`), and then check that all `Tree`s at this level be non-`Empty`:

``````isFull t =
...
if depth == h-1 then
case tree of
Node _ _ _ -> True
Empty      -> False
...``````

Then, we can factor out the differences into a single predicate function that gets applied only to `Tree`s at level `h-1`:

``````processTree : (Tree a -> Bool) -> Tree a -> Bool
processTree p t =
let h = height t in
let foo depth tree =
if depth == h-1 then p tree
else
case tree of
Empty -> False
Node _ t1 t2 ->
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
in
h == 0 || foo 0 t

isFull =
let p t =
case t of
Node _ _ _ -> True
Empty      -> False in
processTree p

isAlmostComplete =
let p t = True in
processTree p``````

And with a bit of cleaning up:

``````isFull = processTree ((/=) Empty)

isAlmostComplete = processTree (always True)``````

Okay, so is that way of factoring `isFull` and `isComplete` better than the first suggestion above. Probably not. Oh well...

# More Functional Programming

Let's turn our attention back to the first, intuitive version of `isFull` above and improve it in two ways. First, we will eliminate the second recursive call when the first returns `False`:

``````...
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
...``````

Second, we will avoid the need to call `height` first before making a "second pass" over the `Tree` to check for fullness:

``````...
let h = height t in
...``````

## Short-Circuiting

We might attempt the first improvement as follows:

``(foo (depth+1) t1) && (foo (depth+1) t2)``

Elm is an eager (a.k.a. strict) language, however, so all arguments to a function are fully evaluated before evaluating the function body. So, won't both recursive calls be evaluated before the `(&&)` function is called?

To answer this question, we could look for guidance in the language documentation and then, to answer any lingering questions, we could either dive into the language implementation or we could devise some hypotheses and make queries to test them. Let's do the latter.

### Digression: Investigative Journalism

To observe when an expression really does get evaluated, let's define the following function:

``````> false () = let _ = Debug.log "thunk evaluated " () in False
<function> : () -> Bool``````

Evaluating the `false` function will print a message and return the `Bool`ean `False`:

``````> false ()
thunk evaluated : ()
False
: Bool``````

We have "delayed" the evaluation of the expression by wrapping it in a function that takes the dummy argument `()`. Functions like are called "thunks" and turn out to be extremely useful for programming in (and compiling) functional languages. We will put thunks to good use later in the course.

Now let's write some tests to exercise the behavior of `(&&)`. As a first sanity check, let's make sure that evaluate `false` twice forces it to evaluate twice (akin to our two, in-sequence recursive calls to `isFull` above):

``````> let b1 = false () in \
| let b2 = false () in \
| b1 && b2
thunk evaluated : ()
thunk evaluated : ()
False
: Bool``````

That's not surprising. We might expect the same effects on...

``````> false () && false ()
thunk evaluated : ()
False
: Bool``````

... but this time `false` is forced to evaluated only once! What's going on? Elm is not lazy, but if we wanted to run a sanity test anyway just to make sure:

``````> let _ = false () in 0
thunk evaluated : ()
0
: number``````

``````> and = (&&)
<function> : Bool -> Bool -> Bool

> false () `and` false ()
thunk evaluated : ()
thunk evaluated : ()
False
: Bool``````

Interesting! When `(&&)` masquerades as `and`, both arguments are evaluated.

What's going on is that when `(&&)` is called syntactically, the expression `e1 && e2` gets parsed and translated (essentially) to the expression `if e1 then e2 else False` which does capture the essence of short-circuiting. But when `(&&)` is "hidden" inside another expression, it is no longer parsed and treated specially; instead, it is an ordinary function call, where the arguments are fully evaluated before evaluating the function itself. (Note: `(||)` is treated similarly).

This is not a surprising compromise actually. The short-circuiting semantics is often what one wants when using `(&&)`. But to provide the same when it is computed from other expressions would require solving a very hard problem known as higher-order control flow analysis, which aims to determine (rather, approximate) the set of function definitions that may "flow into" a particular call-site expression. Such analysis is an important part of compiler optimizations and other static analysis but would be overkill for this particular use. Instead, if one wants the short-circuiting semantics, it seems reasonable to have to write `(&&)` or `if` explicitly.

Let's write one more test to continue to develop our understanding:

``````> (&&) (false ()) (false ())
thunk evaluated : ()
thunk evaluated : ()
False
: Bool``````

So it turns out that using `(&&)` as a prefix function (instead of an infix operator) does not result short-circuiting. I would have expected this to be treated the same as the infix case, but it's no big deal in any case.

For what it's worth, OCaml treats prefix calls to `(&&)` the same way it treats `(||)`. If you're curious and don't have `ocaml` installed, you can try the following out at TryOCaml and watch the Web Console for the logging messages.

``````# let returnFalse () = print "thunk evaluated\n"; false;;
val returnFalse : unit -> bool = <fun>

# returnFalse () ;;
- : bool = false

# (returnFalse ()) && (returnFalse ()) ;;
- : bool = false

# (&&) (returnFalse ()) (returnFalse ()) ;;
- : bool = false``````

### </Digression>

Okay, so we can short-circuit the second recursive call by simplying using `&&` after all:

``(foo (depth+1) t1) && (foo (depth+1) t2)``

## Single-Pass

We should be able to determine whether a `Tree` is full without having to first compute its height. What we need to do is check that the depth of each `Empty` subtree is the same. This is a good exercise in functional programming: we need to maintain some information across multiple recursive calls, but of course we do not have access to any mutable variables to help.

There are three "states" we need to track as we traverse a tree:

1. Whether we have already deemed the tree not to be full;
2. Whether we have not yet reached any `Empty` subtree; and
3. Whether all `Empty` subtrees seen so far have the same depth;

If we are in the third state after processing the entire tree, then the tree is full.

We can use the type

``Maybe (Maybe Int)``

to represent these three states, respectively, as `Nothing`, `Just Nothing`, and `Just maxDepth`. (Besides not having to define a custom datatype with three data constructors just for the purpose of this function, another benefit of this encoding is that we sometimes get to type "`Just Nothing`".)

We will define a function called `traverse` that will do the heavy lifting, transforming the input state `acc` into an output state based on the given `tree` at depth `depth`:

``````traverse : Maybe (Maybe Int) -> Int -> Tree a -> Maybe (Maybe Int)
traverse acc depth tree =
...``````

Once finished, we will define `isFull` by calling `traverse` and checking the final state:

``````isFullOnePass : Tree a -> Bool
isFullOnePass t =
case traverse (Just Nothing) 0 t of
Just (Just _) -> True
_             -> False``````

Okay, let's start implementing the different cases for `traverse`:

``````traverse acc depth tree =
case (acc, tree) of``````

Let's first handle the case where the input `acc` is already the "error" state, `Nothing`, which we just propagate.

``    (Nothing, _) -> Nothing``

Next are two cases for reaching an `Empty` tree. If have not reached a leaf `Node` yet (`acc` = `Nothing`), then we record the current `depth` that all paths from the root to `Empty` need to have. If we have seen an `Empty` subtree already, then we compare the `maxDepth` of previous paths to the current `depth`.

``````    (Just Nothing, Empty) -> Just (Just depth)
(Just (Just maxDepth), Empty) ->
if depth == maxDepth then acc else Nothing``````

Lastly is the case for `Node`, where we recursively process the children and, if they both are full subtrees, check that their depths `n` and `m` are equal.

``````    (acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
case recurse t1 of
Nothing -> Nothing
Just Nothing -> Nothing
Just (Just n) ->
case recurse t2 of
Nothing -> Nothing
Just Nothing -> Nothing
Just (Just m) -> if n == m then Just (Just n) else Nothing``````

Excellent, we have traversed the `Tree` in a single pass and checked for fullness.

### And Then

In the recursive case above, the "error plumbing" code is untidy. One attempted remedy is the following:

``````      case (recurse t1, recurse t2) of
(Just (Just n), Just (Just m)) -> if n == m
then Just (Just n)
else Nothing
_                              -> Nothing``````

Perhaps this version is more readable, but it no longer shortcuts the second recursive call when its result is not needed.

Instead, we can factor out a general pattern of "chaining" together operations on `Maybe` values:

``````maybeAndThen : Maybe a -> (a -> Maybe b) -> Maybe b
maybeAndThen mx f =
case mx of
Nothing -> Nothing
Just x  -> f x``````

This would help get rid of one level of the plumbing code above. To help with the second, we can essentially repeat the process:

``````maybeMaybeAndThen : Maybe (Maybe a) -> (a -> Maybe (Maybe b)) -> Maybe (Maybe b)
maybeMaybeAndThen mmx f =
case mmx of
Nothing -> Nothing
Just mx -> mx `maybeAndThen` f``````

All this abstraction then allows us to chain multiple `Maybe` computations together using what looks like straight-line code:

``````    (acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
recurse t1 `maybeMaybeAndThen` \n ->
recurse t2 `maybeMaybeAndThen` \m ->
if n == m then Just (Just n) else Nothing``````

A couple more helper functions...

``````maybeMaybeGuard : Bool -> Maybe (Maybe ())
maybeMaybeGuard b =
if b
then Just (Just ())
else Nothing

maybeMaybeReturn : a -> Maybe (Maybe a)
maybeMaybeReturn x = Just (Just x)``````

... allow us to refactor once more in style:

``````    (acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
recurse t1                `maybeMaybeAndThen`  \n ->
recurse t2                `maybeMaybeAndThen`  \m ->
maybeMaybeGuard (n == m)  `maybeMaybeAndThen`  \_ ->
maybeMaybeReturn n``````

Much better (especially if we chose some shorter names)!

This "and then" pattern for chaining together expressions is rather common and comes with some library support:

``````Maybe.andThen        : Maybe a    -> (a -> Maybe b)    -> Maybe b
Result.andThen       : Result x a -> (a -> Result x b) -> Result x b