final1_sol.txt
CMSC 22300, Spring 2012
Solutions for Take Home Final (for graduating students)
Due Midnight, Wednesday, May 30.

1. [20] Prove that the Maybe and Reader monads satisfy the algrebraic
laws for monads.  The proofs proceed by expanding the definitions
of the monad operations and symbolic evaluation.  The laws are:

  1. return x >>= f    ==  f x
  2. mv >>= return     ==  mv
  3. (mv >>= f) >>= g  ==  mv >>= (\x -> (f x >>= g))

----------------------------------------------------------------------
  An alternate composition operator for monads is:

    (>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c)
    f >=> g  =  \x -> (f x >>= g)

  In terms of this operator, the monad rules turn out to have the
  simpler forms:

  1. return >=> f       ==    f
  2. f >=> return       ==    f
  3. (f >=> g) >=> h    ==    f >=> (g >=> h)

  If you prefer, you can verify these versions of the laws, using
  definitions of >=> derived from the corresponding definition of
  >>= for Maybe and Reader.
----------------------------------------------------------------------
Solution
----------------------------------------------------------------------

Maybe monad:
  The definition of Maybe is:

data  Maybe a  =  Nothing | Just a
   deriving (Eq, Ord, Read, Show)

instance  Monad Maybe  where
    (Just x) >>= k   =  k x      -- (1)
    Nothing  >>= k   =  Nothing  -- (2)
    return           =  Just     -- (3)
    fail s           =  Nothing  -- (4)

1. return x >>= f
   == (Just x) >>= f (by 3)
   == f x            (by 1)


2. (a) Case mv = Just x

   (Just x) >>= return
   == return x             (1)
   == Just x               (3)

   (b) Case mv = Nothing
   Nothing >>= return
   == Nothing              (2)

   In either case, mv >>= return == mv.


3. (a) Case mv = Just x
   ((Just x) >>= f) >>= g
   == (f x) >>= g                        (1)
   == (\y -> (f y >>= g)) x              (beta-expansion)
   == (Just x) >>= (\y -> (f y >>= g))   (1)

   (b) Case mv = Nothing
   (Nothing >>= f) >>= g
   == Nothing >>= g                      (2)
   == Nothing                            (2)
   == Nothing >>= (\x -> (f x >>= g))    (2)

----------------------------------------------------------------------

Reader monad:
  The definition of Reader is:

type Instream = [Char]

data Reader a = R (Instream -> Maybe (a, Instream))

instance Monad Reader where
  return v = R (\ins -> Just (v, ins))           -- (1)
  R p >>= f = R (\ins ->
                  case p ins of
                    Nothing -> Nothing
                    Just (v, ins') ->
                      let R q = f v in q ins')   -- (2)
  fail s = R (\ins -> Nothing)

1. return x >>= f
   == R (\s -> Just (x, s)) >>= f                (1)
   == R (\ins -> case (\s -> Just (x, s)) ins of
                   Nothing -> Nothing
                   Just (v, ins') ->
                     let R q = f v in q ins')    (2)
   == R (\ins -> case Just (x, ins) of
                   Nothing -> Nothing
                   Just (v, ins') ->
                     let R q = f v in q ins')    (beta-reduction)
   == R (\ins -> let R q = f x in q ins)         (symbolic case reduction)
   == R (\ins -> (let R q = f x in q) ins)       (distributing applicaton across let)
   == R (let R q = f x in q)                     (eta-contraction)
   == let R q = f x in R q                       (distributing R into a let)
   == f x                                        (a "beta-reduction" variant)


2. (R p) >>= return
   == R (\ins ->
           case p ins of
             Nothing -> Nothing
             Just (v, ins') ->
               let R q = return v in q ins')
   == R (\ins ->
           case p ins of
             Nothing -> Nothing
             Just (v, ins') ->
               let R q = R(\s -> Just(v,s)) in q ins')
   == R (\ins ->
           case p ins of
             Nothing -> Nothing
             Just (v, ins') ->
               let q = (\s -> Just(v,s)) in q ins')
   == R (\ins ->
           case p ins of
             Nothing -> Nothing
             Just (v, ins') -> Just(v,ins')
   == R (\ins -> p ins)
   == R p

 
3. ((R p) >>= f) >>= g
   == R (\ins ->
           case p ins of
             Nothing -> Nothing
             Just (v, ins') ->
               let R q = f v in q ins')
      >>= g
   == R (\ins ->
          let p1 = \s ->
	             case p s of
          	        Nothing -> Nothing
			Just (v, s') ->
			  let R q = f v in q s'
           in case p1 ins of
                Nothing -> Nothing
                Just (v, ins') ->
                  let R q' = g v in q' ins')
   == R (\ins ->
          let x = case p ins of
          	    Nothing -> Nothing
		    Just (v, s') ->
		      let R q = f v in q s'
           in case x of
                Nothing -> Nothing
                Just (v, ins') ->
                  let R q' = g v in q' ins')
   == R (\ins ->
          case p ins of
            Nothing -> Nothing
            Just (v, s') ->
	      case (let R q = f v in q s') of
	        Nothinig -> Nothing
                Just (v', ins') ->
                  let R q' = g v' in q' ins')
   == R (\ins ->
          case p ins of
            Nothing -> Nothing
            Just (v, s') ->
	      let R q = f v
               in case q s' of
	            Nothinig -> Nothing
                    Just (v', ins') ->
                      let R q' = g v' in q' ins')
      -- this step is a bit nonobvious, but can be justified
   == R (\ins ->
          case p ins of
            Nothing -> Nothing
            Just (v, s') ->
	      let R q = (f v >> g) 
               in q s')
   == R (\ins ->
          case p ins of
            Nothing -> Nothing
            Just (v, s') ->
	      let R q = (\x -> (f x >> g)) v
               in q s')
   == R p >>= (\x -> (f x >> g))


2. [15] Option monad (option/Maybe chaining)

In SML, define monad operatons (>>= (or "chain") and return) for the
option type constructor (the ML analogue of the Maybe monad). Use
these to rewrite the following nested explicit case analysis in a
streamlined form using option monad operations:

  fun repl env =
      case TextIO.inputLine TextIO.stdIn
	 of NONE => NONE
	  | SOME line =>
	    (case (T.tokenize(S.fromString line))
	       of NONE => NONE 
		 | SOME (toks,_) =>
		   (case (P.stmt toks)
		      of NONE => NONE
		       | SOME(s,_) =>
			  let val (res,env') = V.evalStmt env s
			   in printStmt res;
			      repl env'
			  end))

where

   TextIO.inputLine : TextIO.instream -> string option
   T.tokenize : token list reader
   P.stmt : expr parser

   type 'a reader = char stream -> ('a * char stream) option
   type 'a parser = token list -> ('a * token list) option

[This is a simplified version of the REPL function for the
calculator. You do not need to compile and test your solution.]
----------------------------------------------------------------------
  (* version of chain for the option tycon *)
  (* chainO : 'a option * ('a -> 'b option) -> 'b option
  fun chainO (NONE, f) = NONE
    | chainO (SOME x, f) = f x

The translation of the repl function would then look something like:

  (* repl : env -> unit option *)
  fun repl env : unit option =
      chainO (TextIO.inputLine TextIO.stdIn) (fn line => 
        chainO (T.tokenize(S.fromString line)) (fn (toks,_) =>
          chainO (P.stmt toks) (fn (s, _) =>
	    let val (res,env') = V.evalStmt env s
	     in printStmt res;
	        repl env'
	    end)))


3. [30] Regular expressions (over characters) can be represented by the
following data type:

   data RE = Empty       -- the empty r.e., matches immediately
                         --   without consuming any characters
           | C Char      -- singleton character: C c matches the
                         --   character c
           | Seq RE RE   -- sequencial composition: Seq A B matches A
	                  --   followed by B  (normally written AB)
           | Alt RE RE   -- alternative composition: Alt A B matches 
                         --   either A or B (normally written A|B)
           | Star RE     -- Kleene star: Star R matches zero or more
                         --   repetitions of R (normally written R*)

We can define various utility functions to build regular expression
values, for instance, (reChars cs) matches any character in the list cs:

   reChars :: [Char] -> RE
   reChars = foldr (\c -> \re -> Alt (C c) re) Empty

   digits :: RE
   digits = reChars ['0'..'9']

   reString :: String -> RE
   reString = foldr (\c -> \re -> Seq (C c) re) Empty

Write a function

   recognize :: RE -> Reader String

translating an RE value into a reader monad action such that
(recognize re) returns a string that is the longest prefix of
the input stream that matches the regular expression. Thus if
the input stream starts with ['a','b','c','d',...] and the
regular expression is (a | ab), the reader should finish with
Just("ab",['c',...]).

Note that this is rather similar to a tokenizer reader, except that we
are looking for the longest match rather than the first match in
alternatives.
-------------
Use file name "prob3.hs" (or "prob3.sml" if you prefer to do the
problem in SML).
----------------------------------------------------------------------
Solution:

Any regular expression can be translated into a grammar of
the sort that we have dealt with before, with a rule form for
each kind of regular expression:

   S ::=                 -- Empty

   S ::= c               -- C c

   S ::= S1 S2           -- Seq S1 S2

   S ::= S1
   S ::= S2
or
   S ::= S1 | S2         -- Alt S1 S2

   S ::= S1 S            -- S1*
   S ::= 

These rules (or, more directly, the corresponding RE terms) can be
translated into Reader actions as we have seen in the definition of a
tokenizer:

   rdr :: RE -> Reader [Char]
   rdr Empty  =  R(\instream -> Just("",instream))
   rdr (C c)  =  checkString [c]
   rdr (Seq re1 re2) =
     do s1 <- rdr re1
        s2 <- rdr re2
        return (s1 ++ s2)
   rdr (Alt re1 re2) =
     choice (rdr re1) (rdr re2)
   rdr (Star re) = 
     do strings <- star (rdr re)
        return (concat strings)

This seems very simple and it does produce a reader action that
will produce a string (a prefix of the initial instream) that matches
the regular expression, if possible.  But if the problem is to produce
the longest possible match, things get much more interesting.

The issue is that our reader actions are sometimes too lazy, and
sometimes to eager to produce the longest possible match.

Consider the regular expression

   ab | abc  ( = Alt (Seq (C a) (C b)) (Seq (C a) (Seq (C b) (C c))) )

which translates into something like

   checkString "ab" +++ checkString "abc"

When run on the input "abcd", this reader will succeed, producing 
Just ("ab", "cd").  But it is clear that the longest input prefix
matching the r.e. is "abc". The problem is that the choice operator
is accepting the first successful match, even though the second option
could match a longer segment.

On the other hand, consider the r.e.

    (a*)(ab)

This translates into a reader equivalent to the following pseudo-code:

    do s1 <- rdr(a*)
       s2 <- rdr(ab)
       return (s1 ++ s2)

The problem here is that the first reader will read as many a's as
possible, leaving none to match "ab".  For instance, if the input
is aaab, the first reader will read aaa, leaving b, and then the
second reader (rdr(ab)) will fail because it doesn't find an a.

So how do we manage to find the longest possible match for any r.e.?
Essentially, we have to resort to a new "nondeterministic" kind of
reader which produces all possible matches for a r.e., and then
show how to combine these (chaining, alternating, staring).

See the file final1_prob3.hs for a definition of a nondeterministic
r.e. matcher based on a nondeterministic version of the reader monad
(NReader).


4. [15] Give a fully detailed manual typing of the following
definition (in the style of "working-typing-examples.txt" on
the Handouts page):

   fun f (x,p,u) =
       let fun g (y,z) = if p(y) then z else x::nil
	in g (3,u)
       end;

----------------------------------------------------------------------
Solution:

[@ f function header]  -- new tyvars A, X, P, U
env: F f: A; F x: X; F p: P; F u: U

Note: f is included in env because in general a "fun" declaration
is recursive, so f could potentially occur on the RHS of its
definition. We give it an F-style binding because it is notionally
a lambda-bound varible in the generator functional for the
fixed-point (Y) combinator. We will eventually give it a let-style
binding (with a generalized, polymorphic type, when we have completed
the typing of its definition.

[@ (f (x,p,u))]
(x,p,u) : X * P * U  -- tuple rule
f (x,p,u): V         -- new V, the type of the function header
A = X*P*U -> V       -- application rule
env: F f: X*P*U -> V; F x: X; F p: P; F u: U

[@ g function header] 
env: F f: X*P*U -> V; F x: X; F p: P; F u: U;
     F g: G; F y: Y; F z: Z
     -- new G, Y, Z as types of g, y, z

[@ (g (y,z))]
(y,z) : Y * Z  -- variable rule, pair rule
g (y,z): W     -- new W
G = Y*Z -> W   -- application rule (for pattern)
env: F x: X; F p: P; F u: U; F f: X*P*U -> V; F g: Y*Z -> W; F y: Y; F z: Z

[@ p(y)]
p(y): R        -- new R, the unknown type of p(y)
P = Y -> R     -- application rule

[@ z]
z : Z          -- variable rule, env(z) = Z

[@ x::nil]
:: : S * S list -> S list   -- instantiate polymorphic type of ::
                               (∀α. α * α list → α list) in base environment
nil : T list                -- instantiate polymorphic type of nil : ∀α.α list
                               from base environment
x::nil: X list              -- application rule, pairing rule
  S = X
  T = X

[@ if p(y) then z else x::nil]
(1) Equate type of p(y) and bool
  R = bool
(2) Equate types of z and x::nil
  Z = X list
if p(y) then z else x::nil : X list    -- if rule
env: F f: X*(Y->bool)*U -> V; F x: X; F p: Y -> bool; F u: U;
     F g: Y*(X list) -> W; F y: Y; F z: (X list) 

[@ g definition]
  W = X list       -- return type of g equated with type of rhs
G : (Y * X list) -> X list
env: F x: X; F p: Y -> bool; F u: U; F f: X*(Y -> bool)*U -> V
     [drop g, y, z bindings]

[@ g let-binding]
Generalize type of g
G : (Y * X list) -> X list  [X and Y occur in env, so can't be generalized]
env: F f: X*(Y -> bool)*U -> V; F x: X; F p: Y -> bool; F u: U;
     L g: Y*(X list) -> X list

[@ (g (3,u))]
(3,u) : int * U   -- const rule, var rule (for u), pair rule
  int * U = Y * (X list)   -- equate arg type with domain type of g
  --> Y = int
      U = X list
(3,u) : int * X list
g(3,u) = X list   -- application rule
env: F f: X*(int -> bool)*(X list) -> V; F x: X; F p: int -> bool; F u: X list;
     L g: int*(X list) -> (X list)

[@ let exp]
let ... end : X list   -- let rule
env: F f: X*(int -> bool)*(X list) -> V; F x: X; F p: int -> bool; F u: X list;
  [drop g binding]

[@ f definition]
  V = X list   -- equate return type of f with type of rhs
  f: X*(int -> bool)*(X list) -> X list; 
env:  

[@ f let-binding]
generalize type of f, switch to L binding:
env: L f: ∀α. α  * (int->bool) * (α list) -> α list
----------------------------------------------------------------------


5. [30] Lambda Calculus
The pure lambda calculus can be represented as follows:

type var = string

datatype exp
  = Var of var
  | Abs of var * exp
  | App of exp * exp

Examples are 

    x       (Var "x")
    λx.x    (Abs ("x", Var "x"))
    x y     (App (Var "x") (Var "y"))
    (λx.λy.x)(λu.u)
            (App(Abs("x",Abs("y",Var "x")), Abs("u",Var "u")))

A β-redex is an expression of the form (λx.M)N, and it can be
reduced by "β-reduction" to [N/x]M (substitute the argument
expression N for x in the body expression M):

    (λx.M)N  -->β  [N/x]M

Care must be taken when defining substitution to avoid free
variable capture, as in:

   [λx.y / u]λy.uy  =  λy.(λx.y)y   (Wrong!)

We use α-conversion (change of bound variables) to avoid the
capture:

   [λx.y / u]λy.uy  =  [λx.y / u]λz.uz    (λy.uy == λz.uz)
                    =  λz.(λx.y)z

An expression is in normal form if it contains no β-redexes
(i.e. subexpressions of the form (λx.M)N).

Implement the following (in SML):

a) capture avoiding substitution:

    val subst : var * exp * exp -> exp  (subst(x,N,M) = [N/x]M)

b) test for normal form:

    val isNormal : exp -> bool

And, for 10 points extra credit:

c) an evaluator that uses β-reduction repeatedly to reduce
an expression to normal form:

    val eval : exp -> exp

This should use "normal order reduction", which means that the
left-most, outermost β-redex should be reduced at each step.
[Note: The eval function will not always terminate.  Can you
give an example which does not?]

----------------------------------------------------------------------
Solution:

See final1_prob5.sml.