final1_sol.txt CMSC 22300, Spring 2012 Solutions for Take Home Final (for graduating students) Due Midnight, Wednesday, May 30. 1. [20] Prove that the Maybe and Reader monads satisfy the algrebraic laws for monads. The proofs proceed by expanding the definitions of the monad operations and symbolic evaluation. The laws are: 1. return x >>= f == f x 2. mv >>= return == mv 3. (mv >>= f) >>= g == mv >>= (\x -> (f x >>= g)) ---------------------------------------------------------------------- An alternate composition operator for monads is: (>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c) f >=> g = \x -> (f x >>= g) In terms of this operator, the monad rules turn out to have the simpler forms: 1. return >=> f == f 2. f >=> return == f 3. (f >=> g) >=> h == f >=> (g >=> h) If you prefer, you can verify these versions of the laws, using definitions of >=> derived from the corresponding definition of >>= for Maybe and Reader. ---------------------------------------------------------------------- Solution ---------------------------------------------------------------------- Maybe monad: The definition of Maybe is: data Maybe a = Nothing | Just a deriving (Eq, Ord, Read, Show) instance Monad Maybe where (Just x) >>= k = k x -- (1) Nothing >>= k = Nothing -- (2) return = Just -- (3) fail s = Nothing -- (4) 1. return x >>= f == (Just x) >>= f (by 3) == f x (by 1) 2. (a) Case mv = Just x (Just x) >>= return == return x (1) == Just x (3) (b) Case mv = Nothing Nothing >>= return == Nothing (2) In either case, mv >>= return == mv. 3. (a) Case mv = Just x ((Just x) >>= f) >>= g == (f x) >>= g (1) == (\y -> (f y >>= g)) x (beta-expansion) == (Just x) >>= (\y -> (f y >>= g)) (1) (b) Case mv = Nothing (Nothing >>= f) >>= g == Nothing >>= g (2) == Nothing (2) == Nothing >>= (\x -> (f x >>= g)) (2) ---------------------------------------------------------------------- Reader monad: The definition of Reader is: type Instream = [Char] data Reader a = R (Instream -> Maybe (a, Instream)) instance Monad Reader where return v = R (\ins -> Just (v, ins)) -- (1) R p >>= f = R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> let R q = f v in q ins') -- (2) fail s = R (\ins -> Nothing) 1. return x >>= f == R (\s -> Just (x, s)) >>= f (1) == R (\ins -> case (\s -> Just (x, s)) ins of Nothing -> Nothing Just (v, ins') -> let R q = f v in q ins') (2) == R (\ins -> case Just (x, ins) of Nothing -> Nothing Just (v, ins') -> let R q = f v in q ins') (beta-reduction) == R (\ins -> let R q = f x in q ins) (symbolic case reduction) == R (\ins -> (let R q = f x in q) ins) (distributing applicaton across let) == R (let R q = f x in q) (eta-contraction) == let R q = f x in R q (distributing R into a let) == f x (a "beta-reduction" variant) 2. (R p) >>= return == R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> let R q = return v in q ins') == R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> let R q = R(\s -> Just(v,s)) in q ins') == R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> let q = (\s -> Just(v,s)) in q ins') == R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> Just(v,ins') == R (\ins -> p ins) == R p 3. ((R p) >>= f) >>= g == R (\ins -> case p ins of Nothing -> Nothing Just (v, ins') -> let R q = f v in q ins') >>= g == R (\ins -> let p1 = \s -> case p s of Nothing -> Nothing Just (v, s') -> let R q = f v in q s' in case p1 ins of Nothing -> Nothing Just (v, ins') -> let R q' = g v in q' ins') == R (\ins -> let x = case p ins of Nothing -> Nothing Just (v, s') -> let R q = f v in q s' in case x of Nothing -> Nothing Just (v, ins') -> let R q' = g v in q' ins') == R (\ins -> case p ins of Nothing -> Nothing Just (v, s') -> case (let R q = f v in q s') of Nothinig -> Nothing Just (v', ins') -> let R q' = g v' in q' ins') == R (\ins -> case p ins of Nothing -> Nothing Just (v, s') -> let R q = f v in case q s' of Nothinig -> Nothing Just (v', ins') -> let R q' = g v' in q' ins') -- this step is a bit nonobvious, but can be justified == R (\ins -> case p ins of Nothing -> Nothing Just (v, s') -> let R q = (f v >> g) in q s') == R (\ins -> case p ins of Nothing -> Nothing Just (v, s') -> let R q = (\x -> (f x >> g)) v in q s') == R p >>= (\x -> (f x >> g)) 2. [15] Option monad (option/Maybe chaining) In SML, define monad operatons (>>= (or "chain") and return) for the option type constructor (the ML analogue of the Maybe monad). Use these to rewrite the following nested explicit case analysis in a streamlined form using option monad operations: fun repl env = case TextIO.inputLine TextIO.stdIn of NONE => NONE | SOME line => (case (T.tokenize(S.fromString line)) of NONE => NONE | SOME (toks,_) => (case (P.stmt toks) of NONE => NONE | SOME(s,_) => let val (res,env') = V.evalStmt env s in printStmt res; repl env' end)) where TextIO.inputLine : TextIO.instream -> string option T.tokenize : token list reader P.stmt : expr parser type 'a reader = char stream -> ('a * char stream) option type 'a parser = token list -> ('a * token list) option [This is a simplified version of the REPL function for the calculator. You do not need to compile and test your solution.] ---------------------------------------------------------------------- (* version of chain for the option tycon *) (* chainO : 'a option * ('a -> 'b option) -> 'b option fun chainO (NONE, f) = NONE | chainO (SOME x, f) = f x The translation of the repl function would then look something like: (* repl : env -> unit option *) fun repl env : unit option = chainO (TextIO.inputLine TextIO.stdIn) (fn line => chainO (T.tokenize(S.fromString line)) (fn (toks,_) => chainO (P.stmt toks) (fn (s, _) => let val (res,env') = V.evalStmt env s in printStmt res; repl env' end))) 3. [30] Regular expressions (over characters) can be represented by the following data type: data RE = Empty -- the empty r.e., matches immediately -- without consuming any characters | C Char -- singleton character: C c matches the -- character c | Seq RE RE -- sequencial composition: Seq A B matches A -- followed by B (normally written AB) | Alt RE RE -- alternative composition: Alt A B matches -- either A or B (normally written A|B) | Star RE -- Kleene star: Star R matches zero or more -- repetitions of R (normally written R*) We can define various utility functions to build regular expression values, for instance, (reChars cs) matches any character in the list cs: reChars :: [Char] -> RE reChars = foldr (\c -> \re -> Alt (C c) re) Empty digits :: RE digits = reChars ['0'..'9'] reString :: String -> RE reString = foldr (\c -> \re -> Seq (C c) re) Empty Write a function recognize :: RE -> Reader String translating an RE value into a reader monad action such that (recognize re) returns a string that is the longest prefix of the input stream that matches the regular expression. Thus if the input stream starts with ['a','b','c','d',...] and the regular expression is (a | ab), the reader should finish with Just("ab",['c',...]). Note that this is rather similar to a tokenizer reader, except that we are looking for the longest match rather than the first match in alternatives. ------------- Use file name "prob3.hs" (or "prob3.sml" if you prefer to do the problem in SML). ---------------------------------------------------------------------- Solution: Any regular expression can be translated into a grammar of the sort that we have dealt with before, with a rule form for each kind of regular expression: S ::= -- Empty S ::= c -- C c S ::= S1 S2 -- Seq S1 S2 S ::= S1 S ::= S2 or S ::= S1 | S2 -- Alt S1 S2 S ::= S1 S -- S1* S ::= These rules (or, more directly, the corresponding RE terms) can be translated into Reader actions as we have seen in the definition of a tokenizer: rdr :: RE -> Reader [Char] rdr Empty = R(\instream -> Just("",instream)) rdr (C c) = checkString [c] rdr (Seq re1 re2) = do s1 <- rdr re1 s2 <- rdr re2 return (s1 ++ s2) rdr (Alt re1 re2) = choice (rdr re1) (rdr re2) rdr (Star re) = do strings <- star (rdr re) return (concat strings) This seems very simple and it does produce a reader action that will produce a string (a prefix of the initial instream) that matches the regular expression, if possible. But if the problem is to produce the longest possible match, things get much more interesting. The issue is that our reader actions are sometimes too lazy, and sometimes to eager to produce the longest possible match. Consider the regular expression ab | abc ( = Alt (Seq (C a) (C b)) (Seq (C a) (Seq (C b) (C c))) ) which translates into something like checkString "ab" +++ checkString "abc" When run on the input "abcd", this reader will succeed, producing Just ("ab", "cd"). But it is clear that the longest input prefix matching the r.e. is "abc". The problem is that the choice operator is accepting the first successful match, even though the second option could match a longer segment. On the other hand, consider the r.e. (a*)(ab) This translates into a reader equivalent to the following pseudo-code: do s1 <- rdr(a*) s2 <- rdr(ab) return (s1 ++ s2) The problem here is that the first reader will read as many a's as possible, leaving none to match "ab". For instance, if the input is aaab, the first reader will read aaa, leaving b, and then the second reader (rdr(ab)) will fail because it doesn't find an a. So how do we manage to find the longest possible match for any r.e.? Essentially, we have to resort to a new "nondeterministic" kind of reader which produces all possible matches for a r.e., and then show how to combine these (chaining, alternating, staring). See the file final1_prob3.hs for a definition of a nondeterministic r.e. matcher based on a nondeterministic version of the reader monad (NReader). 4. [15] Give a fully detailed manual typing of the following definition (in the style of "working-typing-examples.txt" on the Handouts page): fun f (x,p,u) = let fun g (y,z) = if p(y) then z else x::nil in g (3,u) end; ---------------------------------------------------------------------- Solution: [@ f function header] -- new tyvars A, X, P, U env: F f: A; F x: X; F p: P; F u: U Note: f is included in env because in general a "fun" declaration is recursive, so f could potentially occur on the RHS of its definition. We give it an F-style binding because it is notionally a lambda-bound varible in the generator functional for the fixed-point (Y) combinator. We will eventually give it a let-style binding (with a generalized, polymorphic type, when we have completed the typing of its definition. [@ (f (x,p,u))] (x,p,u) : X * P * U -- tuple rule f (x,p,u): V -- new V, the type of the function header A = X*P*U -> V -- application rule env: F f: X*P*U -> V; F x: X; F p: P; F u: U [@ g function header] env: F f: X*P*U -> V; F x: X; F p: P; F u: U; F g: G; F y: Y; F z: Z -- new G, Y, Z as types of g, y, z [@ (g (y,z))] (y,z) : Y * Z -- variable rule, pair rule g (y,z): W -- new W G = Y*Z -> W -- application rule (for pattern) env: F x: X; F p: P; F u: U; F f: X*P*U -> V; F g: Y*Z -> W; F y: Y; F z: Z [@ p(y)] p(y): R -- new R, the unknown type of p(y) P = Y -> R -- application rule [@ z] z : Z -- variable rule, env(z) = Z [@ x::nil] :: : S * S list -> S list -- instantiate polymorphic type of :: (∀α. α * α list → α list) in base environment nil : T list -- instantiate polymorphic type of nil : ∀α.α list from base environment x::nil: X list -- application rule, pairing rule S = X T = X [@ if p(y) then z else x::nil] (1) Equate type of p(y) and bool R = bool (2) Equate types of z and x::nil Z = X list if p(y) then z else x::nil : X list -- if rule env: F f: X*(Y->bool)*U -> V; F x: X; F p: Y -> bool; F u: U; F g: Y*(X list) -> W; F y: Y; F z: (X list) [@ g definition] W = X list -- return type of g equated with type of rhs G : (Y * X list) -> X list env: F x: X; F p: Y -> bool; F u: U; F f: X*(Y -> bool)*U -> V [drop g, y, z bindings] [@ g let-binding] Generalize type of g G : (Y * X list) -> X list [X and Y occur in env, so can't be generalized] env: F f: X*(Y -> bool)*U -> V; F x: X; F p: Y -> bool; F u: U; L g: Y*(X list) -> X list [@ (g (3,u))] (3,u) : int * U -- const rule, var rule (for u), pair rule int * U = Y * (X list) -- equate arg type with domain type of g --> Y = int U = X list (3,u) : int * X list g(3,u) = X list -- application rule env: F f: X*(int -> bool)*(X list) -> V; F x: X; F p: int -> bool; F u: X list; L g: int*(X list) -> (X list) [@ let exp] let ... end : X list -- let rule env: F f: X*(int -> bool)*(X list) -> V; F x: X; F p: int -> bool; F u: X list; [drop g binding] [@ f definition] V = X list -- equate return type of f with type of rhs f: X*(int -> bool)*(X list) -> X list; env: [@ f let-binding] generalize type of f, switch to L binding: env: L f: ∀α. α * (int->bool) * (α list) -> α list ---------------------------------------------------------------------- 5. [30] Lambda Calculus The pure lambda calculus can be represented as follows: type var = string datatype exp = Var of var | Abs of var * exp | App of exp * exp Examples are x (Var "x") λx.x (Abs ("x", Var "x")) x y (App (Var "x") (Var "y")) (λx.λy.x)(λu.u) (App(Abs("x",Abs("y",Var "x")), Abs("u",Var "u"))) A β-redex is an expression of the form (λx.M)N, and it can be reduced by "β-reduction" to [N/x]M (substitute the argument expression N for x in the body expression M): (λx.M)N -->β [N/x]M Care must be taken when defining substitution to avoid free variable capture, as in: [λx.y / u]λy.uy = λy.(λx.y)y (Wrong!) We use α-conversion (change of bound variables) to avoid the capture: [λx.y / u]λy.uy = [λx.y / u]λz.uz (λy.uy == λz.uz) = λz.(λx.y)z An expression is in normal form if it contains no β-redexes (i.e. subexpressions of the form (λx.M)N). Implement the following (in SML): a) capture avoiding substitution: val subst : var * exp * exp -> exp (subst(x,N,M) = [N/x]M) b) test for normal form: val isNormal : exp -> bool And, for 10 points extra credit: c) an evaluator that uses β-reduction repeatedly to reduce an expression to normal form: val eval : exp -> exp This should use "normal order reduction", which means that the left-most, outermost β-redex should be reduced at each step. [Note: The eval function will not always terminate. Can you give an example which does not?] ---------------------------------------------------------------------- Solution: See final1_prob5.sml.