CS221/321
Lecture 11, Nov 22, 2011

Section 6. Mutable Storage

We could consider mutable storage and assignments using three
different approaches:

(1) Move to a new elementary language: a "simple imperative
language". [You can read about this approach, including the addition of
control structures, variable bindings, and procedures, in Huttel,
"Transitions and Trees: An Introduction to Operational Semantics",
Chapters 3 through 7. A feature of this approach is the separation
of syntax constructions into expressions and "statements" or "commands".]

(2) Add mutable store and assignments using ref values (references)
as found in Standard ML. This is obviously the most direct way of
dealing with stores starting with the language PTFun that we already
have.

(3) Model stores and assignments using monads, as is done in "pure"
functional languages like Haskell.

I'll start with approach (2) and then talk about (3).

----------------------------------------------------------------------

References:

First let's treat Ref as a basic syntactic form, as we treated Fst,
Inl, etc. before introducing polymorphism. We will call this extension
TFun[*,+,Ref].

Syntactically, we add three new expression forms, corresponding to the
concrete syntax:

   ref(e)       -- creation of an initialized ref cell
   !e           -- dereferencing, returning the contents of a ref cell
   e1 := e2     -- assignment, updating the contents of a ref cell
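
As a small illustration, here are the three forms in Standard ML
concrete syntax. The names c, old, and new are made up for this
example.

```sml
(* ref(e): allocate a cell initialized to 3; c : int ref. *)
val c = ref 3

(* !e: read the current contents of the cell. *)
val old = !c

(* e1 := e2: overwrite the contents; the result is the unit value (). *)
val () = c := old + 1

(* Reading again observes the update. *)
val new = !c
```

Here old is bound to 3 and new to 4, although both are computed by
the same expression !c.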

For small-step semantics, we need to also add a new syntactic category
of "locations" (or memory addresses). 

   l ∈ Locations

These location expressions (actually constants) will not occur in
original programs, but will be introduced by reductions in the
small-step semantics.  Locations designate places in the mutable
memory where values can be stored.

A memory M will be a finite map from locations to values:

  M: Locations → Values

(Memories are often called "states", or sometimes "stores".) Evaluation
of an expression will be done with respect to a current state, and the
evaluation of certain expressions, namely those that create or assign
to ref cells, may modify the current state.

The notation M[l=v] denotes a modified memory M' such that

    M'(l) = v
    M'(l') = M(l')  if l ≠ l'

The location l may be in the domain of M, in which case M[l=v] is 
a modified memory with the same domain as M, or l may be a "new"
location, l ∉ dom(M), in which case M[l=v] extends M to a larger
domain {l} ⋃ dom(M).
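
One way to make this concrete is the following Standard ML sketch. It
assumes that locations are integers and that a memory is an association
list with at most one entry per location. The names lookup, update, and
fresh are illustrative and are not part of the formal development.

```sml
(* lookup m l = SOME v if l is in dom(m) with m(l) = v, NONE otherwise. *)
fun lookup [] _ = NONE
  | lookup ((l', v) :: rest) l = if l = l' then SOME v else lookup rest l

(* update m l v plays the role of M[l=v]:
   overwrite if l is in dom(m), extend the domain otherwise. *)
fun update [] l v = [(l, v)]
  | update ((l', v') :: rest) l v =
      if l = l' then (l, v) :: rest else (l', v') :: update rest l v

(* fresh m returns a location outside dom(m):
   one more than the largest location in use, or 0 for the empty memory. *)
fun fresh m = 1 + foldl (fn ((l, _), n) => Int.max (l, n)) ~1 m

val m0 = update [] (fresh []) 10     (* new location 0 *)
val m1 = update m0 (fresh m0) 20     (* new location 1 *)
val m2 = update m1 0 99              (* existing location 0 overwritten *)
```

Note that update builds a new list and leaves its argument unchanged,
so m1 still maps location 0 to 10 after m2 is constructed.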


Example: achieving recursion by updating memory

    f = ref(λx:Int. x)       (f : Ref(Int→Int))

    fact = λx:Int. if x = 0 then 1 else x * !f(x - 1)  (fact: Int→Int)

    fact(4) ==> 4 * (λx.x)(3) ==> 12

    f := fact

    fact(4) ==> 24
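
The same example can be run in Standard ML. This is a direct
transcription, with names before12 and after24 added here to record
the two results. Note that fact is bound with val and fn, so it is not
a recursive definition; the recursion arises only through the cell f.

```sml
(* f initially holds the identity function on int. *)
val f : (int -> int) ref = ref (fn x => x)

(* fact calls whatever function is currently stored in f. *)
val fact = fn x => if x = 0 then 1 else x * (!f) (x - 1)

(* f still holds the identity, so fact 4 = 4 * 3. *)
val before12 = fact 4

(* Tie the knot: store fact itself in the cell. *)
val () = f := fact

(* Now the call through f is a recursive call. *)
val after24 = fact 4
```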


----------------------------------------------------------------------
Fig 6.1: Abstract Syntax of TFun[*,+,Ref]
----------------------------------------------------------------------

   l  ::=  _ (constants denoting members of Locations)

   τ  ::=  ... |  Ref(τ)

   e  ::=  ... |  Ref(e)  |  DRef(e) |  Set(e1,e2)  |  l

   v  ::=  ... |  l

----------------------------------------------------------------------


Typing expressions that include locations requires that we be able to
type locations. Locations resemble free variables (or primitive
constants), and so we will introduce a new kind of typing environment
for locations:

   Λ : Locations → Types

In any memory M compatible with Λ, a given location l can only
contain values of type Λ(l):

   ⊦ M(l) : Λ(l)

Typing judgements must be modified to include a memory typing Λ
as well as a (free) variable typing Γ:

   Λ; Γ ⊦ e : τ 


----------------------------------------------------------------------
Fig 6.2: Typing Rules for TFun[*,+,Ref]
----------------------------------------------------------------------

	    Λ; Γ ⊦ e : τ
RT(1) ------------------------
       Λ; Γ ⊦ Ref(e) : Ref(τ)

	Λ; Γ ⊦ e : Ref(τ)
RT(2) --------------------
       Λ; Γ ⊦ DRef(e) : τ

	Λ; Γ ⊦ e1 : Ref(τ)    Λ; Γ ⊦ e2 : τ 
RT(3) -------------------------------------
           Λ; Γ ⊦ Set(e1,e2) : Unit

           Λ(l) = τ 
RT(4) -------------------
       Λ; Γ ⊦ l : Ref(τ)


Plus modified versions of previous typing rules with Λ added to the
contexts. For instance:

          Γ(x) = τ 
RT(5)  --------------
        Λ; Γ ⊦ x : τ 


              Λ; Γ[x:τ1] ⊦ e : τ2
RT(6)  --------------------------------
         Λ; Γ ⊦ Fun(x,τ1,e) : τ1 → τ2


         Λ; Γ ⊦ e1 : τ1 → τ    Λ; Γ ⊦ e2 : τ1
RT(7)  ----------------------------------------
               Λ; Γ ⊦ App(e1,e2) : τ


         Λ; Γ ⊦ e1 : Bool    Λ; Γ ⊦ e2 : τ    Λ; Γ ⊦ e3 : τ
RT(8)  -----------------------------------------------------
                     Λ; Γ ⊦ If(e1,e2,e3) : τ


----------------------------------------------------------------------
Notes:

In a derivation under these rules, the Λ context remains fixed for all
rules throughout the derivation. Another way of putting this is that
the scope of Λ is the whole expression. All location constants
appearing in the expression must be in the domain of Λ.
----------------------------------------------------------------------
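
The typing rules are syntax directed, so they can be read as a checking
algorithm. The following Standard ML sketch covers only the fragment
with Int, Unit, functions, and references. The constructor names (TInt,
TRef, Loc, ...) and the functions find and typeOf are illustrative
choices, with lam standing for the location typing and gam for the
variable typing.

```sml
datatype ty = TInt | TUnit | TArrow of ty * ty | TRef of ty

datatype exp
  = Num of int
  | Var of string
  | Fun of string * ty * exp
  | App of exp * exp
  | Ref of exp
  | DRef of exp
  | Set of exp * exp
  | Loc of int

(* Environments are association lists; the most recent binding wins. *)
fun find [] _ = NONE
  | find ((k, t) :: rest) k' = if k = k' then SOME t else find rest k'

(* typeOf lam gam e = SOME t when e has type t under lam and gam,
   and NONE when no rule applies. lam is passed along unchanged. *)
fun typeOf lam gam e =
  case e of
    Num _ => SOME TInt
  | Var x => find gam x                                    (* RT(5) *)
  | Loc l => Option.map TRef (find lam l)                  (* RT(4) *)
  | Fun (x, t1, body) =>                                   (* RT(6) *)
      Option.map (fn t2 => TArrow (t1, t2))
                 (typeOf lam ((x, t1) :: gam) body)
  | App (e1, e2) =>                                        (* RT(7) *)
      (case (typeOf lam gam e1, typeOf lam gam e2) of
         (SOME (TArrow (t1, t)), SOME t1') =>
           if t1 = t1' then SOME t else NONE
       | _ => NONE)
  | Ref e1 => Option.map TRef (typeOf lam gam e1)          (* RT(1) *)
  | DRef e1 =>                                             (* RT(2) *)
      (case typeOf lam gam e1 of SOME (TRef t) => SOME t | _ => NONE)
  | Set (e1, e2) =>                                        (* RT(3) *)
      (case (typeOf lam gam e1, typeOf lam gam e2) of
         (SOME (TRef t), SOME t') =>
           if t = t' then SOME TUnit else NONE
       | _ => NONE)

(* A location typing with one Int cell at location 0. *)
val lam0 = [(0, TInt)]

val t1 = typeOf lam0 [] (Set (Loc 0, Num 4))
val t2 = typeOf lam0 [] (DRef (Ref (Fun ("x", TInt, Var "x"))))
val t3 = typeOf lam0 [] (Set (Loc 0, Loc 0))   (* cell holds Int, not Ref(Int) *)
val t4 = typeOf [] [] (Loc 0)                  (* location not in the typing *)
```

The last example shows why every location constant in an expression
must be in the domain of the location typing: otherwise RT(4) has no
instance and the expression is untypable.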


Evaluation:
-----------

A small-step dynamic semantics must use a transition relation that
involves memories as well as expressions:

   (M, e) ↦ (M', e')

Transitions will always modify the expression (e' ≠ e), and sometimes
the memory will also be modified.


----------------------------------------------------------------------
Fig 6.3: TFun[*,+,Ref][SSv] - Small-step CBV evaluation
----------------------------------------------------------------------

These are the new rules involving the new operators
Ref, DRef, and Set.

Search rules:

             (M,e) ↦ (M',e')
RE(1)  ------------------------------
        (M, Ref(e)) ↦ (M', Ref(e'))


               (M,e) ↦ (M',e')
RE(2)  --------------------------------
        (M, DRef(e)) ↦ (M', DRef(e'))


              (M,e1) ↦ (M',e1')
RE(3)  -------------------------------------
        (M, Set(e1,e2)) ↦ (M', Set(e1',e2))


               (M,e2) ↦ (M',e2')
RE(4)  -------------------------------------
        (M, Set(v1,e2)) ↦ (M', Set(v1,e2'))


Redex rules:

             (l = fresh(M))
RE(5)  --------------------------
        (M,Ref(v)) ↦ (M[l=v],l)


              (l ∈ dom(M))
RE(6)  -------------------------
        (M,DRef(l)) ↦ (M,M(l))


              (l ∈ dom(M))
RE(7)  ----------------------------
        (M,Set(l,v)) ↦ (M[l=v],())


We also inherit modified versions of the standard transition
rules for TFun, such as these rules for App:

         (v1 = Fun(x,τ,e); v2 ∈ Value)
RE(8)  ---------------------------------
        (M, App(v1,v2)) ↦ (M, [v2/x]e)


                 (M,e1) ↦ (M',e1')
RE(9)  -------------------------------------
         (M,App(e1,e2)) ↦ (M',App(e1',e2))


                  (M,e2) ↦ (M',e2')
RE(10)  -------------------------------------  (v1 ∈ Value)
          (M,App(v1,e2)) ↦ (M',App(v1,e2'))


----------------------------------------------------------------------
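
The transition rules can be prototyped as a step function on
configurations. The Standard ML sketch below is an assumption-laden
illustration, not the course's implementation: it drops type
annotations on Fun, covers only numbers, unit, functions, and
references, represents memories as association lists with integer
locations, and assumes substituted values are closed. All names are
invented for the example.

```sml
datatype exp
  = Num of int | Unit | Var of string
  | Fun of string * exp | App of exp * exp
  | Ref of exp | DRef of exp | Set of exp * exp
  | Loc of int

fun isValue (Num _) = true
  | isValue Unit = true
  | isValue (Fun _) = true
  | isValue (Loc _) = true
  | isValue _ = false

(* Memories: lookup, M[l=v], and choice of a fresh location. *)
fun lookup [] _ = NONE
  | lookup ((l', v) :: rest) l = if l = l' then SOME v else lookup rest l
fun update [] l v = [(l, v)]
  | update ((l', v') :: rest) l v =
      if l = l' then (l, v) :: rest else (l', v') :: update rest l v
fun fresh m = 1 + foldl (fn ((l, _), n) => Int.max (l, n)) ~1 m

(* subst v x e = [v/x]e, for closed v. *)
fun subst v x e =
  case e of
    Var y => if x = y then v else e
  | Fun (y, b) => if x = y then e else Fun (y, subst v x b)
  | App (a, b) => App (subst v x a, subst v x b)
  | Ref a => Ref (subst v x a)
  | DRef a => DRef (subst v x a)
  | Set (a, b) => Set (subst v x a, subst v x b)
  | _ => e

(* step (m, e) = SOME (m', e') if a rule applies, NONE otherwise. *)
fun step (m, e) =
  case e of
    Ref a =>
      if isValue a
      then let val l = fresh m in SOME (update m l a, Loc l) end     (* RE(5) *)
      else Option.map (fn (m', a') => (m', Ref a')) (step (m, a))    (* RE(1) *)
  | DRef (Loc l) => Option.map (fn v => (m, v)) (lookup m l)         (* RE(6) *)
  | DRef a => Option.map (fn (m', a') => (m', DRef a')) (step (m, a)) (* RE(2) *)
  | Set (a, b) =>
      if not (isValue a)
      then Option.map (fn (m', a') => (m', Set (a', b))) (step (m, a)) (* RE(3) *)
      else if not (isValue b)
      then Option.map (fn (m', b') => (m', Set (a, b'))) (step (m, b)) (* RE(4) *)
      else (case a of
              Loc l => (case lookup m l of
                          SOME _ => SOME (update m l b, Unit)        (* RE(7) *)
                        | NONE => NONE)
            | _ => NONE)
  | App (a, b) =>
      if not (isValue a)
      then Option.map (fn (m', a') => (m', App (a', b))) (step (m, a)) (* RE(9) *)
      else if not (isValue b)
      then Option.map (fn (m', b') => (m', App (a, b'))) (step (m, b)) (* RE(10) *)
      else (case a of
              Fun (x, body) => SOME (m, subst b x body)              (* RE(8) *)
            | _ => NONE)
  | _ => NONE

(* Iterate step until no rule applies. *)
fun eval (m, e) = case step (m, e) of SOME c => eval c | NONE => (m, e)

(* (fn r => (fn u => !r) (r := 5)) (ref 3) *)
val prog =
  App (Fun ("r", App (Fun ("u", DRef (Var "r")), Set (Var "r", Num 5))),
       Ref (Num 3))
val (mFinal, vFinal) = eval ([], prog)
```

Starting from the empty memory, prog allocates location 0 with contents
3, overwrites it with 5, and then dereferences it, so the final
configuration has value Num 5 and a memory mapping location 0 to Num 5.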


Type Soundness
--------------

We need to define a relation between memories M and location typings
Λ, that expresses the property that M "conforms to" Λ.

Defn 6.1: ⊦ M : Λ  iff 
  (1) dom(M) = dom(Λ)
  (2) ∀l ∈ dom(Λ). Λ;∅ ⊦ M(l): Λ(l)

That is, ⊦ M:Λ if they have the same set of locations as their
domains, and at each location, the value stored in M has the type
specified by Λ.

Defn 6.2: Λ ⊦ (M,e) : τ  iff  ⊦ M:Λ  &  Λ;∅ ⊦ e: τ 


Theorem 6.1 [Preservation]:
   Λ ⊦ (M,e): τ  ∧  (M,e) ↦ (M',e')  =>
     ∃Λ'. Λ ⊆ Λ' ∧  Λ' ⊦ (M',e'): τ.


Theorem 6.2 [Progress]: 
  Λ ⊦ (M,e) : τ  => e a value or ∃M',e'. dom(M) ⊆ dom(M') & (M,e) ↦ (M',e').


Note that in both of these statements, it is assumed that e is
closed w.r.t. variables, but e may contain "free" location names.
(In fact, all location names are free, since there is no construct
that "binds" a location name.)

We will need the usual Inversion Lemma for the new typing judgements,
which we will assume without stating it in detail.

Proof of Preservation:
----------------------
We assume the hypotheses:

(H1) (M,e) ↦ (M',e')
(H2) Λ ⊦ (M,e): τ   

and proceed by induction on the derivation of (H1).

Base Case: (H1) is derived using rule RE(5).
Then 
  (1) ∃v. e = Ref(v)             [source2 RE(5)]
  (2) ∃l. e' = l                 [target2 RE(5)]
  (3) l ∉ dom(M)  (l is fresh)   [constraint RE(5)]
  (4) M' = M[l=v]                [target1 RE(5)]

  (5) Λ ⊦ (M,Ref(v)) : τ         [(1),(H2)]

  (6) ∃τ'. τ = Ref(τ')           [Inversion RT(1)]
  (7) Λ ⊦ (M,v): τ'              [(5),(6)]
  (8) Λ;∅ ⊦ v: τ'                [(7), Defn 6.2]

  (9) Let Λ' = Λ[l: τ']         [defn Λ']
  (10) Λ ⊆ Λ'                  [(9), (3), dom(Λ) = dom(M)]
  (11) ⊦ M : Λ                  [(5), Defn 6.2]
  (12) ⊦ M' : Λ'                [(4), (8), (9), (11), Defn 6.1,
                                 weakening Λ to Λ']

  (13) Λ';∅ ⊦ l : Ref(τ')       [(9), RT(4)]
  (14) Λ' ⊦ (M',l): Ref(τ')     [(12), (13), Defn 6.2]

  (15) ∃Λ'. Λ ⊆ Λ' &  Λ' ⊦ (M',e'): τ   [(10),(14),(2),(6); QED]


Ind Case: (H1) is derived using rule RE(1).
Then 
  (1) ∃e1. e = Ref(e1)          [source2 RE(1)]
  (2) ∃e1'. e' = Ref(e1')       [target2 RE(1)]
  (3) (M,e1) ↦ (M',e1')         [premise RE(1)]

  (IH) ∀(Λ1,τ1).
       Λ1 ⊦ (M,e1): τ1 => ∃Λ1'. Λ1 ⊆ Λ1' &  Λ1' ⊦ (M',e1'): τ1

  (4) Λ ⊦ (M,Ref(e1)) : τ           [(1),(H2)]

  (5) ∃τ'. τ = Ref(τ')  ∧ 
  (6) Λ ⊦ (M,e1): τ'                [(4), Inversion RT(1)]

  (7) ∃Λ1'. Λ ⊆ Λ1' &  Λ1' ⊦ (M',e1'): τ'  [(6),(IH)]
  (8) Let Λ' be a witness for (7)  [∃ elim]
  (9) Λ ⊆ Λ'                       [(7),(8)]
  (10) Λ' ⊦ (M',e1'): τ'            [(7),(8)]

  (11) Λ' ⊦ (M',Ref(e1')): Ref(τ')  [(10),RT(1)]
  (12) Λ' ⊦ (M',e'): τ              [(11),(2),(5)]

  (13) ∃Λ'. Λ ⊆ Λ' &  Λ' ⊦ (M',e'): τ  [(9),(12); QED]


The other cases are similar.  [XX]

----------------------------------------------------------------------

Proof of Theorem 6.2: Progress
------------------------------

We start with the hypothesis:

  (H) Λ ⊦ (M,e) : τ

By Definition 6.2, this expands into a pair of hypotheses:

  (H1) ⊦ M : Λ 
  (H2) Λ; ∅ ⊦ e : τ

The proof proceeds by induction on the derivation of (H2).

Base Case: (H2) by rule RT(4).
  (1) e = l for some location l, by Case Hyp.
  (2) e is a value, by defn of value [X]

Base Case: (H2) by rule RT(5).
  This is impossible, since Γ = ∅.

Ind. Case: (H2) by rule RT(1).
  (1) e = Ref(e1), and
  (2) τ = Ref(τ1)  by Case Hyp., where
  (3) Λ; ∅ ⊦ e1 : τ1  by Inversion of RT(1)

  (IH) e1 is a value or  (M,e1) ↦ (M',e1')

  Case (IH1): e1 is a value.
    (5) Let l = fresh(M), so l ∉ dom(M).
    (6) (M,Ref(e1)) ↦ (M[l=e1],l), by RE(5).  [X]

  Case (IH2): (M,e1) ↦ (M',e1').
    (7) (M, Ref(e1)) ↦ (M', Ref(e1')), by RE(1)
    (8) (M, e) ↦ (M',e') where e' = Ref(e1'), by (1), (7). [X]

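Ind. Case: (H2) by RT(2).
   This is similar to the RT(1) case, except when e1 is a value: then
   e1 = l for some location l by the Canonical Forms Lemma, l ∈ dom(M)
   by (H1), and (M,DRef(l)) ↦ (M,M(l)) by RE(6).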

Ind. Case: (H2) by RT(7).
  (1) e = App(e1,e2) by Case Hyp.
  (2) Λ; ∅ ⊦ e1 : τ1 → τ for some τ1, and
  (3) Λ; ∅ ⊦ e2 : τ1 by Inversion of RT(7)

  (IH1) e1 a value or (M,e1) ↦ (M',e1')
  (IH2) e2 a value or (M,e2) ↦ (M',e2')

  Case (IH1a) e1 a value
    (4) e1 = Fun(x,τ1,e3), by Canonical Forms Lemma
    Case (IH2a) e2 a value.
      (5) (M, App(e1,e2)) ↦ (M, [e2/x]e3) by RE(8)
      (6) (M, e) ↦ (M, e') where e' = [e2/x]e3 by (1), (5). [X]
    Case (IH2b) (M,e2) ↦ (M',e2')
      (7) (M, App(e1,e2)) ↦ (M', App(e1,e2')) by RE(10)
      (8) (M, e) ↦ (M', e') where e' = App(e1,e2') by (1), (7). [X]

  Case (IH1b) (M,e1) ↦ (M',e1')
      (9) (M, App(e1,e2)) ↦ (M', App(e1',e2)) by RE(9)
      (10) (M, e) ↦ (M', e') where e' = App(e1',e2) by (1), (9). [X]

Other inductive cases are similar to RT(1) or RT(7). 
                                                       [XX]


----------------------------------------------------------------------

Polymorphic Typings for State primitives.

In PTFun, we can treat Ref, DRef, and Set as primitive functions
with the following polymorphic types:

     Ref :  ∀t. t → Ref(t)

     DRef : ∀t. Ref(t) → t

     Set :  ∀t. Ref(t) * t → Unit

E.g.  Ref[Int](Num 3)


Polymorphic typings in ML:

     ref : 'a -> 'a ref

     !   : 'a ref -> 'a 

     :=  : 'a ref * 'a -> unit

 E.g. ref 3
    
Example:

   let val r = ref(fn x => x)          [ r : ('a -> 'a) ref ]
    in r := (fn x: int => x + 1);      [ r : (int -> int) ref ]
       !r true                         [ r : (bool -> bool) ref ]
   end

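If r is given the polymorphic type ('a -> 'a) ref, each line type
checks at its own instance, yet the last line applies the function
fn x => x + 1 to true. References have introduced unsoundness in the
type system!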

After years of experimentation with fixes for this problem, the ML
community settled on the "value restriction":

   A variable declaration (like "val r = ref(fn x => x)") can
   only have its type generalized (made polymorphic) if the
   definiens is a value expression (which it is not, in this
   case).
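
The effect of the restriction can be seen in a short Standard ML
sketch. The names id, a, b, r, and c are invented for the example, and
the type annotation on r is added here to pick one instance explicitly.

```sml
(* fn x => x is a value expression, so id is generalized: id : 'a -> 'a. *)
val id = fn x => x
val a = id 3
val b = id true

(* ref (fn x => x) is an application, not a value expression, so the
   type of r is not generalized. The cell has a single, fixed type. *)
val r : (int -> int) ref = ref (fn x => x)
val () = r := (fn x => x + 1)
val c = (!r) 3

(* With r monomorphic, (!r) true is rejected by the type checker. *)
```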


This issue does not affect PTFun with polymorphically typed
Ref, DRef, and Set primitives, because to make r polymorphic
we will have to explicitly abstract over a type parameter, 
as in:

   let r = Λt.Ref[t → t](λx: t.x)    [ r : ∀t.Ref(t → t) ]
    in Set(r[Int], (λx: Int.x + 1));  [ r[Int] : Ref(Int → Int) ]
       DRef(r[Bool]) true             [ r[Bool] : Ref(Bool → Bool) ]
   end

Since the Λ-abstraction defining r is a value, the application
of the Ref constructor is suspended, and so the actual allocation
of the ref-cell does not take place until r is applied to a type.
There are two such applications: r[Int] and r[Bool]; these produce
two different ref-cells, one containing a function on Ints and the
other a function on Bools. So there is no type conflict.
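
Standard ML has no explicit type abstraction, but a rough analogue can
be sketched by suspending the allocation behind a function of unit.
This is only an analogy to the PTFun term above; the names mkR, rInt,
rBool, n, and t are hypothetical.

```sml
(* A function declaration is a value, so mkR is generalized:
   mkR : unit -> ('a -> 'a) ref. No cell is allocated yet. *)
fun mkR () = ref (fn x => x)

(* Each call allocates a separate cell at its own type instance. *)
val rInt : (int -> int) ref = mkR ()
val rBool : (bool -> bool) ref = mkR ()

(* Updating the int cell has no effect on the bool cell. *)
val () = rInt := (fn x => x + 1)
val n = (!rInt) 3
val t = (!rBool) true
```

Here n is 4 and t is true: the assignment went to one cell and the
boolean application read from the other.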

======================================================================


And now for something different -- monads!

See state-monad.sml.