Lazy Queues

Recall the FastQueue.elm implementation (referred to in the textbook as the "batched queue" implementation) that has O(1) amortized cost for each of the operations, including dequeue, which has an O(n) worst-case cost. The amortized analysis, however, assumes that no Queue object is used as the input for more than one Queue operation. We will now see queue implementations that employ laziness in order to achieve O(1) amortized bounds despite the possibility that each queue is used persistently.

Consider the following example from the textbook, where q_0 is a queue whose front and back lists have equal length m.

q_0     = { front = [1, 2, ..., m]          -- |front| = |back| = m
          , back  = [2m, ..., m+2, m+1]
          }
q_1     =  dequeue q_0
q_2     =  dequeue q_1
       ... 
q_m     =  dequeue q_{m-1}    -- expensive, calls List.reverse
q_{m+1} =  dequeue q_m

Using the FastQueue implementation, the operation dequeue q_{m-1} triggers a call to List.reverse, which runs in O(m) time; all other calls to dequeue run in O(1) time.

In the absence of persistence, the amortized analysis is able to reason that this expensive operation happens infrequently compared to the cheap ones and that any sequence of n FastQueue operations takes O(n) time overall.

But, the program is free to use the "old" version q_{m-1} of "the" queue elsewhere in the program; repeatedly dequeue-ing from it will break the reasoning that the expensive operation happens infrequently.

As a naive attempt, we might try to use lazy lists and memoize the computation that involves the expensive call to List.reverse. However, if the operation dequeue q_{m-1} is evaluated again, a completely different reverse suspension is created.

(Note: Having worked through LazyLists with explicit caching, below we use Thunks and ThunkLists and pretend the system performs memoization.)

A Clever Approach — BankersQueue.elm

A more clever approach is based on the idea of having dequeue q_0 create a suspension involving the reverse that is not forced until the dequeue q_m operation. Separating the creation and evaluation of the suspension allows O(m) time to pay for the expensive O(m) cost of the reverse.

To realize this strategy, we do not wait until the front list is about to become empty before reversing the back list. Instead, the back list is reversed as soon as it becomes longer than the front and is then appended to the front in order to maintain the correct order of elements in the Queue. The key is that the ThunkList defined by

ThunkList.append front (ThunkList.reverse back)

does not immediately perform the monolithic call to reverse because append is an incremental function. Only after sufficient calls to dequeue exhaust the front list is back reversed. Let's go through the implementation of this strategy, and then discuss how it fares with respect to the problematic sequence of operations above.

The representation maintains the explict Integer sizes of the front and back streams.

-- invariant:
--   frontSize >= backSize
--
type Queue a =
  Q { frontSize : Int         -- frontSize = |front|
    , front : ThunkList a
    , backSize : Int          -- backSize  = |back|
    , back : ThunkList a
    }

When enqueue-ing to back and dequeue-ing from front...

enqueue : a -> Queue a -> Queue a
enqueue x (Q {frontSize, front, backSize, back}) =
  check frontSize front (backSize+1) (lazy (\_ -> Cons x back))

dequeue : Queue a -> Maybe (Queue a)
dequeue (Q {frontSize, front, backSize, back}) =
  if frontSize == 0 then
    Nothing
  else
    Just (check (frontSize-1) (ThunkList.drop 1 front) backSize back)

... the following re-establishes the invariant, if necessary:

check frontSize front backSize back =
  if frontSize >= backSize then
    Q { frontSize = frontSize, front = front, backSize = backSize, back = back }
  else
    Q { frontSize = frontSize + backSize
      , front = ThunkList.append front (ThunkList.reverse back)
      , backSize = 0
      , back = lazy (\() -> Nil)
      }

When the frontSize is greater than 0, peek forces the front stream:

peek : Queue a -> Maybe a
peek (Q {frontSize, front, backSize, back}) =
  if frontSize == 0 then
    Nothing
  else
    case force front of  
      Cons a _ -> Just a
      Nil      -> Debug.todo "peek"

To see how this approach fares well even with persistent data structures, consider the sequence of m dequeue operations from before. The first one, dequeue q_0, creates a suspension involving reverse that is forced by the dequeue q_m operation. No other operation in the sequence creates a suspension. Therefore, to force another expensive call to reverse requires another call to dequeue q_0 followed by m-1 calls to dequeue. So, the O(m) cost of the reverse can be amortized over the sequence of O(m) operations that must precede it.

Sections 6.1, 6.2, and 6.3 of the textbook show how to formalize this argument by adapting the banker's method to account for lazy evaluation.

Another Clever Approach — PhysicistsQueue.elm

The textbook describes another way to implement a strategy similar to the one employed by BankersQueue. In that version, the (incremental) append function waits until front becomes empty before applying the (monolithic) reverse function to back.

Because the back list is only ever processed by monolithic functions, there is no need for it to be represented using a ThunkList. Thus, one change to the representation is to use an ordinary List for back.

type Queue a =
  Q { ...
    , back : List a
    }

Using this representation, the key operation from before becomes

appendThunkListAndList front (List.reverse back)

where

appendThunkListAndList : ThunkList a -> List a -> ThunkList a

is an incremental function. (Exercise — Implement appendThunkListAndList.)

Because this function, like append, is incremental, the resulting list is not entirely evaluated right away. As it turns out, because of the expensive reverse within the appended list, the amortized analysis can be made to work even if this concatenation is performed eagerly. Thus, a second change to the representation is to store front as a Thunk (List a) (an ordinary List that is suspended) rather than a ThunkList a (a stream).

type Queue a =
  Q { ...
    , front : Thunk (List a)
    , back : List a
    }

Using this representation, the key operation from before is simply:

force front ++ List.reverse back

A consequence of this representation is that peek and dequeue must force the entire suspended front list. To reduce the costs of these operations, the final change to the representation is to keep an additional (evaluated) List that is a prefix of (the suspended List) front to facilitate fast access to the front of front.

-- invariants:
--   frontSize >= backSize
--   prefix == [] ==> front == []
--
type Queue a =
  Q { prefix : List a
    , frontSize : Int
    , front : Thunk (List a)
    , backSize : Int
    , back : List a
    }

Like in the BankersQueue, the size of the back is never allowed to become larger than the front. In addition, pre is allowed to be empty only if front is empty. The check and checkPrefix functions enforce these invariants.

check prefix frontSize front backSize back =
  if frontSize >= backSize then
    checkPrefix prefix frontSize front backSize back
  else
    let newFront = lazy (\_ -> force front ++ List.reverse back) in
    checkPrefix prefix (frontSize + backSize) newFront 0 []

checkPrefix prefix frontSize front backSize back =
  let
    newPrefix =
      case prefix of
        [] -> force front
        _  -> prefix
  in
  Q { prefix = newPrefix
    , frontSize = frontSize
    , front = front
    , backSize = backSize
    , back = back
    }

The enqueue operation adds to the back and peek pulls from prefix, the partially evaluated front of the front.

enqueue : a -> Queue a -> Queue a
enqueue x (Q {prefix, frontSize, front, backSize, back}) =
  check prefix frontSize front (backSize+1) (x::back)

peek : Queue a -> Maybe a
peek (Q {prefix}) =
  List.head prefix

The dequeue operation "tails" both prefix and front:

dequeue : Queue a -> Maybe (Queue a)
dequeue (Q {prefix, frontSize, front, backSize, back}) =
  if frontSize == 0 then
    Nothing
  else
    case (prefix, force front) of
      (_ :: prefix_, _ :: front_) ->
        Just (check prefix_ (frontSize-1) (lazy (\() -> front_)) backSize back)
      _ ->
        Debug.todo "dequeue"

Section 6.4 of the textbook shows how to adapt the physicist's method to account for lazy evaluation and use to argue that this implementation, like the BankersQueue, has O(1) amortized costs even in the face of persistent access.


Reading

Optional

  • Okasaki, Chapter 6.1—6.4. Although we will not cover the accounting techniques in this class, you are encouraged to read through this material a few times to help understand the basic mechanisms.