(Video part one and part two)
Recall the FastQueue.elm
implementation (referred to in the textbook as the "batched queue" implementation) that has O(1) amortized cost for each of the operations, including dequeue
, which has an O(n) worst-case cost. And recall that the amortized analysis assumes that the Queue
objects are not used persistently. We will now see queue implementations that employ laziness in order to achieve O(1) amortized bounds despite the possibility that each queue is used persistently.
Consider the following example from the textbook (slightly different from the one above), where q_0
is a queue whose front
and back
lists have equal length m
. Using the FastQueue
implementation, the operation dequeue q_{m-1}
triggers a call to List.reverse
, which runs in O(m
) time; all other calls to dequeue
run in O(1) time.
q_0 = { front = [1, 2, ..., m]      -- |front| = |back| = m
      , back = [2m, ..., m+2, m+1]
      }
q_1 = dequeue q_0
q_2 = dequeue q_1
...
q_{m-1} = dequeue q_{m-2}
q_m = dequeue q_{m-1} -- expensive, calls List.reverse
q_{m+1} = dequeue q_m
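For concreteness, the batched behavior can be sketched in Python (a hedged, hypothetical rendering of the FastQueue idea, not the course file; tuples stand in for the queue record):

```python
# A Python sketch of the batched ("FastQueue") representation.
# dequeue is O(1) except when `front` empties, at which point the whole
# `back` list is reversed -- the O(n) worst case described above.
def empty():
    return ([], [])                      # (front, back)

def enqueue(x, q):
    front, back = q
    return check(front, [x] + back)

def dequeue(q):
    front, back = q
    if not front:
        return None
    return check(front[1:], back)

def check(front, back):
    # Invariant: front may be empty only if back is also empty.
    if front:
        return (front, back)
    return (list(reversed(back)), [])    # the expensive reverse

q = empty()
for i in range(1, 5):
    q = enqueue(i, q)
q = dequeue(q)                           # this call pays for the reverse
assert q == ([2, 3, 4], [])
```

Re-running the expensive dequeue on a saved (persistent) copy of the old queue repeats the full reversal, which is exactly why the amortized argument breaks under persistence.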
As a naive attempt, we might try to use lazy lists and memoize the computation that involves the expensive call to List.reverse
. However, if the operation dequeue q_{m-1}
is evaluated again, a completely different reverse
suspension is created.
(Note: Below we use Thunks and ThunkLists and pretend the system performs memoization.)
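The memoization being assumed can be made concrete with a small Python sketch (for illustration only; the identifiers are not from the course code): a suspension whose body runs at most once, with later forces reusing the cached result.

```python
# A memoized suspension: the body runs once, and subsequent calls to
# force() return the cached result without recomputing it.
class Thunk:
    def __init__(self, fn):
        self.fn = fn
        self.forced = False
        self.value = None

    def force(self):
        if not self.forced:
            self.value = self.fn()
            self.fn = None       # drop the closure once it has run
            self.forced = True
        return self.value

calls = []
t = Thunk(lambda: calls.append("ran") or 42)
assert t.force() == 42
assert t.force() == 42
assert calls == ["ran"]          # the suspended computation ran only once
```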
BankersQueue.elm
A more clever approach is based on the idea of having dequeue q_0
create a suspension involving the reverse
that is not forced until the dequeue q_m
operation. Separating the creation and evaluation of the suspension allows O(m) time to pay for the expensive O(m) cost of the reverse
.
To realize this strategy, we do not wait until the front list is about to become empty before reversing the back list. Instead, the back list is reversed as soon as it becomes longer than the front and is then appended to the front in order to maintain the correct order of elements in the Queue. The key is that the ThunkList
defined by
ThunkList.append front (ThunkList.reverse back)
does not immediately perform the monolithic call to reverse
because append
is an incremental function. Only after sufficient calls to dequeue
exhaust the front
list is back
reversed. Let's go through the implementation of this strategy, and then discuss how it fares with respect to the problematic sequence of operations above.
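To see the incremental/monolithic distinction in action, here is a self-contained Python sketch (illustrative names, not the course's ThunkList module) that counts when the reversal actually runs:

```python
# Streams are memoized thunks yielding None (Nil) or a (head, tail) pair.
class Thunk:
    def __init__(self, fn):
        self.fn, self.forced, self.value = fn, False, None

    def force(self):
        if not self.forced:
            self.value, self.fn, self.forced = self.fn(), None, True
        return self.value

def nil():
    return Thunk(lambda: None)

def cons(x, s):
    return Thunk(lambda: (x, s))

def from_list(xs):
    return nil() if not xs else Thunk(lambda: (xs[0], from_list(xs[1:])))

def append(s, t):
    # Incremental: forcing one cell does O(1) work and suspends the rest.
    def step():
        cell = s.force()
        if cell is None:
            return t.force()
        head, tail = cell
        return (head, append(tail, t))
    return Thunk(step)

events = []

def reverse(s):
    # Monolithic: when forced, it traverses the whole stream at once.
    def step():
        events.append("reversed")
        acc = nil()
        cell = s.force()
        while cell is not None:
            acc = cons(cell[0], acc)
            cell = cell[1].force()
        return acc.force()
    return Thunk(step)

s = append(from_list([1, 2, 3]), reverse(from_list([6, 5, 4])))
assert events == []                # creating the suspension reverses nothing
for expected in [1, 2, 3]:
    head, s = s.force()
    assert head == expected
assert events == []                # still nothing: append is incremental
head, s = s.force()                # front exhausted; reverse finally runs
assert head == 4 and events == ["reversed"]
```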
The representation maintains the explicit Int sizes of the front and back streams.
-- invariant:
--   frontSize >= backSize
--
type Queue a =
  Q { frontSize : Int     -- frontSize = |front|
    , front : ThunkList a
    , backSize : Int      -- backSize = |back|
    , back : ThunkList a
    }
When enqueue-ing to back and dequeue-ing from front ...
enqueue : a -> Queue a -> Queue a
enqueue x (Q {frontSize, front, backSize, back}) =
  check frontSize front (backSize+1) (lazy (\_ -> Cons x back))

dequeue : Queue a -> Maybe (Queue a)
dequeue (Q {frontSize, front, backSize, back}) =
  case force front of
    Nil ->
      Nothing
    Cons _ newFront ->
      Just (check (frontSize-1) newFront backSize back)
... the following re-establishes the invariant, if necessary:
check : Int -> ThunkList a -> Int -> ThunkList a -> Queue a
check frontSize front backSize back =
  if frontSize >= backSize then
    Q { frontSize = frontSize, front = front, backSize = backSize, back = back }
  else
    Q { frontSize = frontSize + backSize
      , front = ThunkList.append front (ThunkList.reverse back)
      , backSize = 0
      , back = lazy (\() -> Nil)
      }
The peek
operation forces the front
stream:
peek : Queue a -> Maybe a
peek (Q {frontSize, front, backSize, back}) =
  case force front of
    Nil -> Nothing
    Cons a _ -> Just a
To see how this approach fares well even with persistent data structures, consider the sequence of m dequeue
operations from before. The first one, dequeue q_0
, creates a suspension involving reverse
that is forced by the dequeue q_m
operation. No other operation in the sequence creates a suspension. Therefore, to force another expensive call to reverse
requires another call to dequeue q_0
followed by m-1 calls to dequeue
. So, the O(m) cost of the reverse
can be amortized over the sequence of O(m) operations that must precede it.
Sections 6.1, 6.2, and 6.3 of the textbook show how to formalize this argument by adapting the banker's method to account for lazy evaluation.
PhysicistsQueue.elm
The textbook describes another way to implement a strategy similar to the one employed by BankersQueue
. In that version, the (incremental) append
function waits until front
becomes empty before applying the (monolithic) reverse
function to back
.
Because the back
list is only ever processed by monolithic functions, there is no need for it to be represented using a ThunkList
. Thus, one change to the representation is to use an ordinary List
for back
.
type Queue a =
  Q { ...
    , back : List a
    }
Using this representation, the key operation from before becomes
appendThunkListAndList front (List.reverse back)
where
appendThunkListAndList : ThunkList a -> List a -> ThunkList a
is an incremental function. (Exercise — Implement appendThunkListAndList
.)
Because this function, like append, is incremental, the resulting list is not entirely evaluated right away. As it turns out, because of the expensive reverse within the appended list, the amortized analysis can be made to work even if this concatenation is performed eagerly. Thus, a second change to the representation is to store front
as a Thunk (List a)
(an ordinary List
that is suspended) rather than a ThunkList a
(a stream).
type Queue a =
  Q { ...
    , front : Thunk (List a)
    , back : List a
    }
Using this representation, the key operation from before is simply:
force front ++ List.reverse back
A consequence of this representation is that peek
and dequeue
must force
the entire suspended front
list. To reduce the costs of these operations, the final change to the representation is to keep an additional (evaluated) List
that is a prefix
of (the suspended List
) front
to facilitate fast access to the front of front
.
-- invariants:
--   frontSize >= backSize
--   prefix == [] ==> front == []
--
type Queue a =
  Q { prefix : List a
    , frontSize : Int
    , front : Thunk (List a)
    , backSize : Int
    , back : List a
    }
Like in the BankersQueue
, the size of the back
is never allowed to become larger than the front
. In addition, prefix is allowed to be empty only if front is empty. The check
and checkPrefix
functions enforce these invariants.
check : List a -> Int -> Thunk (List a) -> Int -> List a -> Queue a
check prefix frontSize front backSize back =
  if frontSize >= backSize then
    checkPrefix prefix frontSize front backSize back
  else
    let newFront = lazy (\_ -> force front ++ List.reverse back) in
    checkPrefix prefix (frontSize + backSize) newFront 0 []

checkPrefix : List a -> Int -> Thunk (List a) -> Int -> List a -> Queue a
checkPrefix prefix frontSize front backSize back =
  let
    newPrefix =
      case prefix of
        [] -> force front
        _  -> prefix
  in
  Q { prefix = newPrefix
    , frontSize = frontSize
    , front = front
    , backSize = backSize
    , back = back
    }
The enqueue
operation adds to the back
and peek
pulls from prefix
, the partially evaluated front of the front
.
enqueue : a -> Queue a -> Queue a
enqueue x (Q {prefix, frontSize, front, backSize, back}) =
  check prefix frontSize front (backSize+1) (x::back)

peek : Queue a -> Maybe a
peek (Q {prefix}) =
  List.head prefix
The dequeue
operation "tails" both prefix
and front
:
dequeue : Queue a -> Maybe (Queue a)
dequeue (Q {prefix, frontSize, front, backSize, back}) =
  if frontSize == 0 then
    Nothing
  else
    case (prefix, force front) of
      (_ :: prefix_, _ :: front_) ->
        Just (check prefix_ (frontSize-1) (lazy (\() -> front_)) backSize back)
      _ ->
        Debug.todo "dequeue"
Section 6.4 of the textbook shows how to adapt the physicist's method to account for lazy evaluation and use it to argue that this implementation, like the BankersQueue
, has O(1) amortized costs even in the face of persistent access.