==============
Polling places
==============

**Due: Friday, November 9, 4pm**

*You may work alone or in a pair on this assignment.*


.. image:: images/voting_booths.png
    :width: 3in
    :align: right

Every two years, many people around the country wait in long lines to
vote due, in part, to inadequate provisioning of voting booths.  In
this assignment you will write code to simulate the flow of voters
through precincts.  The information generated by your
simulator could be used by an election official as a crude lower bound
to help determine the number of voting booths needed at a precinct.

We will view this problem as an instance of the more general problem of
simulating an :math:`M`/:math:`M`/:math:`N` queue, where:

- the first :math:`M` indicates that arrivals (voters arriving at the polls) obey a Markov process;
- the second :math:`M` indicates that the service times obey a Markov process (the amount of time a voter takes to complete a ballot); and finally
- the :math:`N` indicates that there are :math:`N` servers (the number of voting booths).

We will tell you how to structure your implementation at a high level,
but you will be responsible for the design details.

Getting started
---------------

See `these start-up instructions <startup-alone.html>`__  if you intend to work alone.

See `these start-up instructions <startup-pair-continuing.html>`__ if you intend to work with the same partner as in a previous assignment.

See `these start-up instructions <startup-pair.html>`__ if you intend to work in a *NEW* pair.


The ``pa4`` directory includes a file named ``simulate.py``, which you
will modify, a file named ``util.py`` that contains a few useful
functions, a directory named ``data``, which contains precint
configuration files (described below), and a series of tests
contained in files ``test_avg_wait_time.py``, ``test_simulate_election_day.py``
and ``test_find_number_of_booths.py``.

Please put all of your code for this assignment in ``simulate.py``.
Do not add extra files for the required classes.

Precinct Configuration
----------------------

The configuration for a precinct is specified using a dictionary
containing this information:

- The name of the precinct (``name``)
- number of voters assigned to the precinct (``num_voters``), 
- number of hours the polls are open (``hours_open``),
- The number of voting booths (``num_booths``)
- The voter distribution (``voter_distribution``). 

For example::

    {'name': 'Downtown',
     'num_voters': 5,
     'hours_open': 1,
     'num_booths': 1,
     'voter_distribution': {'type': 'poisson',
                            'arrival_rate': 0.11,                            
                            'voting_duration_rate': 0.1}

Notice how the ``voter_distribution`` field contains another dictionary containing
these fields:

- The type of distribution (``type``). In this assignment,
  we will only deal with one type of distribution: ``poisson``
- The rate at which voters arrive (``arrival_rate``) and the
  rate at which voters complete their voting (``voting_duration_rate``).
  The significance of these rates is explained later.

Information about precincts is stored in a JSON file that contains
a dictionary with two keys: ``precincts``, containing a list of
precinct configurations as specified above, and ``seed``, containing a random seed.
Similar to PA1, the random seed will be used to ensure that a
simulation involving randomness produces the same result (which
will allow us to test whether your code produces the expected
results). We recommend reviewing PA1 (and how it used random seeds)
if you're unsure of how random seeds are used in the context of
running a simulation.

We have provided a function, ``load_precincts`` in ``util.py`` that
takes the name of a JSON precincts file and returns two values:
a list of precinct configurations (as specified above) and the random seed.

This function is already called from the code we provide for you,
so you should not make any calls to ``load_precincts`` in the code
you'll add to ``simulate.py``. However, it can be a useful 
function when testing your code. In particular,
you should take a moment to familiarize yourself with the
data stored in these JSON files, and how to access each value once
you've loaded the file with ``load_precincts``. We suggest you start
by looking at ``data/config-single-precinct-0.json``, and loading that file
from IPython::

    In [1]: import util

    In [2]: precincts, seed = util.load_precincts("data/config-single-precinct-0.json")

    In [3]: precincts
    Out[3]: 
    [{'hours_open': 1,
      'name': 'Downtown',
      'num_booths': 1,
      'num_voters': 5,
      'voter_distribution': {'arrival_rate': 0.11,
       'type': 'poisson',
       'voting_duration_rate': 0.1}}]

    In [4]: seed
    Out[4]: 1468604453

    In [5]: len(precincts)
    Out[5]: 1

    In [6]: p = precincts[0]

    In [7]: p["num_voters"]
    Out[7]: 5

    In [8]: p["voter_distribution"]["arrival_rate"]
    Out[8]: 0.11


Representing Voters
-------------------

In this assignment you will need to model voters using a ``Voter`` class.
This class must include the time the voter arrives at the
polls, voting duration, that is, the amount of time the voter takes to
vote, and the time the voter is assigned to a voting booth (aka, the
start time). For simplicity, we will model time as a floating point number,
representing the number of minutes since the polls opened. So, if we say 
a voter arrived at the polls at time 60.5, that means they arrived an
hour and thirty seconds after the polls opened.

While not strictly necessary, we found it convenient for debugging purposes
to store the time the voter leaves the 
voting booth (i.e., the departure time) as well.

The start time for a voter, which is not available at construction
time, should be initialized to ``None`` when the voter object is constructed.
This value should be reset to the correct value when the voter is
assigned to a voting booth.  Similarly, voters' departure times cannot
be determined until we know their start times.

Your implementation **must** include at a minimum: a constructor and
public attributes named ``arrival_time``, ``voting_duration``, and
``start_time``.  Our utility function for
printing voters, described below, relies on the existence of these
attributes. Please note that you are welcome to use properties instead
of attributes for some or all of these required attributes, but are
not required to do so.

This class is straightforward.  It merely serves as a way to
encapsulate information about a voter.  Our implementation is roughly
30 lines, although it is possible to come up with implementations
that are much shorter than that.

Please note, the voter generator class, described next, is the only class
that is allowed to call the voter constructor.

Generating voters
-----------------

Your implementation must include a ``VoterGenerator`` class for generating voters.
You may ask yourself why this merits its own
class, instead of simply constructing the voter objects in the precinct
class we describe below. One reason for this is that it allows us to
define a class for generating voters according to some random distribution,
and then create multiple instances with different parameters to that
random distribution. For example, if the time it takes a voter to vote
followed a normal distribution, we could create a voter generator with
mean = 5 minutes, stdev = 2 minutes, and another voter generator with
mean = 15 minutes, stdev = 5 minutes. We could then use those two instances
to explore the behaviour of voters
according to those two distributions (both normal, but each with different
parameters).

We will revisit the usefulness of having a voter generator class once
we've discussed how to represent precincts and their booths; if this section feels
too abstract or unclear, you may want to re-read it once we've discussed
precincts and voting booths.

For now, let's focus on discussing how the voters are generated.
Both the arrival time and the voting duration will follow a Poisson
process. From Wikipedia: a Poisson process, named
after the French mathematician Simeon-Denis Poisson (1781-1840), is
a stochastic process in which the time to the next event is independent
of past events.  We can use the following fact to
generate a voter sample: if the number of arrivals in a given time
interval [0,t] follows the Poisson distribution, with mean
:math:`t*\lambda`, then the gap between arrivals times follows the
exponential distribution with rate parameter :math:`\lambda`.

So, our voter generator must keep track of time. We will assume that we
keep track of time in minutes starting at time zero.
To generate a voter, you will need to generate an arrival time and a
voting duration.  To generate the arrival time of the next voter, you will
draw an inter-arrival time (:math:`gap`) from the exponential
distribution using the specified arrival rate as :math:`\lambda` and
add it to the current time (:math:`t`).  You'll then move time forward
by updating the time from :math:`t` to :math:`t+gap`.  

For example, let's say we create a voter generator (the exact parameters
of the exponential distribution are not relevant to this simple example).
We generate a voter, and the random number generator returns ``5.7`` for that
voter's gap. Since this is the first voter we generate, their arrival
time will be ``5.7``, and we advance the voter generator's current time from ``0`` to
``5.7``. We then generate another voter, and the random number generator
returns ``2.4`` for the voter's gap. This means that this second voter
arrives ``2.4`` minutes *after* the previous voter, so the second voter's
arrival time will be ``8.1`` (``5.7 + 2.4``) and we advance the voter generator's
current time to ``8.1``.

When we generate a voter, we also need to assign a voting duration to that voter.
You will draw the voter's voting duration from the exponential 
distribution using the specified voting duration rate as :math:`\lambda`.

For testing purposes, it is important that you draw the gap and the
voting duration in the same order as our code.  To reduce the likelihood
that the testing process fails simply because the values were drawn in
the wrong order, we have provided a function,
``gen_poisson_voter_parameters``, in ``util.py`` that takes the arrival rate
and the voting duration rate as arguments and returns a pair of floats to use
for the gap and voting duration (in that order).

Your implementation should, at the very least, have a constructor and a ``next``
method that returns the next voter.


*Printing a voter sample*

We have provided a function named ``print_voters`` in ``util.py`` that
takes a list of ``Voter`` instances and an optional file name.  This
function will either print a representation of the voters to
screen or to a file, depending on whether a filename is specified.

*Testing*

The code for generating the voter sample should be fairly simple
(our implementation is only 15 lines long, not including comments).
That said, it is important that you generate the voters correctly
for the rest of your code to work. We strongly urge you to test your 
voter generator class carefully before you move on to the next step.  

When we were working on our code, we wrote a short function (8 lines)
called ``generate_and_print`` to test whether we were generating
voters correctly. This is a standalone function that is not part
of any of the classes you are designing.

The function takes the name of a precincts configuration file and then:

1. Loads the configuration using ``util.load_precincts``
2. Creates a ``VoterGenerator`` object with the arrival rate and
   voting duration rate specified for the first precinct in the file
3. Generates voters one by one, placing them in a list.
4. Calls ``util.print_voters`` with the list.

While writing this function is not required (and will not be graded),
you may want to consider writing it to aid in your testing.

To help with testing your voter generator class, we have provided two
sets of sample parameters.  The first set, (see
``data/config-single-precinct-0.json``), includes the following:

- number of voters in the sample: 5
- hours open: 1
- arrival rate: 0.11 (corresponds to mean time between voters of approximately 9 minutes)
- voting duration rate: 0.1 (corresponds to a mean voting duration of 10 minutes)
- seed for random number generator: 1468604453

and corresponds to a sample in which all the voters arrive before the
polls close.  

Here is the result of calling our sample testing function with
``data/config-single-precinct-0.json``::

    In [4]: generate_and_print("data/config-single-precinct-0.json")
    Arrival Time   Voting Duration   Start Time    Departure Time
          6.78         2.11             None              None

          7.01         5.13             None              None

         25.27         2.68             None              None

         26.89         1.10             None              None

         33.48        10.08             None              None


The second set, ``data/config-single-precinct-1.json``, is the same except the arrival
rate is set to 0.04, which corresponds to a mean gap of 25 minutes::

    In [5]: generate_and_print("data/config-single-precinct-1.json")
    Arrival Time   Voting Duration   Start Time    Departure Time
         18.64         2.11             None              None

         19.28         5.13             None              None

         69.49         2.68             None              None

         73.94         1.10             None              None

         92.08        10.08             None              None

Notice how, in the above set, only the first two voters would arrive
before the polls close. However, the voter generator is unconcerned
with this. This leads to a nice separation of concerns, where we can
test the generation of voters separate from the more complex logic
of determining if and when a voter can start voting, which will
reside in a separate precinct class that will use the voter generator
class. Your results may be different from the results
above if you do not set the random seed before you start generating voters.


Task 0: Implement the Voter and VoterGenerator classes
------------------------------------------------------

Before moving on to the rest of the assignment, you should implement
a ``Voter`` class and a ``VoterGenerator`` class as described above.
We do not provide tests for either, so it is up to you to ensure
that these classes are correctly implemented before moving on to
the rest of the assignment. We *strongly* encourage you to test
your ``VoterGenerator`` class in the manner described above,
even if you only do so informally by writing a small testing
function like the one we used.


Task 1: Simulate election day for one precinct
----------------------------------------------

In this task, you will be simulating an election for a single precinct.
You will do this by implementing a ``Precinct`` class that stores
information about a precinct, and which can simulate an election day
on that precinct.

Modeling Precincts
~~~~~~~~~~~~~~~~~~

Cities are broken up into voting precincts, each of which has a number
of voting booths.  Your ``Precinct`` class will model a precinct
with a specified number of voting booths and must include, at least, 
one public ``simulate`` method that runs a simulation of an election day
on that precinct.
The exact parameters and return value(s)
of this method are up to you; later on, we describe how we will be testing
your implementation, and that may provide some hints on what parameters and
return value(s) to use in your simulation method.

The simulation itself will be similar to the M/D/1 queue example from class.
In the M/D/1 example, we have a single server, such as a toll booth, 
that serves a queue of clients, such as cars, that
arrive with a specified distribution and have a fixed service time.
In this assignment, your simulation will need to simulate multiple servers 
(voting booths) and variable service times (voting durations), but the 
structure of your simulation will be similar.

There are three main differences, though. First, instead of generating arrival
times for the clients directly in the simulation code, you will
be using a ``VoterGenerator`` object, and get voters from the generator as needed.
Second, you should implement private methods in the ``Precinct`` class to determine whether
a voting booth is available and, if not, when a booth will become available. This
will be necessary to determine when a given voter will finish voting (since it depends
on when they can access a booth). Third, you will need to limit the number of booths 
in use to be no more than the number of booths assigned to the precinct.

.. admonition:: Don't queue voters!

   You may be tempted to create a queue of ``Voter`` objects. Don't! This is a common
   pitfall when designing this type of simulation. When you generate a ``Voter``
   using the ``VoterGenerator``, it could be the case that all the voting booths
   are occupied at the time the voter arrives. Instead of placing the voter in
   a queue, we will instead use a *priority queue* (described below) to model
   the voting booths. This will allow us to efficiently answer the question
   "When will a voting booth become available in this precinct?". If we can
   easily answer that question, then we will know the voting start time of any
   new ``Voter``.

In this simulation, you will generate voters until you
reach one of the simulation's *stopping conditions*. In this case, we need to stop the
simulation once there are no more voters who can vote. This can happen
in two situations: (1) you have generated the maximum
number of the voters specified for the precinct, or (2) you generate a
voter who arrives after closing time.  In the second case, the late
voter should be discarded and no further voters should be drawn from the sample.
We will refer to voters who arrive before the polls close as *valid*.

.. admonition:: The ``Precinct`` and the ``VoterGenerator``

   Notice how the stopping condition depends on information that is "owned"
   by different classes: the ``Precinct`` is where we should store the number
   of hours that the precinct is open, but the ``VoterGenerator`` is where
   we should store the maximum number of voters to generate (otherwise,
   a ``VoterGenerator`` object would not know when to stop generating voters).
   So, it may be convenient for one object to have the other
   as an attribute, so we can access all the necessary information
   during the simulation. Recalling the "A good rule of thumb" section of the
   lab, which of these phrases makes more sense?

   * A ``Precinct`` **has a** ``VoterGenerator``
   * A ``VoterGenerator`` **has a** ``Precinct``

   Alternatively, let's use "precinct simulator" and "voter generator" in the above phrases:

   * A precinct simulator has a voter generator
   * A voter generator has a precinct simulator

   Alternatively, if we want to simulate a precinct with different ``VoterGenerator``
   objects, we could think about simply passing a ``VoterGenerator`` object as a
   parameter to our simulation method.


Example
~~~~~~~

Let's walk through an example.  The file ``data/config-single-precinct-2.json`` contains the
following precinct::

      {
       "name": "Little Rodentia",
       "hours_open": 1, 
       "num_voters": 10, 
       "num_booths": 2,
       "voter_distribution": {
                               "type": "poisson", 
                               "voting_duration_rate": 0.1, 
                               "arrival_rate": 0.16666666666666666
                             }
      }

And the seed ``1438018944``.

The precinct has 10 voters who arrive at the precinct once every 6
minutes on average and take 10 minutes on average to vote.

When working through an example, it can be useful to call
the ``generate_and_print`` function we suggested above, so we can
see what voters would be generated in this simulation::

    In [6]: generate_and_print("data/config-single-precinct-2.json")
    Arrival Time   Voting Duration   Start Time    Departure Time
          3.29         5.39             None              None

         12.39         5.74             None              None

         12.68        24.45             None              None

         15.78         6.08             None              None

         19.76        39.64             None              None

         20.52         5.79             None              None

         22.33        16.65             None              None

         24.23         3.07             None              None

         26.75         5.24             None              None

         29.04        15.43             None              None


Let's consider what happens when we simulate this precinct using two
voting booths.  The first call to ``util.gen_poisson_voter_parameters``
generates ``(3.29, 5.39)`` as the gap and voting duration, which means
that our first voter arrives at 3.29, enters a voting booth
immediately at time 3.29, and finishes voting at 8.69 (3.29 + 5.39).  (All times
have been rounded to two digits for clarity.)

The second call to ``util.gen_poisson_voter_parameters`` yields ``(9.10,
5.74)``, which means that the second voter arrives at time 12.39 (3.29 + 9.10).  
This second voter starts voting immediately, because a booth is open,
and finishes voting at time 18.13 (12.39 + 5.74).

The third voter arrives at time 12.68 (gap: 0.29) and has a voting
duration of 24.45 minutes.  Since the first voter departed at 8.69,
this voter can start voting immediately, and finishes voting at 37.13.

The fourth voter arrives at 15.78 (gap: 3.10) and has a voting
duration of 6.08 minutes.  Both booths are occupied when the fourth voter arrives,
so this voter must wait until a booth becomes free, which happens when
the second voter finishes at time 18.13.  Thus, the fourth voter's start time will be 18.13
and their finish time will be 24.21.  Notice that this voter finishes
*before* the third voter.

The fifth voter arrives at time 19.76 (gap: 3.99) and will not start
voting until the fourth voter finishes at time 24.21.  Notice that the
third voter is still voting when the fifth voter starts!

And so on.  

All ten voters arrive before the polls close and so all ten get to
vote even though the last two will not start voting until after the
polls have closed at time 60 (remember that we specify time in
*minutes*, but the precinct dictionary specified the number of *hours*)

Here's the output of our simulation::

    Arrival Time   Voting Duration   Start Time    Departure Time
          3.29         5.39             3.29              8.69
    
         12.39         5.74            12.39             18.13
    
         12.68        24.45            12.68             37.13
    
         15.78         6.08            18.13             24.21
    
         19.76        39.64            24.21             63.85
    
         20.52         5.79            37.13             42.92
    
         22.33        16.65            42.92             59.57
    
         24.23         3.07            59.57             62.64
    
         26.75         5.24            62.64             67.88
    
         29.04        15.43            63.85             79.28


Modelling voting booths within a precinct
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will need a data structure to keep track of the voters who are
currently occupying voting booths and to determine which voter will
depart next.  Your implementation should use the ``PriorityQueue``
class the from the python queue_ library for this purpose.  A minimum
priority queue is a data structure that includes operations for (1)
adding items along with their priorities to the queue, (2) removing
the item with the lowest priority from the queue, and (3) determining
whether the queue is empty.  The ``PriorityQueue`` library also allows
you to specify an upper bound on the number of items allowed to be in
the queue at the same time and provides a method for determining
whether the queue is full.  Please look through the queue_ library
for details on how to use the this class.

For this assignment, the priorities will be the voters' departure
times.

Take into account that your implementation should not need a
voting booth class. There could certainly be models that require
modelling the voting booth as its own class but, in the model
we are assuming, it is enough to have a precinct class that uses
a priority queue to model the state of the voting booths.

Furthermore, your simulation method should not need to access the priority
queue directly. Instead, you should implement private methods that encapsulate
operations like adding a voter to a booth, or removing the next voter who
will finish voting.

.. admonition:: Blocking on ``get`` and ``put``

    You should be careful with one specific aspect of priority queues:
    the ``get`` method is a *blocking* method, meaning that if you call
    ``get`` (to extract a value from the priority queue) and the queue
    has no values, the operation will *block* (or hang) until a value is available.

    In this assignment, you should never call ``get`` on an empty queue
    so, if your code seems to mysteriously hang, the first thing you
    should check is whether you're calling ``get`` on an empty queue
    somewhere in your code. Alternatively, you can call ``get`` like this::

        my_queue.get(block=False)

    This way, ``get`` will raise an ``Empty`` exception if you try to
    extract a value from an empty queue (immediately alerting you to
    the fact that there may be an error somewhere in your code, since
    you should never be calling ``get`` on an empty queue in this
    assignment)

    Similarly, if you call ``put`` on a queue that is full, your code
    will also block. However, if you call ``put`` like this::

        my_queue.put(value, block=False)

    Then putting a value into a full queue will raise a ``Full`` exception.


The ``simulate_election_day`` function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since we don't know the exact design
you will choose (and since each student can come up with equally valid
designs), you must also complete the function ``simulate_election_day`` in
``simulate.py`` so it will invoke the simulation method in your ``Precinct``
implementation. This is the function that our tests will be calling so,
in a sense, it acts as an intermediary between the tests and your ``Precinct``
class (this is what is often referred to as `glue code <https://en.wikipedia.org/wiki/Glue_code>`__).

This function takes a list of precinct dictionaries
and a random seed, and returns a dictionary that maps precinct names to a list of the voters
who voted in that precinct. For now, you can assume the list of precincts will
contain a single precinct (we will add support for multiple precints in the next task)

The ``Voter`` objects returned by this function should have their ``start_time`` attribute
filled in. Furthermore, for a given precinct, our tests expects the list of voters to be
sorted by the voters' arrival time;
since your ``VoterGenerator`` should generate voters in the order in which they arrive,
producing a list of voters sorted by arrival time should not require any additional steps
to sort the list. If you do find that you need to sort the list explicitly, it is likely
that your solution is headed in the wrong direction; ask on Piazza if that is the case.

Take into account that the bulk of your implementation
*must* be in your ``Precinct`` class (alongside your ``Voter``
and ``VoterGenerator`` classes). The ``simulate_election_day`` function 
itself should only be about 10 lines long,
and all of the actual simulation code should reside inside your ``Precinct`` class.
If you find yourself writing substantially more than 10 lines in the ``simulate_election_day``
function, it is likely that the design of your class needs some improvement.
Please come to office hours or ask on Piazza so we can provide some preliminary
feedback on your class design.

.. admonition:: The random seed

   Remember that the precincts file includes a seed value
   (returned by ``load_precincts``). This seed must be passed
   to ``simulate_election_day`` (using the ``seed`` paramter) and
   you must use the ``random.seed(seed)`` function to set the seed *inside*
   ``simulate_election_day``. The seed only needs to be set once per each
   precinct you simulate (since we're only simulating one precints, that
   means the seed only gets set once; this will change in Task 2).

   Remember that PA1 includes a longer discussion of random seeds,
   and their significance in testing simulations.


Testing
-------

We have provided several automated tests that will test your implementation
of the ``simulate_election_day`` function. To run only the tests
that assume a single precinct, run the following from the Linux command-line::

    $ py.test -xv -k single test_simulate_election_day.py

(Recall that we use ``$`` to indicate the Linux command-line prompt.
You should not include it when you run this command.)

However, remember that the first approach to testing and debugging your code
should not rely exclusively on the tests. If you run the tests, and it is not
immediately clear what the issue is, you should manually run your simulation
code either from IPython or by using a command we describe below. These
two approaches will generally produce error messages that may be easier to
interpret and debug.

From IPython, first load the ``simulate.py`` and ``util.py`` modules
as follows::

    In [1]: %load_ext autoreload

    In [2]: %autoreload 2

    In [3]: import util

    In [4]: import simulate

Next, you will need to use ``load_precincts`` to load a configuration file.
We have provided six configuration files (named ``config-single-precinct-N.json``,
where ``N`` is a number) that you can use to test your implementation. For
example, you can load ``config-single-precinct-0.json`` like this::

    In [5]: precincts, seed = util.load_precincts("data/config-single-precinct-0.json")

And now you are ready to call ``simulate_election_day``::

    In [6]: voters = simulate.simulate_election_day(precincts, seed)

To easily inspect the values of the voters, you can use the 
``print_voters`` function provided in the ``util`` module::

    In [7]: util.print_voters(voters["Downtown"])
    Arrival Time   Voting Duration   Start Time    Departure Time
          6.78         2.11             6.78              8.89

          7.01         5.13             8.89             14.02

         25.27         2.68            25.27             27.94

         26.89         1.10            27.94             29.04

         33.48        10.08            33.48             43.56

If you'd like to manually verify whether the values are correct,
each configuration file has a corresponding CSV file with the
expected values (take into account that these are the files
the automated tests use; if you find that your implementation seems
to produce correct values, you may want to switch to running the
automated tests).

We also provide a main function in ``simulate.py`` that allows you
to run ``simulate.py`` from the command-line. If you run ``simulate.py``
with just the name of a configuration file, it will print a summary of
the simulation results. For example::

    $ python3 simulate.py data/config-single-precinct-0.json 

    PRECINCT 'Downtown'
    - 5 voters voted.
    - Polls closed at 60.0 and last voter departed at 43.56.
    - Avg wait time: 0.59

You can use the ``--print-voters`` options to, instead, print
all the voters::

    $ python3 simulate.py data/config-single-precinct-0.json --print-voters

    PRECINCT 'Downtown'
    Arrival Time   Voting Duration   Start Time    Departure Time
          6.78         2.11             6.78              8.89

          7.01         5.13             8.89             14.02

         25.27         2.68            25.27             27.94

         26.89         1.10            27.94             29.04

         33.48        10.08            33.48             43.56


Task 2: Simulating multiple precincts
-------------------------------------

The precinct configuration files we have used up to this point contain
only a single precinct. However, the file format allows for multiple
precincts to be specified in the same file. For example, the
``config-multiple-precincts-1.json`` file specifies two identical precincts,
except for the number of booths::

      {
       "name": "Little Rodentia (1 booth)",
       "hours_open": 1, 
       "num_voters": 10, 
       "num_booths": 1,
       "voter_distribution": {
                               "type": "poisson", 
                               "voting_duration_rate": 0.1, 
                               "arrival_rate": 0.16666666666666666
                             }
      }

::

      {
       "name": "Little Rodentia (2 booths)",
       "hours_open": 1, 
       "num_voters": 10, 
       "num_booths": 2,
       "voter_distribution": {
                               "type": "poisson", 
                               "voting_duration_rate": 0.1, 
                               "arrival_rate": 0.16666666666666666
                             }
      }


You must modify ``simulate_election_day`` so that it will simulate all the
precincts specified in the file, in the same order in which they are listed.
The return value of the function will be the same as before: a dictionary
mapping precinct names to lists of voters. However, your dictionary will
now have more than one key in it.

If you designed your classes properly, this will require adding 3-4 lines
of code to ``simulate_election_day`` (in our own implementation, adding
support for multiple precincts required adding just a *single*
line of code to ``simulate_election_day``). The main thing you need to watch out
for is that you must make sure to create a new voter generator object
for each precinct. **You must also make sure to reset the random seed
to the given value before simulating each precinct.**

If you find that adding support for multiple precincts requires substantially
more than 3-4 lines of code, it is likely that your class design needs improvement.
Please go to office hours or ask on Piazza so you can get some
feedback on your class design.

If you make the modifications correctly, you can test this functionality
from IPython or from the command-line simply by using one of the configuration files
with multiple precincts. For example, for the configuration file shown above::

    $ python3 simulate.py data/config-multiple-precincts-1.json 

    PRECINCT 'Little Rodentia (1 booth)'
    - 10 voters voted.
    - Polls closed at 60.0 and last voter departed at 134.48.
    - Avg wait time: 46.43

    PRECINCT 'Little Rodentia (2 booths)'
    - 10 voters voted.
    - Polls closed at 60.0 and last voter departed at 79.28.
    - Avg wait time: 15.00

Notice how the results make sense intuitively: with more booths, it is
possible to accommodate new voters sooner, which means waiting times
will go down.

You can also run the automated tests that use multiple precincts like this::

    $ py.test -xv -k multiple test_simulate_election_day.py


Task 3: Finding the average waiting time of a precinct
------------------------------------------------------

At this point, we have the ability to simulate precincts under a variety of
parameters. As we saw above, an interesting parameter to tweak is the number
of booths, since it has a direct effect on the average waiting time of
the voters.

In this task, you will implement a function that, given a single precinct and
a number of booths, computes the average waiting time when simulating
that precinct with that number of booths::

    def find_avg_wait_time(precinct, num_booths, ntrials, initial_seed=0):

Where: 

- ``precinct`` is a precinct dictionary
- ``num_booths`` is the number of booths to use in the simulation.
  Notice that this will override whatever value is specified in the
  ``precinct`` dictionary. However **do not** modify the ``num_booths``
  field of the ``precinct`` dictionary.
- ``ntrials`` is a number of trials (this is explained below)
- ``initial_seed`` is an initial seed (also explained below)

Given a single simulation of a precinct, computing the average waiting time
of the voters is not difficult. In fact, if you look at the code we provide(see the ``cmd``
function in ``simulate.py``),
you'll find some code to do just that! However, as with any simulation, 
it is unreasonable to accept the result of a single simulation (or "trial"). So, instead
you will simulate the precinct ``ntrials`` times, compute the average wait time for each trial,
sort the resulting average wait times, 
and return the median value 
(which, for simplicity, we will define simply as element ``ntrials//2`` in
the sorted list of average wait times). 

In this task, you will set the random seed to ``initial_seed`` before you run
the first trial. Then, before any subsequent trial, you will increment the seed
by one and set the random seed to be that value.

To test your implementation of the function, you can call
the function from IPython. For example::

    In [1]: %load_ext autoreload

    In [2]: %autoreload 2

    In [3]: import util

    In [4]: import simulate

    In [5]: precincts, seed = util.load_precincts("data/config-single-precinct-3.json")

    In [6]: p = precincts[0]

    In [7]: simulate.find_avg_wait_time(p, num_booths=1, ntrials=20, initial_seed=seed)
    Out[7]: 2203.0502079782727

    In [8]: simulate.find_avg_wait_time(p, num_booths=8, ntrials=20, initial_seed=seed)
    Out[8]: 4.630351652014112


You can also run the automated tests for this function like this::

    $ py.test -xv -k avg

Task 4: Finding the optimal number of booths
--------------------------------------------

At this point, we have all the necessary pieces to answer the following
question: "Given a target
waiting time :math:`W`, how many booths does the precinct need to ensure that 
the average waiting time is less that :math:`W`?".

You will write the following function to answer this question::

    def find_number_of_booths(precinct, target_wait_time, max_num_booths, ntrials, seed=0):

Where:

- ``precinct`` is a precinct dictionary
- ``target_wait_time`` is :math:`W` as defined above.
- ``max_num_booths`` is the maximum number of booths the precinct is willing
  to have.
- ``ntrials`` is the number of trials to run when finding the average waiting time
  of a precinct.
- ``seed`` is a random seed.

Your function will do a simple linear search: you will first simulate the precinct
with one booth, and check whether the average waiting time of the voters
(as produced by ``find_avg_wait_time`` from Task 3)
is strictly below ``target_wait_time``. If it isn't, you will simulate the precinct
with two booths, and check the average waiting time, and so on up to ``max_num_booths`` or until you
find a number of booths that achieves an average waiting time less than
``target_wait_time``. Take into account that it could also be the case that the provided
``target_wait_time`` is infeasible with a number of booths less than or 
equal to ``max_num_booths``.

So, the result of this search with be a tuple with two values: the optimal number of booths,
and the average waiting time with that number of booths. If the provided
target waiting time is infeasible, return ``(0,None)``.


To test your implementation of the function, you can call
the function from IPython. For example::

    In [1]: %load_ext autoreload

    In [2]: %autoreload 2

    In [3]: import util

    In [4]: import simulate

    In [5]: precincts, seed = util.load_precincts("data/config-single-precinct-3.json")

    In [6]: p = precincts[0]

    In [7]: simulate.find_number_of_booths(p, ntrials=20, target_wait_time=10, max_num_booths=10, seed=seed)
    Out[7]: (8, 4.630351652014112)

    In [8]: simulate.find_number_of_booths(p, ntrials=20, target_wait_time=10, max_num_booths=5, seed=seed)
    Out[8]: (0, None)


You can also run the function from the command-line by using the ``--target-wait-time``
and ``--max-num-booths`` options::

    $ python3 simulate.py data/config-single-precinct-3.json --max-num-booths 10 --target-wait-time 10
    Precinct 'Sahara Square' can achieve average waiting time of 4.63 with 8 booths

    $ python3 simulate.py data/config-single-precinct-3.json --max-num-booths 5 --target-wait-time 10
    The target wait time (10.00) is infeasible in precinct 'Sahara Square' with 5 or less booths

Note: The command-line tool sets the value of ``ntrials`` to ``20``.

And, finally, you can run the automated tests for this function like this::

    $ py.test -xv -k find


Grading
-------

Programming assignments will be graded according to a general rubric.
Specifically, we will assign points for completeness, correctness,
design, and style.  (For more details on the categories, see our `PA
Rubric page <../rubric.html>`__.)

The exact weights for each category will vary from one assignment to
another.  For this assignment, the weights will be:

* **Completeness:** 50%
* **Correctness:** 10%
* **Design:** 30%
* **Style:** 10%

Design is an important part of this assignment, because you have to design
the voter generator class, the voter class, and the precinct class. There
are a few things we will be looking out for in your design:

* The only purpose of the voter generator class is to generate voters.
  If any simulation code is included inside the voter generator, that
  is a sign of a bad design.
* Similarly, the voter generator is the only class that is allowed
  to construct voter objects. Creating voter objects anywhere else typically
  denotes a bad design.
* This assignment does not require implementing a voting booth class.
  If you do, you are likely overthinking how to implement a precinct.
* Your ``simulate_election_day`` function (Tasks 1+2), ``find_avg_wait_time``
  function (Task 3), and ``find_number_of_booths`` function (Task 4) should
  be fairly short and straightforward, with absolutely no simulation
  logic in them (which should, instead, be in your precinct class). If
  your classes are poorly designed, then these functions will tend to be
  long and convoluted.

You must include header comments in all the methods you implement
*except* getters/setters and properties. You do not need to include
a docstring for a class (but it certainly doesn't hurt to do so).


Obtaining your test score
~~~~~~~~~~~~~~~~~~~~~~~~~

Like previous assignments, you can obtain your test score by running ``py.test`` followed by ``../common/grader.py``.

Continuous Integration
----------------------

Continuous Integration (CI) is available for this assignment. For more details, please
see our `Continuous Integration <../../ci/index.html>`__ page. We strongly
encourage you to use CI in this and subsequent assignments.


Submission
----------

See `these submission instructions <startup-alone.html#submission>`__  if you are working alone.

See `these submission instructions <startup-pair-continuing.html#submission>`__ if you are working in the same pair as for PA #2 or PA #3.

See `these submission instructions <startup-pair.html#submission>`__ if you are working in a *NEW* pair.


.. _random:  https://docs.python.org/3/library/random.html
.. _queue:    https://docs.python.org/3/library/queue.html

.. |clearfloat|  raw:: html

    <div class="clearer"></div>