Project 1.1: Packets and ARP
CS233/333 - Networks and Distributed Systems
Fall, 1999.
Due Date: Friday, October 15, 11:59 p.m.
1. Introduction
In this part of the project, you are going to implement the code
needed to read and write raw packets of network data. This
functionality is normally performed by the operating system and your
implementation will closely mimic this functionality (although at a
much higher level). In addition, you will be implementing ARP
(Address Resolution Protocol) which is used to associate IP addresses
with ethernet hardware addresses (and which is needed in later
stages of the project).
2. The Big Picture
In order to support the network, the operating system needs to be
able to send and receive packets. Sending packets is reasonably straightforward--just send them to the network device. On the other hand, receiving packets
is a little more complicated. First, packets may be received at any
time--usually resulting in an I/O interrupt whenever a packet is received.
Second, packets may arrive at a high rate. This is
especially true if the machine is using a high speed network
device (gigabit ethernet for instance). Third, once a packet is
received, the system needs to decode it and figure out what kind of
network protocol it belongs to. For instance, IPX and TCP/IP traffic may
appear on the same network interface. In this case, it would be up
to the system to dispatch an appropriate handler for each of the
supported network protocols and to discard packets belonging to any
unsupported protocol. Finally, in some cases, packets may
be arriving faster than the system can decode and dispatch them.
Needless to say, these factors lead to a number of design
considerations:
- The network interrupt handler needs to be able to pull packets off
of the network device as fast as possible. In some cases, packets
may be arriving faster than the system can process them. In these
cases, unprocessed packets are simply placed on a queue for later
processing (with the assumption that the system will eventually
catch up).
- The network code may be running under certain memory constraints.
If packets are arriving at a high rate and memory is limited, packets
may have to be dropped.
- A machine may be supporting different network protocols
on the same interface (and packets corresponding to all of these
protocols can appear at any time and in any order).
These tasks are sometimes implemented by breaking up the network
handling code into two pieces: a top-half and a bottom-half. The
top-half handler is responsible for reading raw packets off of the
interface and placing them on an incoming packet queue. In order to
collect packets arriving at a high rate, the top-half handler
is optimized for speed--in other words, it does very little work other
than reading received packets off of the network interface and saving
them for later processing (it should also be noted that in OS terms,
the top-half handler is a device interrupt handler that runs with
system interrupts disabled--thus there are other reasons for wanting
it to run as fast as possible). The bottom-half handler, on the other
hand, does all of the work. It removes incoming packets from the
receive queue, decodes their protocol, and dispatches an appropriate
handling function (or discards the packet if the protocol is unknown).
Unlike the top-half handler, the bottom half has a lot more work to do
and is generally slower. However, it is also interruptible--meaning
that the top-half handler can continue to read packets off of the
network even if the bottom half handler is busy.
Assuming that you aren't completely confused by now, what we've
really got here is the classic "producer-consumer" problem from operating
systems. The top-half handler is producing packets and placing them
on an input queue. The bottom-half handler is consuming packets and
sending them on to additional code for processing.
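As a rough illustration of the producer-consumer arrangement described above, here is a minimal sketch written for a current Python 3 interpreter. It is not the project solution: the packets are fake byte strings rather than data from the network device, and the names queue, queue_cond, DONE, producer, consumer, and demo are hypothetical.

```python
import threading

queue = []                          # shared packet queue (hypothetical name)
queue_cond = threading.Condition()  # a lock plus wait/notify for the queue
DONE = None                         # sentinel that tells the consumer to stop

def producer(packets):
    """Stand-in for the top half: append each packet, then wake the consumer."""
    for pkt in packets:
        with queue_cond:
            queue.append(pkt)
            queue_cond.notify()
    with queue_cond:
        queue.append(DONE)
        queue_cond.notify()

def consumer(results):
    """Stand-in for the bottom half: sleep until a packet is queued."""
    while True:
        with queue_cond:
            while not queue:
                queue_cond.wait()
            pkt = queue.pop(0)
        if pkt is DONE:
            break
        results.append(len(pkt))    # "processing" happens outside the lock

def demo():
    results = []
    packets = [b"a" * 60, b"b" * 1514, b"c" * 342]
    t_cons = threading.Thread(target=consumer, args=(results,))
    t_prod = threading.Thread(target=producer, args=(packets,))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return results
```

Calling demo() returns the packet lengths in arrival order. The consumer holds the lock only long enough to remove a packet, so the producer is never blocked while a packet is being processed.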
3. Packet Handling
Your first task is to implement a module for low-level packet handling. Your
module should be placed in a file "packet.py" and contain the network top-half and bottom-half
handlers as well as a function for sending packets.
3.1 The Network Top-Half
To implement the top-half handler, you need to create a thread that
simply reads packets off of the network interface and stores them in a
receive queue.
To get you started on this, please refer to the handouts on Python
thread programming and consider the following example (which contains
about 90% of the code needed for the top-half handler):
# Thread for reading raw network packets
import threading
import eth
import proto

class Network_top(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.setDaemon(1)

    # Read packets
    def run(self):
        while 1:
            data = eth.read_packet()      # Read a raw packet
            print "Got packet of %d bytes" % (len(data),)

# Now create and start the top-handler thread
top = Network_top()
top.start()
Now, modify the above code as follows:
- Create a packet queue. The queue should be globally accessible
as the bottom-half handler will need to look at it.
Note: this should require about one line of code.
- Modify the run() method to place received packets in the queue
unless the number of packets currently in the queue exceeds the value of proto.MAX_QUEUE.
In this case, the received packet should just be discarded. You
may want to print a debugging message when this happens (just so
you know what's going on).
- Create some sort of lock (mutual exclusion, semaphore, etc...) for
the packet queue and modify the run() method to acquire/release the
lock whenever a packet is inserted into the queue. Note: the
threading module already defines a variety of locks you can use for this purpose.
- You may want to try things out at this point. If you take your
code up to the networks lab, you should be able to grab raw packets
going across the network. Try using FTP or a browser to see a lot of
packets go flying across the network.
- If running in the Linux lab, a number of test programs will be made available
for you to test your packet handler (they will be announced on the mailing list).
Make sure you fully understand this first part before
proceeding any further. Your code shouldn't be very big (<50 lines of code).
However, there may be a bit of a conceptual barrier to overcome if you
have never worked with threads before.
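The queue-limit and locking steps listed above might look roughly like the following sketch (Python 3; MAX_QUEUE here is a made-up stand-in for proto.MAX_QUEUE, and packet_queue, queue_lock, and enqueue are hypothetical names). It follows the wording above literally, so a packet is dropped only when the queue already holds more than MAX_QUEUE entries. Change the test to >= if you want a hard cap.

```python
import threading

MAX_QUEUE = 4                   # stand-in for proto.MAX_QUEUE (value assumed)
packet_queue = []               # shared with the bottom-half handler
queue_lock = threading.Lock()   # guards every access to packet_queue

def enqueue(pkt):
    """Queue pkt unless the queue is over the limit; return True if queued."""
    with queue_lock:
        if len(packet_queue) > MAX_QUEUE:
            return False        # over the limit: the packet is discarded
        packet_queue.append(pkt)
        return True
```

The run() method would call something like enqueue() once for each packet it reads and print a debugging message whenever the result is false.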
3.2 The Network Bottom Half
The network bottom half handler is going to decode network
packets and send them on to handler functions (if defined).
- Define a thread Network_bottom in a manner analogous
to the Network_top thread above.
- Define a run() method that simply waits for a packet to appear in
the packet queue and starts processing when a packet is received.
Whenever a packet is available, it should be removed from the queue
as the first step in its decoding. When you do this, you need to make
sure you use the queue locking mechanism created earlier. In
other words, you need to make sure that the top-half and bottom-half
handlers are not updating the packet queue at the same time (otherwise
you may end up with a race condition).
- Modify the run() method to examine the ethernet packet type field and determine if the
packet is encoded according to 802.3 or RFC 894 encapsulation (refer to
the handouts). To do this, you'll need to look at the length field of
the packet assuming that it's in the 802.3 format. If the length is
less than or equal to 1500 bytes, it is indeed an 802.3 packet.
Otherwise, the packet is encoded using RFC 894 and the length is
actually the protocol type number. For this project, you
should discard all packets encoded using 802.3 (although obviously you
wouldn't be able to do this in a real operating system).
- Define a function register_handler(proto, func) that
registers a handler function with a particular protocol type.
proto is an integer protocol type (as found in the type field
of an ethernet packet), and func is a handler function to
call when a packet of that type is received.
The register_handler function should do nothing more than
create an internal table mapping protocol ids to handler functions.
- Define a function unregister_handler(proto) that
unregisters a protocol handler. Basically, the opposite of
register_handler().
- Modify the run() method to compare the packet
type identifier with the protocol types registered using the
register_handler function. If a match is found,
the protocol handler function should be invoked with the
raw packet passed as an argument. If no handlers match the
packet type, discard the packet (unsupported protocol).
- Start the network top half handler and test your bottom
half handler as follows:
import packet

def ip_packet(pkt):
    print "Received IP packet ", pkt

# Start the bottom-half thread
bottom = packet.Network_bottom()
bottom.start()

# Register a handler for IP packets
packet.register_handler(0x0800, ip_packet)
Once running, try making some connections such as FTP and telnet.
You should see your ip_packet() function being called. Note:
you'll need to do this in the networks lab.
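The handler table described in the list above can be sketched with an ordinary dictionary. This is only one possible shape, written for Python 3. The names handlers and dispatch are hypothetical, and a real implementation may also want a lock around the table, since handlers can be registered while the bottom half is running.

```python
handlers = {}   # protocol type -> handler function (hypothetical table name)

def register_handler(proto, func):
    """Associate a protocol type number with a handler function."""
    handlers[proto] = func

def unregister_handler(proto):
    """Remove a handler; types that were never registered are ignored."""
    handlers.pop(proto, None)

def dispatch(ptype, pkt):
    """Call the handler for ptype; return False if the packet is discarded."""
    func = handlers.get(ptype)
    if func is None:
        return False            # unsupported protocol
    func(pkt)
    return True
```

With this layout, the bottom half's run() method only has to extract the type field and call dispatch() with the raw packet.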
Here are a few suggestions:
- The struct module can be used to simplify the decoding of packets.
- Several parts of the network packet structure are encoded as big-endian integers.
Make sure your decoder takes this into account (the struct module can take care of this
for you as well).
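As a sketch of the struct suggestion, the following Python 3 fragment decodes the 14-byte ethernet header and applies the 802.3 test described earlier. It assumes that the raw packet starts at the destination address (no preamble) and that packets are byte strings. ETH_HEADER and decode_ethernet are hypothetical names.

```python
import struct

# Destination (6 bytes), source (6 bytes), type/length (2 bytes).
# The leading "!" selects network (big-endian) byte order.
ETH_HEADER = struct.Struct("!6s6sH")

def decode_ethernet(frame):
    """Return (dst, src, ptype, payload), or None for an 802.3 frame."""
    dst, src, type_or_len = ETH_HEADER.unpack_from(frame)
    if type_or_len <= 1500:
        return None             # a length, not a type: 802.3 encapsulation
    return dst, src, type_or_len, frame[ETH_HEADER.size:]
```

For example, a frame whose thirteenth and fourteenth bytes are 0x08 0x06 decodes with a protocol type of 0x0806 (ARP).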
3.3 Sending packets
Implement a function send_packet(data) that can be used to
send a raw packet of information out of the network device. Your
function should work as follows:
- Use the eth.send_packet() function to send data to the
ethernet device.
- Maintain a global mutual exclusion lock that only allows one thread
to send data at a time. Your function should acquire the lock before
sending any data and release the lock after it has finished. This
is necessary to prevent strange program behavior in the event that
two threads attempt to send packets at the same time.
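The locking rule above can be sketched as follows (Python 3). So that the fragment runs without the lab's eth module, the device is replaced by a hypothetical stand-in, _device_send(), which just records what it is given. In packet.py the call inside the lock would be eth.send_packet(data).

```python
import threading

send_lock = threading.Lock()    # only one thread may send at a time
sent = []                       # records frames in place of a real device

def _device_send(data):
    """Hypothetical stand-in for eth.send_packet()."""
    sent.append(data)

def send_packet(data):
    """Send one raw packet while holding the global send lock."""
    with send_lock:
        _device_send(data)
```

The with statement releases the lock even if the send raises an exception, so a failed send cannot leave other threads blocked forever.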
Congratulations, if you made it this far, you have just implemented the lowest level of the
network subsystem.
4. ARP
Once you have your low-level network handlers working, your
first protocol to implement is ARP. ARP is used to associate
IP addresses with ethernet addresses on a local area network.
Please refer to the handouts for details of the protocol.
You need to implement two features. First, you need to create a
function arp(ipaddr) that takes an IP address of the form
'192.168.69.5' as an argument and returns the 6-byte ethernet address
of the device to which packets should be sent. Higher-level functions
in later stages of the project will use this function to figure out how
to encode outgoing packets in a manner that makes sure they are
delivered to the right destination. Second, you need to implement
a handler function that knows how to correctly respond to
ARP request packets. These two features are
interrelated. In particular, the arp() function works by
sending out ARP requests (which are then answered by the ARP handler
on the other machines).
4.1 ARP Packet Handler
Using your low-level packet module, define a handler function for receiving
ARP packets. ARP packets have a protocol ID of 0x0806
(available as the constant proto.ARP_PROTO). You should
register your handler using the register_handler() function
you created earlier.
Your handler function must operate roughly as follows:
- It should examine the target ethernet address of the received packet. If the
address matches the local ethernet address or is ff:ff:ff:ff:ff:ff
(broadcast), the packet should be further considered. Otherwise, the packet
should be discarded.
- Examine the ARP type field to find out what kind of ARP
packet has been received (a request or a reply).
- If the type is 1 (proto.ARP_REQUEST), examine the IP
address in the ARP request. If it matches the local IP address,
construct an appropriate ARP reply packet containing the local ethernet
address and send it back to the sender. Otherwise, discard the
packet (some other machine must have that IP address).
- If the ARP type is 2 (proto.ARP_REPLY), the packet is a
reply to a request we must have sent earlier.
In this case, the handler function should check to see if there are
any outstanding ARP requests corresponding to the IP address in the reply.
If so, they should be notified with the returned ethernet address in some
manner. If the reply does not match any pending request, the packet
should be discarded (maybe someone is doing something weird on the network).
Note: Build the handler function in pieces. In particular, it will
be hard to implement the reply handling unless you have already implemented
a substantial part of the arp() function below.
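For building and decoding the ARP packets themselves, a sketch along the following lines may help (Python 3). It covers only the 28-byte ARP message for ethernet and IPv4 as laid out in RFC 826. The 14-byte ethernet header still has to be added in front of it. ARP_PACKET, build_arp, and parse_arp are hypothetical names, and the two operation constants mirror the values of proto.ARP_REQUEST and proto.ARP_REPLY given above.

```python
import socket
import struct

# Hardware type, protocol type, hardware length, protocol length, operation,
# sender hardware address, sender IP, target hardware address, target IP.
ARP_PACKET = struct.Struct("!HHBBH6s4s6s4s")
ARP_REQUEST, ARP_REPLY = 1, 2

def build_arp(op, sender_hw, sender_ip, target_hw, target_ip):
    """Pack an ARP message; hardware type 1 is ethernet, 0x0800 is IP."""
    return ARP_PACKET.pack(1, 0x0800, 6, 4, op,
                           sender_hw, socket.inet_aton(sender_ip),
                           target_hw, socket.inet_aton(target_ip))

def parse_arp(payload):
    """Return (op, sender_hw, sender_ip, target_hw, target_ip)."""
    (_, _, _, _, op, sender_hw, sender_ip,
     target_hw, target_ip) = ARP_PACKET.unpack_from(payload)
    return (op, sender_hw, socket.inet_ntoa(sender_ip),
            target_hw, socket.inet_ntoa(target_ip))
```

A reply can be produced from a parsed request by swapping the sender and target fields and filling in the local ethernet address as the new sender.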
4.2 The arp() function
The arp(ipaddr) function takes an IP address as input and returns
the 6-byte ethernet address of the machine assigned to that IP
address. In addition, it maintains a cache of recent IP address to ethernet
mappings.
Your arp() function must operate as follows:
- arp() blocks the calling thread until it returns with a success or failure code.
(i.e., arp() is a blocking system call).
- arp() should first check in a local cache of recently returned IP->ethernet mappings.
If the IP address is already known, the cached result should be immediately returned
(no activity on the network is required). Note: it is easy to build a cache using a
Python dictionary.
- If the IP address does not correspond to a machine on the local
subnet, arp() should return the ethernet address of the local
gateway (which is either another machine or a router). Note: in order
to determine if an address is on the local subnet, use the value of
proto.NETMASK. Likewise, proto.GATEWAY contains the
IP address of the gateway machine (you may need to change this
depending on the network you are using).
- If the IP address corresponds to a machine on the local subnet, formulate
an appropriate ARP request packet and broadcast it. In order to send a
broadcast packet, you need to set the destination ethernet address to
ff:ff:ff:ff:ff:ff.
- If, after sending a request, no reply is received within 5 seconds,
retransmit the original request. If still no reply is received,
repeat one more time. If still no reply is received, the arp() function should
return with an error indicating that no such host can be found (as no machine
is responding to the given IP address). Because of this, the arp() function
should block for no more than about 15 seconds before giving up.
- When an ARP reply is successfully received, the ethernet address corresponding to
the IP address should be placed in a cache for subsequent calls (so we don't have
to keep doing this every time we want to contact another machine).
- Entries in the ARP cache should be expired after 5 minutes. This is
to prevent entries from becoming stale (if someone reconfigures a machine on
the network for instance). In order to implement this feature, you will need
to store the time at which each cache entry was saved. In addition, you
should have some sort of thread that runs periodically and cleans up old
ARP cache entries.
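One way to arrange the cache and its five-minute expiry is sketched below (Python 3). All of the names are hypothetical. The optional now argument exists only so the behavior can be checked without waiting five minutes. Normal callers would leave it out.

```python
import threading
import time

CACHE_TTL = 300.0               # five minutes, in seconds
arp_cache = {}                  # IP string -> (ethernet address, time stored)
cache_lock = threading.Lock()

def cache_store(ip, hwaddr, now=None):
    """Save a mapping together with the time at which it was learned."""
    now = time.time() if now is None else now
    with cache_lock:
        arp_cache[ip] = (hwaddr, now)

def cache_lookup(ip, now=None):
    """Return the cached address, or None if absent or expired."""
    now = time.time() if now is None else now
    with cache_lock:
        entry = arp_cache.get(ip)
        if entry is None or now - entry[1] > CACHE_TTL:
            return None
        return entry[0]

def cache_expire(now=None):
    """Remove stale entries; a cleanup thread could call this periodically."""
    now = time.time() if now is None else now
    with cache_lock:
        stale = [ip for ip, (_, t) in arp_cache.items() if now - t > CACHE_TTL]
        for ip in stale:
            del arp_cache[ip]
```

Checking the age in cache_lookup() as well as in the cleanup pass means a stale entry is never returned, even if the cleanup thread has not run recently.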
4.3 Implementation Hints
Implementing this part of the project involves some particularly
tricky (and potentially mind-bending) control flow. Here are
a few things to consider:
- arp() is a function that may be called by any number of threads
at any given time. Because of this, you will need to maintain
locks around the ARP cache. You will also need to worry about
the possibility of having multiple outstanding ARP requests active at the
same time and the remote possibility of receiving replies in a different
order than requests.
- To make the arp() function block, you will need to put the thread to
sleep immediately after an ARP request packet has been sent. If a matching
reply is received, the thread should be immediately awakened and control returned
to the caller. If no reply is received, the thread should wake up after 5 seconds
and retry, up to two more times, before giving up entirely.
- You may find the "Condition Variable" object in the threading module
to be particularly useful here.
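To make the sleep-and-wake idea concrete, here is a minimal sketch using a condition variable (Python 3, where Condition.wait_for() is available). It is not a complete arp(): the caller supplies a hypothetical send_request function in place of real packet transmission, there is no cache, and several threads asking for the same address at once would need more bookkeeping than is shown. The names pending, pending_cond, wait_for_reply, and reply_received are all made up for the example.

```python
import threading

pending = {}                        # IP -> None while waiting, else the answer
pending_cond = threading.Condition()

def wait_for_reply(ip, send_request, timeout=5.0, attempts=3):
    """Send a request, sleep up to `timeout` seconds, and retry on silence."""
    with pending_cond:
        pending.setdefault(ip, None)        # mark the request as outstanding
        try:
            for _ in range(attempts):
                send_request(ip)
                # wait_for() releases the lock while asleep and returns
                # False if the timeout passes without the test succeeding.
                if pending_cond.wait_for(
                        lambda: pending.get(ip) is not None, timeout):
                    return pending[ip]
            raise RuntimeError("No response. Timed out.")
        finally:
            pending.pop(ip, None)

def reply_received(ip, hwaddr):
    """Called by the ARP handler; return False if nobody asked for ip."""
    with pending_cond:
        if ip not in pending:
            return False                    # unexpected reply: discard
        pending[ip] = hwaddr
        pending_cond.notify_all()
        return True
```

With the default arguments, a request that is never answered is sent three times and the caller gives up after about 15 seconds, which matches the retry schedule in section 4.2.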
5. What to hand in
You will be creating two modules:
- packet.py. This file contains the network top-half and bottom-half handlers and the packet send function.
There should be three publicly accessible functions:
- register_handler(type,func). Register a protocol handler.
- unregister_handler(type). Unregister a protocol handler.
- send_packet(data). Send a raw network packet.
In addition, the packet.py file should automatically start the network handling threads when it
is loaded.
- arp.py. The ARP protocol handler. This file should contain the ARP protocol handler function
and a single publicly accessible function:
- arp(ipaddr). Return the ethernet address of the machine with IP address ipaddr.
The arp.py file should automatically register its protocol handler using the packet.register_handler()
function when it is loaded.
6. Testing and grading
Testing this part of the assignment is easy. You should be able to load the arp module and
determine the ethernet addresses of various machines on the network. For example:
>>> import eth
>>> import arp
>>> e = arp.arp("192.168.69.4")
>>> print eth.eth_address_string(e)
00:c0:80:7f:1a:20
>>> f = arp.arp("192.168.69.70")
Traceback (innermost last):
  File "<stdin>", line 1, in ?
RuntimeError: No response. Timed out.
>>> g = arp.arp("128.135.11.100")
>>> print eth.eth_address_string(g)
72:c8:a2:1f:37:42 # (should be eth address of the gateway)
>>>
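The third lookup in the session above returns the gateway's address because 128.135.11.100 is not on the local subnet. The subnet test itself can be sketched as follows (Python 3). The sketch assumes that the addresses and the netmask are dotted-quad strings. The local address 192.168.69.5 and the mask 255.255.255.0 used in the usage note are made-up values, and proto.NETMASK may be stored in a different form. The names ip_to_int and on_local_subnet are hypothetical.

```python
import socket
import struct

def ip_to_int(addr):
    """Convert a dotted-quad string to a 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(addr))[0]

def on_local_subnet(addr, local_ip, netmask):
    """True if addr and local_ip agree on every bit selected by netmask."""
    mask = ip_to_int(netmask)
    return (ip_to_int(addr) & mask) == (ip_to_int(local_ip) & mask)
```

Under those assumed values, on_local_subnet("192.168.69.4", "192.168.69.5", "255.255.255.0") is true and the same call with "128.135.11.100" is false, so arp() would resolve the gateway's address for the latter.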
Finally, this is potentially the most difficult part of the project since it involves a
bunch of new stuff (locks, threads, condition variables, etc...). Please come see me or
post a message to cs333@cs.uchicago.edu for help.
Note: your solution to this part of the project should involve very little code (my
solution was less than 300 lines of code not counting comments--you can probably do
better than this).