Project 2: Stupid Firewall Tricks

CS233/333 - Networks and Distributed Systems
Fall, 1999.

Due Date: Friday, November 12, 11:59 p.m.

1. Overview

Dave (a.k.a. that sneaky bastard) has decided that it would be an interesting idea to layer a home-grown file-transfer protocol over HTTP so that he can transfer files through a firewall designed to block all outgoing data connections except for certain HTTP requests. Your job, of course, is to stop him from doing this by modifying the firewall to operate in an even more sneaky manner.

Although this sounds like a crazy thing to be doing, internal network security is a pretty serious issue for many organizations--especially when the issue of corporate espionage arises. For instance, many companies try to block all outgoing connections to keep hackers from breaking in through a backdoor (perhaps a modem hanging off someone's machine) and transmitting files offsite through a normal Internet connection.

2. Proxy Servers

To complete the assignment, you will need to write a simple HTTP proxy server that receives HTTP requests, examines their contents, and either forwards the request to the requested server or rejects the request with an HTTP error if it looks suspicious.

When a browser forwards a request to an HTTP proxy server, the request looks something like this:

GET http://www.yahoo.com/index.html HTTP/1.0
Proxy-Connection: Keep-Alive
User-Agent: Mozilla/4.61 [en] (X11; U; SunOS 5.6 sun4u)
Host: www.yahoo.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8

The thing to notice is that the request line contains the full URL (http://www.yahoo.com/index.html) rather than just a path. To make your proxy server work, you need to read in the request, strip the scheme and target hostname out of the request line, and pass the request on to the target destination. For example, upon reading the above request, your proxy would open up a connection with www.yahoo.com and send something like the following:

GET /index.html HTTP/1.0
If-Modified-Since: Wed, 03 Nov 1999 21:42:24 GMT; length=77865
User-Agent: Mozilla/4.61 [en] (X11; U; SunOS 5.6 sun4u)
Host: www.yahoo.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8

(notice how the 'http://www.yahoo.com' has been stripped off of the request line).
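
The request-line rewrite described above might be sketched as follows. This is only an illustration, assuming Python 3 and its standard urllib.parse module; the function name rewrite_request_line and the default port of 80 are assumptions made for the example, not part of the assignment.

```python
from urllib.parse import urlsplit


def rewrite_request_line(line):
    """Split a proxy-style request line into (host, port, origin-style line).

    Hypothetical helper: the name and return shape are illustrative only.
    """
    method, url, version = line.split()
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("not an absolute http URL: %r" % url)
    # Keep the path (and query string, if any); drop scheme and host.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    port = parts.port or 80  # assumed default when the URL names no port
    return parts.hostname, port, "%s %s %s" % (method, path, version)


host, port, new_line = rewrite_request_line(
    "GET http://www.yahoo.com/index.html HTTP/1.0")
print(host, port, new_line)
```

The returned host and port tell the proxy where to connect, and the rewritten line replaces the first line of the request before the headers are forwarded.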

Once the proxy has sent the request, the server will send its response back to the proxy server. To make everything work, you will need to route the data returned by the server back through the socket connection between the proxy and the client's browser.
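
One way to route the server's data back to the browser is a simple copy loop between the two sockets. The sketch below assumes Python 3 and that the server closes its end when the response is complete; the function name relay and the 4096-byte buffer size are assumptions for the example.

```python
import socket


def relay(source, sink, bufsize=4096):
    """Copy bytes from source to sink until source reports end-of-stream.

    Hypothetical helper: returns the number of bytes copied, which a
    proxy could also feed into its own bookkeeping.
    """
    total = 0
    while True:
        chunk = source.recv(bufsize)
        if not chunk:  # empty read means the peer closed the connection
            break
        sink.sendall(chunk)
        total += len(chunk)
    return total
```

In a proxy, source would be the socket connected to the server and sink the socket connected to the browser.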

In the event that the proxy rejects an outgoing request, it should send a response directly back to the browser. For example, instead of contacting the server, the following response should be sent back to the client:

HTTP/1.0 403 Forbidden
Content-type: text/html

<html><body>
<h1>Forbidden</h1>

Your request has been rejected.  Stop surfing and get back to work!
</body></html>
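
A rejection response like the one above could be assembled as a byte string before being written to the client socket. This is a minimal sketch assuming Python 3; the function name forbidden_response is hypothetical, and the Content-Length header is an addition that does not appear in the example above.

```python
def forbidden_response(message="Your request has been rejected."):
    """Build a complete HTTP/1.0 403 response as bytes (illustrative only)."""
    body = (
        "<html><body>\r\n"
        "<h1>Forbidden</h1>\r\n"
        "%s\r\n"
        "</body></html>\r\n" % message
    ).encode("iso-8859-1")
    head = (
        "HTTP/1.0 403 Forbidden\r\n"
        "Content-type: text/html\r\n"
        "Content-Length: %d\r\n"  # assumption: not in the handout's example
        "\r\n" % len(body)
    ).encode("ascii")
    return head + body
```

The proxy would send the result with something like client_socket.sendall(forbidden_response()) and then close the connection, where client_socket is whatever name your code uses for the browser's socket.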

3. Your task

Your task is to write a proxy that examines outgoing HTTP requests and tries to determine whether each one is legitimate. In doing this, your proxy should *NOT* interfere with normal web activity, including the posting of forms, cookies, and so on. However, you are free to interpret what "normal" means in this case. For instance, it is probably not normal for someone to post a form containing 2 megabytes of data or to send a continuous rapid stream of HTTP requests over a prolonged period of time.
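
To make the two examples above concrete, here is one possible sketch of a size check combined with a sliding-window request counter. It is not a recommended solution: the class name ClientHistory and all three thresholds are assumptions chosen only for illustration, and a real proxy would need to tune them.

```python
import time
from collections import deque

MAX_BODY_BYTES = 2 * 1024 * 1024  # assumed cap, from the "2 megabytes" example
MAX_REQUESTS = 30                 # assumed: requests tolerated per window
WINDOW_SECONDS = 10.0             # assumed window length


class ClientHistory:
    """Per-client record of recent request times (hypothetical helper)."""

    def __init__(self):
        self.times = deque()

    def looks_suspicious(self, body_length, now=None):
        """Return True if this request trips either heuristic."""
        now = time.monotonic() if now is None else now
        if body_length >= MAX_BODY_BYTES:
            return True
        # Record this request, then forget anything older than the window.
        self.times.append(now)
        while self.times and now - self.times[0] > WINDOW_SECONDS:
            self.times.popleft()
        return len(self.times) > MAX_REQUESTS
```

A proxy might keep one ClientHistory per client address and call looks_suspicious with the request's Content-Length before deciding whether to forward it.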

As for solving the problem, you will need to be creative as I am not looking for a specific "solution." Think about the problem from the other point of view (i.e., how would you go about defeating the firewall). Then come up with ways of modifying your proxy server to block access.

4. Logistics and hints

Here are a few things to keep in mind:

Please make use of the Python socket module.
Python contains a number of modules for parsing HTTP headers (the rfc822 module for instance). Take a look at the Python book for details.
To avoid having to worry about HTTP/1.1, you should examine all server responses and rewrite their HTTP protocol version to HTTP/1.0 before sending data back to the browser.
Go ahead and close the connection with the server and the client after each request (this will probably make your life easier).
You can make your browser route all of its HTTP requests through your proxy by adjusting its preference settings (usually under advanced options someplace).
It is okay for your solution to catch an outgoing data transmission mid-stream and cut it off (for example, if your proxy believes that someone is transferring outgoing data, it can choose to reject all further connection requests even if a certain number of requests successfully made it through at first).
Do not assume that the server to which requests are being sent is a web-server (in fact, for my hack, it will be a program that simply emulates a web-server, but which is really designed to receive incoming data).
You may find it useful to come up with a set of heuristics for determining if a request is valid or not (in fact, it is unlikely that you will be able to devise a simple algorithm that is 100% effective).
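
The hint above about rewriting the protocol version of server responses could look something like this. The sketch assumes Python 3 and that the proxy has already read at least the first line of the response into a bytes object; the function name downgrade_status_line is hypothetical.

```python
def downgrade_status_line(response_head):
    """Rewrite the version on the first line of a response to HTTP/1.0.

    Only the status line is touched; headers and body pass through
    unchanged. Data that does not start with "HTTP/" is returned as-is,
    since the remote end may not be a real web server.
    """
    first, sep, rest = response_head.partition(b"\r\n")
    if first.startswith(b"HTTP/"):
        _version, space, tail = first.partition(b" ")
        first = b"HTTP/1.0" + space + tail
    return first + sep + rest
```

The function would be applied to the first chunk received from the server before that chunk is passed along to the browser.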

5. Testing

Testing is simple:

First, I should be able to hook your proxy server up to Netscape and browse the web in a more-or-less normal fashion.
I will then hook your proxy up to a number of implementations of my file transfer program and see how successful or unsuccessful I am at sending a large file between two machines. Of course, I'm not going to tell you how these programs work in advance--other than that they don't do anything that's not defined by the normal HTTP protocol.

Your grade will be largely determined by whether or not you implement a working proxy server. After that, cleverness and the effectiveness of your solution in foiling my attempts will come into play. Good luck!