Project 3: Remote Procedure Call
CS233/333 - Networks and Distributed Systems
Fall, 1999.
Due Date: Friday, November 19, 11:59 p.m.
1. Overview
One difficulty of using remote procedure call (RPC) with languages
like C and C++ is that the programmer is required to precisely define
an interface specification (usually in a separate file such as the .x
files used by rpcgen). The primary purpose of this file is to
identify all of the datatypes and calling conventions for the functions
that will be used. Such a file is, in a sense, necessary,
since there is no easy way for a C program to discover the
calling conventions of a library once it has been compiled (i.e., calling
conventions and type information are not encoded into C libraries or executables).
More modern languages like Java and Python, however, have the ability to
perform "introspection." That is, a program can inspect the contents
of classes and modules at run time and, in some cases, even dynamically
generate code to be executed later. As a result, it is possible to support
an RPC-like mechanism without requiring a special interface specification
or even a stub generator.
In this project, you are going to create a module rpc.py that
uses the introspection capabilities of Python to allow remote procedure calls
to be made to arbitrary Python modules.
2. The RPC Module
Your rpc module should work for both servers wanting to provide
an RPC service and clients that want to connect to those servers. You
only need to implement three functions:
- register_module(module, portmap).
Creates a new RPC service and registers it with the portmapper. module is any valid Python
module (already loaded with import) and portmap is a tuple
of the form (ipaddr, port) containing the IP address and port number
of the portmapping service (which may live on any machine).
- serve_forever().
This function is used by servers to start the RPC runtime.
Once started, this function listens for incoming client connections and
dispatches procedures to any of the modules that have been previously registered
using the register_module() function.
- remote_import(modulename, portmap).
Perform a remote module import.
modulename is the name of the module as a string and portmap
is a tuple of the form (ipaddr,port) containing the IP
address and port number of the portmapping server. This function
should return a module object with a collection of stubs that behave
exactly like local procedures, but which actually execute on a remote
server (see the example below). If the remote module name is unknown
to the portmapper or an error occurs, this function should raise the
ImportError exception.
Here is a simple example showing how your module is supposed to work:
- Start the portmap server on some machine and port. For example:
% python portmap.py 10000
Portmapper started on gargoyle.cs.uchicago.edu:10000
- Write a simple server that provides an RPC service. For example, the following
code turns the "string" module into an RPC service:
# RPC server
import string
import rpc
rpc.register_module(string,("gargoyle.cs.uchicago.edu",10000))
rpc.serve_forever()
- Now, start your server on some machine and try to connect to it with a client
as follows:
>>> import rpc
>>> rpcstring = rpc.remote_import("string",("gargoyle.cs.uchicago.edu",10000))
Remote module 'string' loaded.
>>> rpcstring.split("Hello world") # Makes a remote procedure call
['Hello', 'world']
>>> rpcstring.split(3)
Traceback (innermost last):
File "<stdin>", line 1, in ?
TypeError: argument 1: expected read-only character buffer, int found
>>>
- And that's it. Hopefully, the big picture is clear.
The next few sections describe the pieces you need to implement.
3. The portmapper
The first thing you should implement is the portmapping server.
All this server does is keep track of remote services. Both RPC servers and clients
will contact the portmapper.
Registering a service
When an RPC server wants to publish the availability of a remote module, it should contact
the portmapper and send it the following information:
- The remote server's IP address and port number (on which it will receive
connections by clients).
- The remote module name as a string.
- A list of strings containing all of the procedure names exported by
that module.
This information should then be saved by the portmapper so that it can
later hand it out to clients.
To illustrate, suppose you executed the following code on rustler.cs.uchicago.edu:
# RPC server
import string
import rpc
rpc.register_module(string,("gargoyle.cs.uchicago.edu",10000))
This might contact the portmapper and send it the following information:
("rustler.cs.uchicago.edu", 18736, "string", [ 'atof',
'atoi', 'atol', 'capitalize', 'capwords', 'center', 'count',
'expandtabs', 'find', 'index', 'join', 'joinfields', 'ljust', 'lower',
'lstrip', 'maketrans', 'replace', 'rfind', 'rindex', 'rjust',
'rstrip', 'split', 'splitfields', 'strip', 'swapcase', 'translate',
'upper', 'zfill'])
In this case 18736 is the port number selected by the server for
client connections (the value is completely arbitrary). The list of
strings starting with 'atof' simply contains all of the function names
contained within the string module.
Requesting a service
When a client requests a service using the remote_import() function,
it contacts the portmapper and asks it for a particular module name. If
the portmapper knows about that module, it should send the above
information back to the client (at which point it is up to the client to contact
the remote service directly). Otherwise, if the portmapper does not know about
the module, it should return an error to the client.
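The register and lookup behavior described above can be sketched as two operations on a dictionary. This is only an illustration in modern Python 3 syntax, not a required design: the request tuples ("register", ...) and ("lookup", name), the "ok"/"error" reply tags, and the name handle_request are all assumptions made for this sketch.

```python
# Hypothetical portmapper core: a dictionary keyed by module name.
# The message shapes below are illustrative assumptions, not a required protocol.

def handle_request(registry, request):
    """Apply one request to the registry and return the reply tuple."""
    op = request[0]
    if op == "register":
        # request = ("register", host, port, modulename, [function names])
        _, host, port, modname, funcs = request
        registry[modname] = (host, port, modname, list(funcs))
        return ("ok", None)
    if op == "lookup":
        # request = ("lookup", modulename)
        modname = request[1]
        if modname in registry:
            return ("ok", registry[modname])
        return ("error", "unknown module: %s" % modname)
    return ("error", "unknown operation: %r" % (op,))


registry = {}
handle_request(registry, ("register", "rustler.cs.uchicago.edu", 18736,
                          "string", ["atoi", "split"]))
print(handle_request(registry, ("lookup", "string")))
print(handle_request(registry, ("lookup", "nosuchmodule")))
```

Keeping the registry logic separate from the socket code makes it easy to test without a network.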
Implementation details of the portmapper
Implementation of the portmapper should be relatively straightforward:
- Implement the portmapper in a file portmap.py.
- Use the socket module to write a simple portmapping server using TCP
sockets.
- Create a simple protocol for both registering and requesting services.
(It should be a very simple protocol indeed).
- Use the pickle module to marshal all of the data sent to and
from the portmapper. (See the details on pickle in the Python book).
- It should be possible to run the portmapper as follows:
% python portmap.py 10000
where 10000 is a port number (you can pick whatever port number you want).
I would be very surprised if your implementation of the portmapper is
more than 50 lines of code. Think simple (it's not much more than a
dictionary, a socket connection, a few calls to the pickle module).
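One detail the "very simple protocol" has to settle is where a message ends, since TCP delivers a byte stream rather than discrete messages. A common approach, shown here as a sketch in modern Python 3, is to prefix each pickled object with its length. The helper names send_msg, recv_exact, and recv_msg and the 4-byte prefix are assumptions of this sketch, not part of the assignment.

```python
import pickle
import socket
import struct

def send_msg(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_exact(sock, n):
    """Read exactly n bytes, or raise EOFError if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError("connection closed")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Receive one length-prefixed pickled object."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))

# Demonstrate on a connected pair of local sockets.
a, b = socket.socketpair()
send_msg(a, ("lookup", "string"))
print(recv_msg(b))
a.close()
b.close()
```

Keep in mind that unpickling data from an untrusted peer can execute arbitrary code, so this is suitable for a class project on a trusted network only.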
4. The RPC server runtime
After you have the portmapper working, create a file rpc.py
and implement the register_module() and serve_forever()
functions. Follow the basic steps below:
- The rpc module, when first imported, should create a TCP socket
for receiving incoming connections. Unlike your past assignments, you
should set this up so that it binds the socket to any available port
number. This is easy:
# rpc.py
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("",0)) # Assign to any port (bind takes a single (host, port) tuple)
hostname = socket.gethostname()
port = sock.getsockname()[1]
print "socket open at %s:%d" % (hostname,port)
- Implement the register_module(module, portmap) function. This is also
relatively easy. The module argument should be a module already loaded using
the import statement. The first thing your function should do is
examine the contents of the module and locate all its function names. For example:
def register_module(module, portmap):
    mod_name = module.__name__
    func_names = [ ]
    for name, object in module.__dict__.items():
        if callable(object) and name[0] != '_':
            func_names.append(name)
The first statement simply extracts the string name of a module.
The callable() function tests an object to see if it
is callable like a function. The check for a leading underscore ('_')
is needed because Python treats all function names of this form as
private (when importing modules).
Next, contact the portmapper and send it a message containing the local
socket address, module name, and list of functions you extracted above. The
easy way to do this is to simply package everything up in a tuple (as shown
earlier), run it through the pickle module and send it to the portmapper.
Note: when contacting the portmapper, you should use a *different* socket
than the one created above.
Finally, have the rpc module keep an internal record of all of the modules
that have been registered. You will need this to handle incoming requests.
This should be pretty easy, just keep a global dictionary mapping module names
to module objects. For example:
modules = { }
...
def register_module(module,portmap):
    ...
    modules[module.__name__] = module
    ...
- Implement the serve_forever() function. This function should
start listening for incoming connections on the socket created in the first step.
When a connection arrives, it should look at the incoming message and try to
dispatch one of the functions contained in the registered modules. To do
this, you will first need to come up with an appropriate message format. One
option is to simply pass a tuple containing something like this:
(modulename, functionname, args)
Where modulename is a string containing the module name, functionname
is a string containing the name of the function, and args is a tuple containing
the function arguments. Given this information, it is easy to invoke a function
in the module. Simply do this:
result = apply(modules[modulename].__dict__[functionname],args)
(Read the Python book to know exactly what's going on here). Of course, you will probably
want to add in some security and error checking too.
As for the result, you need to follow a similar procedure. The result
of a function should be packaged in a message format suitable for
sending back to the client. Furthermore, if an error occurs, you
should propagate the error back to the client (exceptions on the server
should generate exceptions on the client). Note: an exception on the server
should not cause the server to stop running so you will need to do some
exception handling using try and except.
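The dispatch and error-propagation steps described above can be sketched as follows, in modern Python 3 (where func(*args) replaces apply()). The names dispatch, unpack_reply, and exported, the ("ok", ...)/("error", ...) reply format, and the use of the math module as the registered module are all assumptions of this sketch, not a required design.

```python
import math
import pickle

def dispatch(modules, exported, request):
    """Run one (modulename, functionname, args) request; never raises."""
    try:
        modname, funcname, args = request
        # Only call functions that were explicitly registered.
        if funcname not in exported.get(modname, ()):
            raise AttributeError("%s.%s is not exported" % (modname, funcname))
        func = getattr(modules[modname], funcname)
        return ("ok", func(*args))
    except Exception as exc:
        # Ship the exception object back instead of crashing the server.
        return ("error", exc)

def unpack_reply(reply):
    """Client side: return the result or re-raise the server's exception."""
    status, value = reply
    if status == "error":
        raise value
    return value

modules = {"math": math}
exported = {"math": ["sqrt", "floor"]}

reply = dispatch(modules, exported, ("math", "sqrt", (16.0,)))
# Round-trip through pickle to mimic sending the reply over the network.
print(unpack_reply(pickle.loads(pickle.dumps(reply))))   # 4.0
```

Because built-in exception objects can be pickled, the client can re-raise the same exception type that occurred on the server.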
Now, a few miscellaneous implementation notes:
- You should use TCP for client connections. Furthermore, a client
will keep its TCP connection open the entire time it is connected.
Thus, you will need to figure out how to manage multiple requests and
responses going across the same connection (it shouldn't be hard).
- Your RPC runtime should either use fork() or threads to allow
multiple clients to connect simultaneously.
- Use the pickle module to marshal and unmarshal data sent across
the TCP connection. This will simplify your life considerably.
- It is an error for the RPC runtime to allow a client to execute any
function not explicitly registered with the portmapper. Thus, you will need
to add some error checking to make sure the client isn't accessing private
functions, procedures in non-registered modules, etc...
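The first two notes above (one long-lived connection per client, several clients at once) can be sketched with one thread per connection, each looping over requests until the client disconnects. This is a modern Python 3 sketch under stated assumptions: the helper names send_msg, recv_msg, serve_connection, and accept_loop, the length-prefixed framing, and the echo handler are illustrative only.

```python
import pickle
import socket
import struct
import threading

def send_msg(sock, obj):
    """Send one pickled object with a 4-byte length prefix."""
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_exact(sock, n):
    """Read exactly n bytes; return None if the peer closed the connection."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            return None
        data += chunk
    return data

def recv_msg(sock):
    """Receive one framed object, or None on end of stream."""
    header = recv_exact(sock, 4)
    if header is None:
        return None
    (length,) = struct.unpack("!I", header)
    body = recv_exact(sock, length)
    return None if body is None else pickle.loads(body)

def serve_connection(conn, handler):
    """Answer requests on one connection until the client disconnects."""
    with conn:
        while True:
            request = recv_msg(conn)
            if request is None:
                break
            send_msg(conn, handler(request))

def accept_loop(listener, handler):
    """Give every incoming client its own daemon thread."""
    while True:
        conn, _addr = listener.accept()
        threading.Thread(target=serve_connection, args=(conn, handler),
                         daemon=True).start()

# Demo: an echo handler stands in for the real dispatch function.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
threading.Thread(target=accept_loop,
                 args=(listener, lambda request: ("ok", request)),
                 daemon=True).start()

client = socket.create_connection(listener.getsockname())
for n in (1, 2):                 # two requests over the same connection
    send_msg(client, ("echo", n))
    print(recv_msg(client))
client.close()
```

A fork()-based server would have the same structure, with the child process running serve_connection instead of a thread.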
5. Client Stubs
Finally, your last step is to figure out the problem of client stubs. First,
keep in mind that the real implementations of the remote functions live on the
server. A stub is just a little function that lives on the client that
takes the arguments passed by the user and packages them up into a network message
to be sent to the server. It then needs to be able to receive the return
result. In Python, this is going to be relatively easy. A stub might
look something similar to the following:
# atoi stub
def atoi(*args):
    message = ("atoi", args)
    send_message(server, message)
The *args is used to collect any sequence of function arguments
into a tuple. Once in this form, it's pretty easy to package. We'll just stuff
them all into a network message and let the server figure them out (if the arguments
are invalid, it should send an error back).
Now, there are a few somewhat complicated details to work out on the
client. However, the general idea of how this is going to work is
that when the client contacts the portmapper, it is going to receive a
list of function names. Using this list, you are going to dynamically
generate a set of stub functions as a big Python string. Then, using
some magic, you are going to execute this string to generate
stub functions "on the fly"--at which point you will have a working
stub module.
Here's how to proceed:
- First, in the rpc.py file (created earlier), write an internal function send_message()
that knows how to send an RPC call message to an RPC server. Basically, this should create a
message that is compatible with what is expected by the RPC server runtime code written
earlier. Use the pickle module to marshal data.
- Now, start implementing the remote_import(name, portmap)
function. The first thing that this function should do is establish a connection with
the portmap server and see if it knows anything about the module name. If it does,
you should receive the IP address, port number, module name, and a list of function names
exported by the RPC service back in its response. Otherwise, raise an ImportError to
indicate an unknown module.
- Next, using the list of function names returned by the portmapper, continue to work on
remote_import() by writing some code
that generates a big string containing Python function definitions. It's going to look a little
funky, but pieces of it might look like this:
print """
def %s(*args):
    rpc.send_message(__sock__, "%s","%s",args)
""" % (fname, modulename, fname)
(Note: the specific details will depend on how you have implemented things.) The
__sock__ variable is just a name I picked to indicate the socket
connection to the remote server. You will need to have something like this somewhere.
- Now, the magic begins. Continuing to work on the remote_import() function,
you need to construct a new Python module out of thin air. The way to do this
is as follows:
import new
m = new.module(modname) # Create a module object
Next, execute all of the stub code you placed into a string inside this new
module like this:
# stubs = string of stub code
# execute stub code inside the newly created module
exec stubs in m.__dict__, m.__dict__
- Patch up the newly created module with any additional information needed to make it work
(such as having a reference to the socket object connected to the remote server).
- Return the stub module back to the caller.
- And that's it.
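For readers following along in Python 3, where the new module and the exec statement no longer exist, the steps above can be sketched with types.ModuleType and the exec() function. The names make_stub_module, STUB_TEMPLATE, and _rpc_call are assumptions of this sketch, and a plain function stands in for the real network transport so the example runs without a server.

```python
import keyword
import types

# Each generated stub forwards its arguments to a transport function that
# is patched into the stub module under the (hypothetical) name _rpc_call.
STUB_TEMPLATE = '''
def {fname}(*args):
    return _rpc_call({modname!r}, {fname!r}, args)
'''

def make_stub_module(modname, func_names, call):
    """Build a module whose functions forward to call(modname, fname, args)."""
    for fname in func_names:
        # Names come from the network, so refuse anything that is not a
        # plain identifier before pasting it into source code.
        if not fname.isidentifier() or keyword.iskeyword(fname):
            raise ValueError("bad function name: %r" % (fname,))
    source = "".join(STUB_TEMPLATE.format(fname=f, modname=modname)
                     for f in func_names)
    module = types.ModuleType(modname)      # create an empty module object
    module.__dict__["_rpc_call"] = call     # patch in the transport
    exec(source, module.__dict__)           # define the stubs inside it
    return module

# Stand-in transport: just report what would have been sent.
def fake_call(modname, fname, args):
    return (modname, fname, args)

m = make_stub_module("string", ["atoi", "split"], fake_call)
print(m.split("Hello world"))   # ('string', 'split', ('Hello world',))
```

In a real client, the transport would send the request over the socket connected to the RPC server, wait for the reply, and either return the result or raise the exception it carries.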
Now, if you make it this far (as I'm sure you will), you will definitely know something
about how Python operates.
6. Testing
Testing is pretty simple. I should be able to start your portmapper on some machine.
Then, using code similar to that listed at the beginning, I should be able to publish
modules on the network and import modules remotely.
7. Extra Credit
- Make your RPC module support Python's keyword arguments.
- Use UDP instead of TCP.
- Modify Python's import statement to automatically contact the
portmapper and remotely load a module if available.