Project 1 : The Yash Shell

CS230/330 - Operating Systems (Spring 1999).

Due : Friday, April 16, 11:59 p.m.

Overview

Not satisfied with any of the existing Unix command shells (and in the holy name of code reuse), you have decided to create your own command shell called Yash (Yet another shell). Ultimately, you will use your shell to run your own rewritten versions of emacs, the C compiler, the linker, sendmail, and Netscape. However, all of this will have to wait until Project 2. In any event, you've decided that your shell minimally needs to support program execution, I/O redirection, pipes, background processes, and signals.

In this project, you will implement the Yash shell. The goal of this project is to familiarize yourself with basic system calls related to processes, files, interrupts/signals, and interprocess communication. In project 2, you will be building an operating system kernel that implements many of these system calls.

Your shell must be implemented in C or C++ and run under Solaris.

Description

The input to Yash is a sequence of commands, each provided on a separate line of input text (or typed interactively at the keyboard). The following commands must be supported:

progname [args]

Runs the program progname with the given, possibly optional, arguments. For example:

yash % ls
foo.c
bar.c
yash % cp foo.c foo1.c
yash % rm -f foo.c

exit

Forces the shell to exit.

resume

Resumes execution of the last suspended program.

I/O Redirection

By default, yash runs programs so that input data is read from standard input and output data is written to standard output. However, this behavior can be changed using I/O redirection. I/O redirection is specified using the < and > operators at the end of a command line. For example:

progname [args] >file.out

Writes the standard output of progname to the file file.out.

progname [args] <file.in

Uses the contents of the file file.in as the standard input to program progname.
Both input and output redirection may be specified for a single command, so your shell will have to check for both and operate accordingly.

Pipes

A pipe is nothing more than a way of routing the standard output of one program to the standard input of another program. The yash shell supports pipes using the | operator as follows:

progname1 [args] | progname2 [args]

Pipes the output of program progname1 to the input of program progname2. For example:

yash % ls -l | wc
       7      56     366
yash %
For simplicity, your shell only needs to support a single pipe operator per command line (although we do not disallow the use of more than one pipe per command line). Pipes may be combined with I/O redirection as follows:
yash % grep hello <infile | wc >count
The output of a program cannot be sent to both a file and a pipe.

Background Jobs

When yash runs a program, the shell is blocked until the program terminates. However, you can put a program into the background using the & operator. For example:

progname [args] &

Detaches the program progname and runs it in the background. Control is immediately returned to the command shell where additional commands can be executed. Background jobs should continue to run even if you quit the shell before they have finished execution. In yash, there is no way to attach to background jobs.

Signals

Finally, your shell needs to catch two signals, SIGINT and SIGTSTP. The SIGINT signal is generated when a user presses Control-C on the keyboard and is normally used to terminate a running program. The SIGTSTP signal is generated when a user presses Control-Z to suspend a running program. In both cases, your shell needs to catch the signal and pass it on to the currently running program. Thus, when a user presses Control-C, the running program should be terminated, not the shell itself.

The SIGTSTP signal is used to temporarily suspend the execution of a running program. When it is received, your shell should print the message "Suspended." and go back to the command prompt as follows:

yash % myprog
Running...
<Control-Z pressed>
Suspended.
yash % ls
foo.c
bar.c
Makefile
...
yash %
To resume the execution of the last stopped program, the user should type "resume" (without a trailing period). To restart the stopped job, your shell will need to send it a SIGCONT signal.

Exit codes

When programs terminate, they return an integer exit code to your shell. If this code is non-zero, your shell should print the returned value. For example:
yash % cp foo bar
cp: cannot access foo
[ Program returned exit code 1 ]
yash % 

How to get Started

Make sure you understand the assignment before beginning any work. Now, consider the following steps as a rough guide.

Step 1 : Command line parsing

Write a function that takes a line of input text and parses it into some sort of command structure containing information about the program name, arguments, and options for I/O redirection, pipes, and background jobs. If it helps, the syntax for the shell is as follows (optional fields are in brackets):
command    :   program
           |   program | program
           |   "exit"
           |   "resume"
           ;
            
program    :   identifier [ arglist ] [ <infile ] [ >outfile ] [ & ]
Tokens and arguments are separated by white space. To simplify parsing, your shell does NOT need to support quoted strings such as the following:
yash % foobar "This is a quoted argument" 
Furthermore, you can assume that no whitespace separates the < and > operators from the filename that follows.

After you've got your command line parser working, write an infinite loop that does nothing but print the shell prompt ("yash % "), read a line of input, and pass it to your command line parser. Check the data returned from the parsing function to make sure it looks reasonable.
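
The grammar above could be parsed along the following lines. This is only a minimal sketch, not a reference solution: the names struct program, struct command, parse_line, and the MAX_ARGS limit are all hypothetical, and the sketch rejects a second pipe even though the assignment does not forbid one. It relies on the stated simplifications (no quoting, no white space between < or > and the file name). The built-in commands "exit" and "resume" simply come back as argv[0], so the caller is assumed to check for them.

```c
#include <string.h>

#define MAX_ARGS 64  /* assumed limit, not specified by the assignment */

/* Hypothetical command structure; all pointers point into the input line. */
struct program {
    char *argv[MAX_ARGS + 1]; /* NULL-terminated, ready for execvp() */
    int   argc;
    char *infile;             /* NULL if no <file was given */
    char *outfile;            /* NULL if no >file was given */
};

struct command {
    struct program prog[2];   /* prog[1] is used only when has_pipe is set */
    int has_pipe;
    int background;
};

/* Splits line in place on white space. Returns 0 on success, -1 on an
 * empty program name, too many arguments, or a second pipe operator. */
int parse_line(char *line, struct command *cmd)
{
    struct program *p = &cmd->prog[0];
    memset(cmd, 0, sizeof *cmd);   /* also leaves every argv NULL-terminated */

    for (char *tok = strtok(line, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (strcmp(tok, "|") == 0) {
            if (cmd->has_pipe || p->argc == 0)
                return -1;
            cmd->has_pipe = 1;
            p = &cmd->prog[1];
        } else if (strcmp(tok, "&") == 0) {
            cmd->background = 1;
        } else if (tok[0] == '<' && tok[1] != '\0') {
            p->infile = tok + 1;   /* skip the '<' */
        } else if (tok[0] == '>' && tok[1] != '\0') {
            p->outfile = tok + 1;  /* skip the '>' */
        } else {
            if (p->argc == MAX_ARGS)
                return -1;
            p->argv[p->argc++] = tok;
        }
    }
    return (p->argc > 0) ? 0 : -1;
}
```

Because strtok() writes into the buffer it is given, the line must be a modifiable array and must stay alive for as long as the command structure is in use.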

Step 2 : Make your shell run programs

Once you're satisfied with the parser, modify the command loop to execute programs. You will need to use fork() and one of the exec() family of functions to do this. While the program is running, the shell process should wait for it to complete by calling wait(). The shell should also check the exit code returned by the program and print a message if it is nonzero. Note: do not pick the exit code out of the status word by hand. Use the WIFEXITED() and WEXITSTATUS() macros from <sys/wait.h>; on most Unix systems the 8-bit exit code is stored in bits 8-15 of the status set by wait(), not in the lowest 8 bits. Your code will look roughly like this:
while (1) {
    read a line of input
    cmd = parse command line
    pid = fork();
    if (pid == 0) {
        extract the program name from cmd
        ...
        exec( ... args ...);
    } else {
        wait(&status);
        check return code placed in status;
    }
}
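
To make the body of that loop concrete, here is a minimal sketch of running one foreground program. The function name run_foreground is hypothetical, execvp() is just one possible member of the exec family, and the exit value 127 for a failed exec is an assumption borrowed from common shell convention rather than something the assignment requires.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Runs argv[0] with the NULL-terminated argument vector argv and waits
 * for it. Returns the exit code, or -1 if fork() failed or the child did
 * not exit normally (for example, it was killed by a signal). */
int run_foreground(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        perror(argv[0]);      /* reached only if exec failed */
        _exit(127);
    }
    /* Parent: wait for this particular child and decode its status. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) != 0)
        printf("[ Program returned exit code %d ]\n", WEXITSTATUS(status));
    return WEXITSTATUS(status);
}
```

The child calls _exit() rather than exit() after a failed exec so that it does not flush a copy of the shell's buffered output.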

At this point you should have a working shell. Try it out by running some of your favorite Unix commands such as "ls", "cp", "ftp" and so forth. If it doesn't work, you have done something wrong.

Step 3 : Add I/O redirection

To add I/O redirection, modify the child process created by fork() by adding some code to open the input and output files specified on the command line. This should be done using the open() system call. Next, use the dup2() system call to replace the standard input or standard output streams with the appropriate file that was just opened. Finally, call exec() to run the program.
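
The open()/dup2() sequence described above might look like the sketch below. The helper names redirect_fd, redirect_input, and redirect_output are hypothetical, and the 0644 permission bits and the choice to truncate an existing output file are assumptions; the assignment does not specify either.

```c
#include <fcntl.h>
#include <unistd.h>

/* Makes target_fd (0 for standard input, 1 for standard output) refer to
 * the file at path. Returns 0 on success, -1 on failure. Intended to be
 * called in the child process, between fork() and exec(). */
int redirect_fd(const char *path, int flags, int target_fd)
{
    int fd = open(path, flags, 0644);  /* mode is used only with O_CREAT */
    if (fd < 0)
        return -1;
    if (fd == target_fd)               /* already in the right slot */
        return 0;
    if (dup2(fd, target_fd) < 0) {
        close(fd);
        return -1;
    }
    close(fd);                         /* the duplicate is all we need */
    return 0;
}

int redirect_input(const char *path)
{
    return redirect_fd(path, O_RDONLY, STDIN_FILENO);
}

int redirect_output(const char *path)
{
    return redirect_fd(path, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO);
}
```

Doing the redirection in the child keeps the shell's own standard input and output untouched, so the prompt keeps working after the program finishes.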

Step 4 : Add Background Jobs

This is a little more tricky. When a job is put into the background, the shell just starts it and forgets about it (the shell should return to the command prompt and allow more commands to be typed). However, this presents two problems. First, the background job should keep running even if the shell terminates. This means that the background job can't be a child of the shell process. Second, when the background job finishes, it needs to have its exit code collected; otherwise it turns into a zombie.

Modify your shell to run background jobs in a way that solves both of these problems. Hint : the solution involves the fork() function.
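
One common technique that fits the hint is the so-called double fork, sketched below. The assignment does not say that this is the intended solution, and the name run_background is hypothetical. The shell forks an intermediate child, which forks a grandchild and exits at once; the grandchild is then no longer a descendant the shell has to wait for, and the system's init process adopts it and collects its exit code.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Starts argv in the background. Returns 0 if the job was launched
 * (which does not guarantee that exec succeeded), -1 otherwise. */
int run_background(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Intermediate child: fork again, then exit immediately. */
        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0) {
            execvp(argv[0], argv);
            perror(argv[0]);   /* reached only if exec failed */
            _exit(127);
        }
        _exit(0);  /* orphans the grandchild; init adopts and reaps it */
    }
    /* The intermediate child exits almost at once, so this wait is short
     * and leaves no zombie behind. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A consequence of this design is that the shell never learns the background job's exit code, which matches the statement that yash has no way to attach to background jobs.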

Step 5 : Add pipes

To support pipes, you need to execute two separate programs and play some funny games with I/O to make the output of one program go to the input of the other program. To do this, you'll need to use the pipe() function and the dup2() function.
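
A minimal sketch of the "funny games" for a single pipe follows. The name run_pipeline is hypothetical, and returning the exit code of the right-hand program is an assumption modeled on common shells; the assignment does not say which program's code to report.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Runs "left | right" and waits for both programs. Returns the exit code
 * of the right-hand program, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];   /* fds[0] is the read end, fds[1] is the write end */
    int status;
    pid_t lpid, rpid;

    if (pipe(fds) < 0)
        return -1;

    lpid = fork();
    if (lpid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (lpid == 0) {
        dup2(fds[1], STDOUT_FILENO);   /* stdout now feeds the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        perror(left[0]);
        _exit(127);
    }

    rpid = fork();
    if (rpid == 0) {
        dup2(fds[0], STDIN_FILENO);    /* stdin now drains the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* The shell must close both ends, or the reader never sees end-of-file. */
    close(fds[0]);
    close(fds[1]);

    waitpid(lpid, NULL, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Forgetting to close the unused pipe ends is the usual reason a pipeline hangs: the reading program keeps waiting because some process still holds the write end open.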

Step 6 : Add signal handling

Add support for signals by writing signal handlers for the SIGINT and SIGTSTP signals. Your signal handler function will look roughly like this:
void sig_interrupt(int signo) {
  /* Re-establish the signal handler */
  signal(SIGINT, sig_interrupt);
  send the SIGINT signal to the currently running program
}
After writing this function, you need to tell the OS about it in the main() function of your shell using the signal() function like this:
int main() {
   ...
   signal(SIGINT, sig_interrupt);

   /* Start running the command processor */
   ...
}
Now, whenever the user presses Control-C, the signal handler sig_interrupt will be called.

The SIGTSTP signal is handled in a similar manner.

Step 7 : Add the resume function

After you have implemented the signal handler for SIGTSTP, modify your shell to recognize the "resume" command. This command should resume the last job that was stopped with the SIGTSTP signal. To do this you will need to send the job a SIGCONT signal using the kill() function (note : kill() is poorly named and does not necessarily kill a job).
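
A sketch of the resume step, under the assumption that the shell remembers the process ID of the last suspended job; the names wait_or_stop and resume_job are hypothetical. Note that a plain wait() does not return when a child is merely stopped, so the sketch uses waitpid() with the WUNTRACED option, which reports stopped children as well as terminated ones.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Waits for pid to terminate or stop. Returns 1 if it stopped (for
 * example on SIGTSTP), 0 if it terminated, -1 on error. */
int wait_or_stop(pid_t pid, int *status)
{
    if (waitpid(pid, status, WUNTRACED) < 0)
        return -1;
    return WIFSTOPPED(*status) ? 1 : 0;
}

/* Sends SIGCONT to a stopped job and waits for it again, so the job can
 * be suspended and resumed any number of times. */
int resume_job(pid_t pid, int *status)
{
    if (pid <= 0 || kill(pid, SIGCONT) < 0)
        return -1;
    return wait_or_stop(pid, status);
}
```

When wait_or_stop() returns 1, the shell would print "Suspended." and record pid as the job that "resume" should restart.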

Step 8 : Sit back and look at your shell in amazement.

By now, you should be ready for project 2. "Ha, bring it on!", you say.

Other Odds and Ends

Header files

You will probably need to use the following header files in your solution.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <signal.h>

Getting Help

Since this is an upper division/graduate computer science course, you are expected to do your own research regarding the usage of various system calls, header files, and libraries. Information is readily available in the man pages, Unix reference books, or on the web.

Otherwise, do not hesitate to ask a question if you are unclear about how some part of the assignment is supposed to work. Dustin Mitchell (djmitche@cs) is the TA for this class and can also answer specific questions.

Handin Procedure

Although you are encouraged to talk with your classmates, this is an individual project (project 2 is the group project). Your shell needs to be handed in via CVS as a project named "project1" and must include the following files:

Follow the same procedures as in assignment 1 to create your project directory and check it into the CVS repository. You may find it useful to do this when beginning your project as opposed to doing it at the last minute. Projects will automatically be collected from CVS shortly after the due date.