Diana Franklin

Goals for this Pre-lab

Practice binary search trees
Practice making a data structure generic using void pointers and function pointers

Remember to create a new directory in your repository for this lab. (Refer to previous labs for how if you have forgotten)

First answer the questions in lab6questions.html.

Second, read through the warmup exercises so you are ready to begin in class.

Valgrind

Valgrind is a memory mismanagement detector. It shows you memory leaks, deallocation errors, etc. Actually, Valgrind is a wrapper around a collection of tools that do many other things (e.g., cache profiling); however, here we focus on the default tool, memcheck. Memcheck can detect the following things that are useful to you right now:

Use of uninitialised memory
Reading/writing memory after it has been free'd
Reading/writing off the end of malloc'd blocks
Reading/writing inappropriate areas on the stack
Memory leaks -- where pointers to malloc'd blocks are lost forever
Mismatched use of malloc [] vs free []

Click here for a nice example with explanation for how to use valgrind and interpret its results.

You can use valgrind if you log into the main cs department computers:

ssh cnetid@linux.cs.uchicago.edu

You can check out your repository there, compile it, and test it. You may use vim to edit your code there.

Setup

I am providing several files. For each file, create the file in vi and paste in the code I provided. The files are:

Exercises

During this week, you will create some code that will be used in your project in the next two weeks. These exercise structures, trees, and function pointers.

Problem 1: word struct and functions

You will first create a struct that will be used to test the binary search tree. As you know, to use a binary search tree, the struct being stored needs to provide a compare function for insertion. The word struct contains two fields: a string and a count. You need to implement a compare function, compare_words, that compares the word as compared by string compare functions. You need to make sure that even if some letters are capitalized in the word, it still returns that they are identical. Look at the string library to help you handle comparison and case. The details of the function is in word.h.

Problem 2: binary search tree

You are writing several binary search tree functions (bst):

bst* create_bst(); // create an empty bst
int accum_all(bst*, int (*f)(void*)); // accumulate the result from this function on all nodes of a bst
void* find(bst*,void*,int (*c)(void*,void*)); // find an item in the tree using the compare function provided. If there are multiple matches, you can return any of the matching ones.
void bst_insert(bst*,void*,int (*c)(void*,void*)); // inst an item into the tree using the compare function provided. If they match, you may place the item either in the left or right subtree - your choice.

Think very carefully about the order in which you will implement these. Specifically, what is the minimum number of functions you need to implement in order to fill and print out a bst (putting in wordcount structs)? Choose the simplest functions that will give you this functionality.

You are not expected to complete this in lab. You are welcome to, but not required to, work with your duet partner after lab is over. If you wish, you may copy over your work so far and work independently after lab.

Testing

A program like this, that takes in a lot of data, is exceedingly annoying to test through command-line arguments or hard-coded tests within the code. Instead, what you would like is to have a file of words that can be opened and read.

For files that are not in a strict format, the paired functions fgets and strtok are functions that work every time. Once you learn to use them, you will be able to read in files of any format.

There are two steps to reading from a file. First, read in the file line by line. Second, split up each line into tokens or the different pieces to be dealt with separately.

In this assignment, we care about splitting up the file into words. We will define words as the contiguous sequences of adjacent alphanumeric characters. Therefore, "45, hello there" would consist consist of three words: "45", "hello", and "there".

Here is code that reads in a file line by line, print out each line with its line number.


#include <stdio.h >
#include <stdlib.h>

#define BUFSIZ 120

void display_file(char *filename)
{
  FILE *fp;
  char buf[BUFSIZ] = "Garbage";
  int i;

  // attempt to open the file, then check whether that worked.
  if ((fp = fopen(filename, "r")) == NULL)
  {
    fprintf(stderr,"Could not open file %s\n",filename);
    return (1);
  }

  // for each line of the file, read it in and then print it out
  i = 0;
  while (fgets(buf, sizeof(buf), fp) != NULL)
  {
    printf ("Line %4d: %s", i, buf);
    i++;
  }

  fclose(fp);
}

You can read more about reading files with fgets in this fgets tutorial.

The second step is to learn to use strtok to read the file. Strtok divides up a string into different parts, making separate strings. It does NOT allocate new memory - everything it does is within the original string. Therefore, if you want to use the strings later, you'll need to allocate space for them and copy them over.

To learn how to do this, read this tutorial on strtok.

Resources

Pro Tip 1: Commit often. Each time you complete a step and verify it works, commit your code!

Just checking: Did you add all your files? Don't forget!