- Get a MySQL account. For specific instructions on
how to get a MySQL account, login to MySQL, create relations,
etc. please, refer to Getting
Started With MySQL. Get acquainted with MySQL, change your
initial password, experiment with some simple commands, and check out
the help command. This part is for your enjoyment only -
there is nothing to hand in.
- Create relations for your DatingBase based on your
final relational schema from Part 2. In addition to creating the
appropriate attributes and types, please declare primary keys
for your relations. The attribute types supported by MySQL are
described in the MySQL online documentation.
Turn in a script showing a MySQL session in which your relations
are created successfully. Also show, for each relation, the result of
the mysql DESCRIBE command once the relation has been
created: for a relation T, type "DESCRIBE T;".
Please see Recording Your Session from the Getting Started With
MySQL document for a guide to preparing output to be submitted
for this and subsequent project parts.
- Develop a small amount (10-20 tuples per relation) of real
data and load it into your relations using the SQL load
command. See Loading Data in MySQL for information on how to bulk-load data.
Turn in a script showing the loading of your data into MySQL.
- Develop a substantial amount of data for your database and
load it into your relations using the SQL load command. To
create the data, write a program in any programming language you like
that creates large files of records in a format acceptable to the
MySQL bulk loader, then load the data into your DatingBase
relations. If you are using real data, your program will need to
transform the data into files of records conforming to your DatingBase
schema. The rest of you will write a program to fabricate
data: your program will generate either random or nonrandom (e.g.,
sequential) records conforming to your schema. Note that it is both
fine and expected for your data values--strings especially--to be
meaningless gibberish. The point of generating large amounts of data
is so that you can experiment with a database of realistic size,
rather than the small "toy" databases often used in classes. The data
you generate and load should be on the order of:
- At least two relations with thousands of tuples
- At least one additional relation with hundreds of tuples
If the semantics of your schema includes relations that are
expected to be relatively small (e.g., states in the US),
it is fine to use some small relations, but please ensure that you
have relations of the sizes prescribed above as well. When writing a
program to fabricate data, there are two important points to keep in
mind:
- Be sure not to generate duplicate values for your key attributes.
- Your DatingBase almost certainly includes relations that are
expected to join with each other. For example, you may have an
Users relation with attribute ID that's expected to
join with attribute userID in relation Dates. In
generating data, be sure to generate values that actually do
join--otherwise all of your interesting queries will have empty
results! One way to guarantee joinability
is to generate the values in one
relation, then use the generated values in one relation to select
joining values for the other relation.
For example, you could generate user IDs
first (either sequentially or randomly), then use these
IDs to fill in the userID values in the Dates
relation.
Turn in your program code for generating or transforming data, a small
sample of the records generated for each relation (5 or so records per
relation), and a script showing the loading of your data into MySQL.
- Develop and test:
- At least six SQL data retrieval (select) commands.
- At least two each of the four types of SQL data modification
commands: insert a single tuple, insert a subquery,
delete, update.
Please note:
- For this assignment you will be invoking your SQL commands
interactively through mysql, as described in the document Getting
Started With MySQL. Of course you should certainly build a
script file, rather than typing in the queries each time you run them.
- Please write "interesting" queries. You should try to use
most or all of the SQL constructs discussed in class and in the
textbook (subqueries, aggregates, set operators, etc.). You will not
receive full credit if your queries and modifications are all extremely
simple.
- We suggest that you experiment with your SQL commands on your
small hand-created database before running them on the large database
for which you generated data. Initial debugging is much easier when
you're operating on small amounts of data. Once you're confident that
your commands are working, try them on your large database. We do
expect that the commands you turn in work properly on both your small
and large databases.
- If you discover that most or all of your "interesting" queries
return an empty answer on your large database, check whether you
followed the instructions in part (b) for generating data values
that join properly. You may need to modify your data generator.
- Turn in a copy of all of your SQL commands, along with a
script illustrating their execution on either your large
database. Your script should be sufficient to convince us that your
commands run successfully, but you can and should truncate query
results after a few lines. Please do not turn in query results that
are hundreds of lines long.
Problem Set
You will complete the problem set online using Gradiance. Log in at
http://www.gradiance.com/services.
The name of the homework is CS 235 Homework 3. All questions in this
problem set are multiple choice. However, to answer them correctly
you will need to work out their long (general) answers. A correct
answer is worth 3 points. You lose a point for each incorrect
answer. You can take the assignment as many times as you like; only
your last score will count. Note that you will probably get slightly
different questions each time you take it.