CMSC 12300 and CAPP 30123: Computer Science with Applications III
University of Chicago, Spring 2015
Syllabus
This syllabus, last updated April 3, shows my plans for the course. I have posted it on the web to facilitate updates over the course of the quarter. I reserve the right to make changes; for instance, I may rearrange or change lecture topics in response to student interests.
Course Staff
Instructor: Matthew Wachs
Email: mwachs
Office: RY 175-A
Office Hours: Monday and Wednesday after class and Tuesday, 2-3pm. (Please don't hesitate to email if you'd like to meet at another time.)
TA: Nick Seltzer
Email: nseltzer
Office: RY 176
Office Hours: M 10am, W 4pm
Meeting Times
Lecture: MWF 1:30pm-2:20pm, RY 276
Lab: R 1:30pm-2:50pm, CSIL 4
Lab: R 3:00pm-4:20pm, CSIL 1
Labs begin in Week 2 and will be held every week up to and including Week 6. There will be one more lab in Week 7, 8, or 9. I will decide when to schedule this lab based on the most appropriate timing relative to the lectures.
Course Components
The course consists of:
- Lectures
- Labs: give you the opportunity to practice with real Big Data environments and get immediate feedback and assistance from course staff (not graded)
- Programming assignments: give you more in-depth practice on selected material than would be possible in the labs; three programming assignments are planned, each contributing 10% towards your final grade for a cumulative weight of 30%
- Project: an extended, open-ended team project, similar to the project last quarter; you will submit a proposal, report on progress, and give a final presentation alongside submitting your code. The theme of the project is testing hypotheses on large data sets. The project counts for 70% of your final grade.
Topics
This course is about Big Data: the challenges of working with it, and the solutions that have been developed to overcome them. Topics include:
- Algorithms:
- considerations and changes needed when moving from smaller data sets to large ones
- analysis of computational time and memory requirements and how they scale with data size
- methods and conceptual frameworks for dividing up the work of an algorithm into separate tasks that can be run in parallel on multiple computing resources
- C++: an expansion of your C skills to the additional features and the different philosophy adopted in this language
- Big Data and cloud computing environments and programming paradigms:
- Amazon Web Services
- MapReduce
- Hadoop
- Multi-process and multi-threaded programming
- Concurrency and synchronization primitives (mutexes, condition variables)
- MPI
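As a rough illustration of two of the topics above (dividing the work of an algorithm into separate tasks, and the MapReduce paradigm), here is a minimal word-count sketch in Python. It is not course material: the function names (map_chunk, merge_counts, split_into_chunks, word_count) are hypothetical, and it uses only the standard library rather than Hadoop or mrjob. Threads in CPython will not speed up CPU-bound counting like this; the point is only the shape of the computation, in which independent map tasks are followed by a reduce step that merges their partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def split_into_chunks(lines, n_chunks):
    """Divide the input into roughly equal, independent pieces."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def word_count(lines, n_workers=4):
    """Run the map step on each chunk in a worker, then reduce."""
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())
```

Because each chunk is counted without reference to any other chunk, the map tasks share no state and need no locking; only the final merge sees all of the partial results.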
Tentative Schedule
- Week 1
- Lecture 1 3/30: Introduction, challenges of scale
- Lecture 2 4/1: Amazon Web Services, space and time analysis I
- Lecture 3 4/3: Space and time analysis II, algorithms I
- Week 2
- Lecture 4 4/6: Algorithms II
- Lecture 5 4/8: Algorithms III
- Lab 4/9: AWS
- Lecture 6 4/10: MapReduce I
- Week 3
- Lecture 7 4/13: MapReduce II; project proposals tentatively due
- Lecture 8 4/15: Tentative proposal presentations
- Lab 4/16: Algorithms
- Lecture 9 4/17: Tentative proposal presentations
- Week 4
- Lecture 10 4/20: C++ I
- Lecture 11 4/22: C++ II
- Lab 4/23: C++ or mrjob / S3
- Lecture 12 4/24: C++ III
- Week 5
- Lecture 13 4/27: C++ IV
- Lecture 14 4/29: C++ V
- Lab 4/30: C++ or mrjob / S3
- Lecture 15 5/1: C++ & concurrency & parallelism I
- 5/1 PA1 (MapReduce) due
- Week 6
- Lecture 16 5/4: Concurrency & parallelism II
- Lecture 17 5/6: Concurrency & parallelism III
- Lab 5/7: Concurrency
- Lecture 18 5/8: Concurrency & parallelism IV
- Week 7
- Lecture 19 5/11: Tentative progress reports
- Lecture 20 5/13: Tentative progress reports
- Lab 5/14: MPI or no lab
- Lecture 21 5/15: Big Data environments, other topics I
- 5/15 PA2 (URL shortening service) due
- Week 8
- Lecture 22 5/18: Big Data environments, other topics II
- Lecture 23 5/20: Big Data environments, other topics III
- Lab 5/21: MPI or no lab
- Lecture 24 5/22: Big Data environments, other topics IV
- Week 9
- 5/25: Memorial Day, no lecture
- Lecture 25 5/27: Big Data environments, other topics V
- Lab 5/28: MPI or no lab
- Lecture 26 5/29: Big Data environments, other topics VI
- 5/29 PA3 (Digit recognition) due
- Week 10
- Lecture 27 6/1: Final presentations; code due, at least for students graduating at convocation
- Lecture 28 6/3: Final presentations
Academic Honesty
The University's rules on academic honesty apply to this course just as they did in the prior courses in the sequence, and they will be rigorously and rigidly enforced. If you have any doubts, questions, or concerns, please ask, preferably in advance.
Textbook
This course does not have a textbook. However, this book is highly relevant to the course and is available online at no cost; you may find it of value.