Statistics

Departmental Counselor: Stephen M. Stigler, E 102, 702-8328 or 702-8333,
stigler@galton.uchicago.edu
World Wide Web: http://galton.uchicago.edu/


Program of Study

The modern science of statistics involves the invention, study, and development of principles and methods for modeling uncertainty through mathematical probability; for designing experiments, surveys, and observational programs; and for analyzing and interpreting empirical data. Mathematics plays a major role in all statistical activity, whether of an abstract nature or dealing with specific techniques for analyzing data. Statistics is an excellent field for students with strong mathematical skills and an interest in applying these skills to problems in the natural and social sciences. A program leading to the Bachelor of Arts degree in statistics offers excellent coverage of the principles and methods of statistics in combination with a solid training in mathematics. In addition, there is considerable elective freedom enabling interested students to examine those areas of knowledge in the biological, physical, and social sciences that are often subjected to detailed statistical analysis. The concentration provides a base for graduate study in statistics or in other subjects with strong quantitative components. An honors program is available. Students considering graduate study in statistics or related fields are encouraged to discuss their programs with the departmental counselor at an early stage, whether or not they plan to receive an undergraduate degree in statistics.

Statistics Courses for Students in Other Concentrations. Courses at the 20000 level are designed to provide instruction in statistics, probability, and statistical computation for students from all parts of the University. These courses differ in emphasis on theory or methods, on the mathematical level, and in the direction of applications. Most of the introductory courses make serious use of computers to exemplify and explore statistical concepts and methods. The nature and extent of computer work vary according to the course and instructor. No previous experience with computers is expected for any first course. Statistics courses are not mathematics courses, but the mathematics prerequisites provide a useful guide to the level of a statistics course. In general, students are advised to take the course with the highest prerequisites that they can meet and, when possible, to take a two- or three-quarter sequence rather than a one-quarter course. In particular, students who have taken calculus should not take Statistics 20000 but, rather, should take Statistics 22000, 24400-24500, or 25100.

Introductory Courses and Sequences. Statistics 22000 is the usual first course in statistics, providing a general introduction to statistical concepts, techniques, and applications to data analysis and to problems in the design, analysis, and interpretation of experiments and observational programs. Computers are used throughout the course. One or two sections of Statistics 22000 in the autumn, winter, and spring quarters use examples drawn from economics and business and a selection of texts and topics that are more appropriate for concentrators in economics. A score of 4 or 5 on the AP test in statistics yields credit for Statistics 22000. Statistics 20000 is an alternative that has no calculus prerequisite and places less emphasis on exploring statistical techniques. Statistics 25100 is an introductory course in probability.

Statistics 24400-24500 is recommended for students who want a thorough introduction to statistical theory and methodology. No prior training in statistics or probability is required for Statistics 24400-24500. However, Statistics 20000 or 22000 would provide a helpful background; students who have taken one of these are encouraged to take Statistics 24400-24500 if they want more extensive training in the basis of statistical methods. Statistics 24600 is offered as a supplement to the 24400-24500 sequence.

Statistics 24400-24500 and 25100 form the core of the statistics concentration; these courses are also recommended as a cognate sequence for concentrators in the quantitative sciences and mathematics. It would be preferable, but not mandatory, to take Statistics 25100 after 24400-24500; accordingly, 25100 is now offered in the spring quarter to permit the completion of this cognate sequence in one year.

For students more interested in exploring methods and their applications, Statistics 22200, 22400, and 22600 are recommended. These are complementary second courses that emphasize some class of methods for the analysis of data. They may be taken in any order. Each presumes a previous course in statistics (Statistics 22000 or equivalent) and experience using computers in data analysis (as in Statistics 22000). The emphasis is on linear models and experimental design in Statistics 22200, multiple regression and least squares in Statistics 22400, and categorical data analysis in Statistics 22600.

For students who have completed Statistics 24500, many graduate courses in statistics offer opportunities for further study of statistical theory, methods, and applications. The introductory probability course (Statistics 25100) may be taken separately from any statistics courses. Statistics 25100 can be supplemented with more advanced probability courses, such as Statistics 31200, 31300, or 38100-38300. NOTE: College students may register for a number of other 30000-level courses in statistics. For further information, consult the instructor, the departmental counselor, or the Department of Statistics Web site (http://galton.uchicago.edu/).

Program Requirements

Degree Programs. Students in the statistics program should meet the general education requirements in the mathematical sciences with courses in calculus. Concentration requirements include four additional prescribed mathematics courses and five prescribed statistics courses; the four mathematics courses should be completed by the end of the third year. Additional requirements include one course in computer science and two more courses in mathematics, statistics, or computer science. The five required statistics courses must include Statistics 24400-24500, Statistics 25100, and either Statistics 22400 or 34300. The fifth required statistics course may be either Statistics 22000 or another course such as Statistics 22200, 22600, 24000, 24600, 30100, 31200, or 32100. If Statistics 22000 or 24000 is included as part of the program, it should be taken before Statistics 24400 is taken. Candidates should be sure their course program has the approval of the departmental counselor. NOTE: Students completing concentrations in both statistics and economics may replace Mathematics 20000-20100 and Mathematics 25000/25500 with Mathematics 19500-19600 and Mathematics 20300.

Summary of Requirements

General Education     MATH 13100-13200, 15100-15200, or 16100-16200†

Concentration      1  MATH 13300, 15300, or 16300†
                   2  MATH 20000-20100, 20300-20400, or 20700-20800
                   1  MATH 25000 or 25500
                   5  STAT 24400, 24500, 25100, and either 22400 or 34300;
                      and one other approved statistics course
                   1  CMSC 10500 or 11500
                   2  approved courses in mathematics, statistics, or computer science
                  --
                  12

† Credit may be granted by examination.

Grading. Subject to College and divisional regulations, and with the consent of the instructor, all students except concentrators in statistics may register for regular letter grades or P/F grades in any 20000-level statistics course. A grade of P is given only for work of C- quality or higher. Incompletes are allowed only in cases of serious emergency. To meet the concentration requirement in statistics, a grade of at least C- must be earned in each of the twelve courses; a grade of P is not acceptable for meeting these concentration requirements.

Honors. The B.A. with honors is awarded to students who have a grade point average of 3.0 or better overall and 3.25 or better in the twelve required courses in the concentration and who, in addition to these courses, complete an approved honors paper (Statistics 29900). Interested students who meet the program requirements should see the departmental counselor before the end of their third year in the College.

Faculty

Yali Amit, Professor, Department of Statistics and the College

Zhiyi Chi, Assistant Professor, Department of Statistics and the College

Steven P. Lalley, Professor, Department of Statistics and the College

Michael D. Larsen, Senior Lecturer, Department of Statistics and the College

Peter McCullagh, Ralph and Mary Otis Isham Professor, Department of Statistics and the College

Mary Sara McPeek, Associate Professor, Department of Statistics, Committee on Genetics, and the College

Xiao-Li Meng, Professor, Department of Statistics and the College

Per A. Mykland, Professor, Department of Statistics and the College

Dan L. Nicolae, Assistant Professor, Department of Statistics and the College

Partha Niyogi, Assistant Professor, Departments of Computer Science and Statistics, and the College

Michael L. Stein, Professor, Department of Statistics and the College; Chairman, Department of Statistics

Stephen M. Stigler, Ernest DeWitt Burton Distinguished Service Professor, Department of Statistics, Committee on Conceptual & Historical Studies of Science, and the College

Ronald A. Thisted, Professor, Departments of Health Studies, Statistics, and Anesthesia & Critical Care, and the College; Chairman, Department of Health Studies

Michael J. Wichura, Associate Professor, Department of Statistics and the College

Kirk M. Wolter, Professor, Department of Statistics

Courses

For a description of the numbering guidelines for the following courses, consult the section on reading the catalog on page 15.

12500. Quantitative Methods in Environmental Science (=ENST 12500, NTSC 12500, STAT 12500). PQ: NTSC 12400 or consent of instructor. This course studies mathematical, statistical, and computational approaches to scientific issues raised previously in this sequence. The three principal tools are differential equations as a way to model a changing world, probability theory as a way to quantify uncertainty, and computer simulations as a way to understand environmental processes. M. Stein. Winter.

20000. Elementary Statistics. PQ: MATH 10600, placement into 13100 or higher, or satisfactory performance on a special elementary diagnostic mathematics examination. This course meets one of the general education requirements in the mathematical sciences. NOTE: If STAT 20000 is used to meet the general education requirement, it may not also be used to count toward a concentration requirement. This course is an introduction to statistical concepts and methods for the collection, presentation, analysis, and interpretation of data. Elements of sampling, simple techniques for analysis of means, proportions, and linear association are used to illustrate both effective and fallacious uses of statistics. Staff. Autumn, Winter, Spring.

22000. Statistical Methods and Their Applications. PQ: MATH 15200 or equivalent. This course is an introduction to statistical techniques and methods of data analysis, including the use of computers. Examples are drawn from the biological, physical, and social sciences. Students are required to apply the techniques discussed to data drawn from actual research. Topics include data description, graphical techniques, exploratory data analyses, random variation and sampling, one- and two-sample problems, the analysis of variance, linear regression, and analysis of discrete data. One or more sections of STAT 22000 use examples drawn from economics and business and a selection of texts and topics that are more appropriate for concentrators in economics. Staff. Summer, Autumn, Winter, Spring.

22200. Linear Models and Experimental Design. PQ: STAT 22000 or equivalent. This course covers principles and techniques for the analysis of experimental data and the planning of the statistical aspects of experiments, surveys, and observational programs. Topics may include linear and nonlinear models; analysis of variance and response surface analysis; randomization, blocking, and factorial designs; fractional replication and confounding; incorporation of covariate information; design and analysis of sample surveys; designs subject to constraints; split-plot and nested experiments; and components of variance. Staff. Spring.

22400. Applied Regression Analysis. PQ: STAT 22000 or equivalent. This course is an introduction to the methods and applications of fitting and interpreting multiple regression models. The primary emphasis is on the method of least squares and its many varieties. Topics include the examination of residuals, the transformation of data, strategies and criteria for the selection of a regression equation, the use of dummy variables, tests of fit, nonlinear models, biases due to excluded variables and measurement error, and the use and interpretation of computer package regression programs. The techniques discussed are illustrated by many real examples involving data from both the physical and social sciences. Matrix notation is introduced as needed. Staff. Autumn.

22600. Analysis of Categorical Data. PQ: STAT 22000 or equivalent. This course covers statistical methods for the analysis of structured, counted data. Topics discussed may include Poisson, multinomial, and product-multinomial sampling models; chi-square and likelihood ratio tests; log-linear models for cross-classified counted data, including models for data with ordinal categories and log-multiplicative models; logistic regression and logit linear models; and measures of association. Applications in the social and biological sciences are considered, and the interpretation of models and fits, rather than mathematical details of computational procedures, is emphasized. The computer is used throughout the course. Staff. Winter.

24000. Probability and Statistics for the Natural Sciences. PQ: MATH 20100 or 19600; and CHEM 11300 or 12300, or PHYS 12300, 13300, or 14300. This course is an introduction to those topics in probability and statistics most relevant to experimental sciences, particularly the physical sciences. Probability topics include the central limit theorem and rules of probability, random variables, means, variances, and correlations. Statistics topics include propagation of errors, inference for means, and regression analysis for experimental data. In addition, topics in linear algebra (e.g., vector spaces, projection, and eigenvalues and eigenvectors) are studied within the context of regression. Connections of statistical methods to Fourier series and other mathematical methods commonly used in the physical sciences may also be made. Staff. Spring.

24100. Probability and Statistics for the Natural Sciences. PQ: MATH 20100 or 19600; and CHEM 11300 or 12300, or PHYS 12300, 13300, or 14300. This course carries 50 units of credit. STAT 24100 is the first five weeks of STAT 24000. Staff. Spring.

24400-24500. Statistical Theory and Methods I, II. PQ: MATH 15300 or equivalent. Some previous experience with statistics recommended but not required. This course is a systematic introduction to the principles and techniques of statistics, with emphasis on the analysis of experimental data. Topics include theoretical and empirical frequency distributions; binomial, Poisson, normal, and other standard distributions; random variables and probability distributions; principles of inference including Bayesian inference, maximum likelihood estimation, hypothesis testing, and confidence intervals; and analysis of counted data, analysis of variance, least squares, and multiple and logistic regression. Computers are used throughout the sequence. Staff. Autumn, Winter.

24600. Statistical Theory and Methods III. PQ: STAT 24400-24500 or equivalent. Knowledge of probability distributions, random variables, and estimation techniques such as maximum likelihood, as covered in STAT 24400-24500, is assumed. This course is intended to be the third part of a sequence that introduces mathematical statistics and probability. In this quarter, the impact of missing data on statistical analyses is considered. Probability models, methods of estimation and inference, and applications to data are studied for various examples. Algorithms for iterative maximum likelihood estimation, such as the Expectation-Maximization (EM) and Newton-Raphson algorithms, and for Bayesian computation, such as Data Augmentation and Markov chain Monte Carlo methods, are introduced. Staff. Spring.

24700/31000. Mathematical and Statistical Methods for the Neuro-Sciences II (=MATH 32100, STAT 24700/31000). PQ: Students must have completed the equivalent of one year of college calculus and a course in linear algebra such as MATH 25000 and preferably a course in differential equations such as MATH 27300, and at least one course in neurobiology such as BIOS 14106 or 24236, or NURB 31800. This course is for students interested in computational and theoretical neuroscience. It introduces various mathematical and statistical ideas and techniques used in the analysis of brain mechanisms. It treats statistical methods important in understanding nervous system function. It includes basic concepts of mathematical probability, information theory, discrete Markov processes, and time series. Staff. Winter.

25100. Introduction to Mathematical Probability. PQ: MATH 20000 or 20300, or consent of instructor. This course covers fundamentals and axioms; combinatorial probability; conditional probability and independence; binomial, Poisson, and normal distributions; the law of large numbers and the central limit theorem; and random variables and generating functions. Staff. Spring.

26700/36700. History of Statistics (=CFSC 32900, HIPS 25600, STAT 26700/36700). PQ: Prior statistics course. This course covers topics in the history of statistics, from the eleventh century to the middle of the twentieth century. The emphasis is on the period from 1650 to 1950, and on the mathematical developments in the theory of probability and how they came to be used in the sciences, both to quantify uncertainty in observational data and as a conceptual framework for scientific theories. The course includes broad views of the development of the subject, and closer looks at specific people and investigations, including reanalyses of historical data. S. Stigler. Spring.

29700. Undergraduate Research. PQ: Consent of faculty adviser and departmental counselor. Students are required to submit the College Reading and Research Course Form. Open to students concentrating in statistics and to nonconcentrators. May be taken either for a P/F grade or for a quality grade. This course consists of reading and research in an area of statistics or probability under the guidance of a faculty member. A written report must be submitted at the end of the quarter. Staff. Autumn, Winter, Spring.

29900. Bachelor's Paper. PQ: Consent of faculty adviser and departmental counselor. Students are required to submit the College Reading and Research Course Form. Open to students concentrating in statistics. May be taken P/N or P/F. This course consists of reading and research in an area of statistics or probability under the guidance of a faculty member, leading to a bachelor's paper. The paper must be submitted at the end of the quarter. Staff. Autumn, Winter, Spring.

For more information on 30000-level courses, consult the departmental counselor. Updated departmental and course information can be found on the Department of Statistics Web site (http://galton.uchicago.edu/).

30100-30200. Mathematical Statistics. PQ: STAT 30400 or consent of instructor. This course surveys the mathematical structure of modern statistics. Topics include statistical models, methods for parameter estimation, comparison of estimators, efficiency, confidence sets, theory of hypothesis tests, elements of linear hypothesis theory, analysis of discrete data, and an introduction to Bayesian analysis. Staff. Winter, Spring.

30400. Distribution Theory. PQ: STAT 24500 and MATH 20500, or consent of instructor. This course covers methods of deriving, characterizing, displaying, approximating, and comparing distributions. Topics include algebra by computer (Maple and Macsyma), standard distributions (uniform, normal, beta, gamma, F, t, Cauchy, Poisson, binomial, and hypergeometric), moments and cumulants, characteristic functions, exponential families, the Pearson system, Edgeworth and saddlepoint approximations, and Laplace's method. Staff. Autumn.

31200. Introduction to Stochastic Processes I. PQ: STAT 25100, and MATH 20100 or 20400. This course is an introduction to stochastic processes not requiring measure theory. Topics include branching processes, recurrent events, renewal theory, random walks, Markov chains, and Poisson and birth-and-death processes. Staff. Winter.

31300. Introduction to Stochastic Processes II. PQ: STAT 31200 or consent of instructor. This course is a sequel to STAT 31200. Topics covered include continuous-time Markov chains (birth-and-death processes and queues), an introduction to discrete-time martingales, and Brownian motion and diffusions. Stochastic ordering and Poisson approximations may also be discussed. The emphasis is on defining the processes and calculating or approximating various related probabilities. The measure-theoretic aspects of these processes are not covered rigorously. Staff. Spring.

32900. Applied Multivariate Analysis (=GSBC 42400, STAT 32900). PQ: STAT 22000 or equivalent. This course is an introduction to multivariate analysis. Topics include principal component analysis, multidimensional scaling, discriminant analysis, canonical correlation analysis, and cluster analysis. Staff. Spring.

34300. Applied Linear Statistical Methods. PQ: STAT 24500 and MATH 25000, or equivalents. This course is an introduction to the theory, methods, and applications of fitting and interpreting multiple regression models. Topics include the examination of residuals, the transformation of data, strategies and criteria for the selection of a regression equation, nonlinear models, biases due to excluded variables and measurement error, and the use and interpretation of computer package regression programs. The theoretical basis of the methods, the relation to linear algebra, and the effects of violations of assumptions are studied. Techniques discussed are illustrated by examples involving both physical and social sciences data. Staff. Autumn.

35000-35100. Epidemiology (=HSTD 31000-31100, STAT 35000-35100). PQ: Consent of instructor. The topic of this course is the quantitative study of the spread of diseases in a population. Staff. Autumn, Spring.

35600. Introduction to Survival Analysis (=HSTD 33100, STAT 35600). PQ: Consent of instructor. This course covers the analysis of longitudinal data on patients. Staff. Winter.

38100. Measure-Theoretic Probability I. PQ: STAT 31300 or consent of instructor. This course provides a detailed, rigorous treatment of probability from the point of view of measure theory, as well as existence theorems, integration and expected values, characteristic functions, moment problems, limit laws, Radon-Nikodym derivatives, and conditional probabilities. Staff. Autumn.