Statistics 8931, Fall 2011
Dimension Reduction
Course Instructor: R. D. Cook, 397 Ford (5-7732), email: dennis@stat.umn.edu
Office Hours: 2:30-3:30 MW, Ford 397, and by appointment.
Lectures: 1:25-2:15, Ford 127. An alternate time for some Friday lectures may be scheduled.
Text: None, but the following book may be useful for reference: "Regression Graphics: Ideas for Studying Regressions through Graphics" by R. D. Cook. The web page for the text is at http://www.stat.umn.edu/RegGraph/. This book is on reserve in the math library and is available from the statistics library as well.
Course Web Page: http://www.stat.umn.edu/~dennis/Stat8931F11/.
Homework is a required part of the course. There will be homework assignments throughout the semester, portions of which will be graded.
Grading: A grade of "B" requires satisfactory completion of the homework problems and reading assignments, along with regular attendance and participation in classroom discussion. A grade of "A" requires completion of a class project involving detailed study of some aspect of the course material. Projects, which must be approved in advance, should be underway by mid-November. Project suggestions will be given in class from time to time. You should expect to spend about ¼ of your time on the project.
Exam: None planned at present. Some project presentations might be scheduled during finals week.
Incompletes: Grades of "I" will be given only in extraordinary circumstances, and then only by written agreement between the instructor and the student.
Computing: Matlab will be the primary computing platform for this course. Some methods are available in R via Weisberg's dr package, but many of the novel methods have been written only in Matlab. The Matlab code and documentation are available at http://liliana.forzani.googlepages.com/ldr-package.
This course will consider both traditional and modern methods of dimension reduction, and attempt to construct a common framework that may suggest new theory and methods. Traditional methods to be discussed include principal components and partial least squares. More modern approaches include several methods that fall under the heading of "sufficient dimension reduction". Emphasis will be placed on contrasting historical and modern foundations. There will likely be more questions than answers.
Reduction of the dimensionality of the predictor vector is the primary goal in regressions with a univariate response. There are several reasons why dimension reduction may be useful in this context, including the possibilities of mitigating the effects of collinearity, facilitating model specification by allowing visualization of the data in low dimensions, providing a relatively small set of predictors on which to base prediction or interpretation, and dealing usefully with large-p-small-n problems. When the response is multivariate, reduction of the response vector and the predictor vector may be considered separately or simultaneously.
Most of classical twentieth-century Fisherian statistics focused on problems where the number of unknowns "p" was small and, in particular, much smaller than the number of observations or experimental units. However, with advances in computing and the emergence of applications with relatively large p, the practical environment has changed dramatically over the past 20 years. The statistical community has not yet decided how to deal effectively with related issues. This course is intended in part to be a contribution to the discussion.
Assignment 1
Reading: C.J.C. Burges, "Dimension Reduction: A Guided Tour", Foundations and Trends in Machine Learning, Vol. 2, No. 4, 275-365, 2010.
Writing: As discussed in class and in Problem 1.1 of the distributed notes. To avoid overlap, your choice must be approved by the instructor prior to writing (an email will be sufficient).
Due: Friday, Sept. 23.
DISABILITY ACCESS STATEMENT: This publication/material is available in alternative formats upon request. Please contact Dana, School of Statistics, 313 Ford Hall, 625-8046.