New! Homework 5 solutions posted. User name and password given out in class Mon Sep 28. Subject to revision during grading.
New! Class recordings up to date through Wednesday, December 16.
New lecture note on maximum likelihood
Changed one word in problem 5-3: "Calculate their mean values" becomes "Calculate their expected values."
New! New lecture note on how to draw a graph in R.
New! Homework assignment 5 posted. Assignment upload link and discussion group have been made in Canvas.
Version 0.4 of R package
glmbb has a bug that causes it
to report models twice (so they get only half the weight they should have)
depending on the order of terms in the formulas you provide to R function
glmbb. The bug has been fixed in version 0.5-1, which is
now on the main CRAN site and soon will be on all mirrors. So it is
advisable to reinstall this package (or do
before finishing homework 4.
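A minimal sketch of how one might check whether a given glmbb version predates the fix. The helper name needs_update is hypothetical (it is not part of glmbb); only the version numbers 0.4 and 0.5-1 come from the announcement above.

```r
# Hypothetical helper (not part of glmbb): TRUE when the given version
# string is older than the fixed release named in the announcement.
needs_update <- function(installed, fixed = "0.5-1") {
    package_version(installed) < package_version(fixed)
}

needs_update("0.4")     # the version with the bug
needs_update("0.5-1")   # the fixed version

# To check the version actually installed on your machine and reinstall
# from CRAN if it is too old (assumes glmbb is already installed):
# if (needs_update(as.character(packageVersion("glmbb"))))
#     install.packages("glmbb")
```

The last lines are left commented out because they need a network connection and an existing glmbb installation.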
New! Homework 4 due date changed to Wednesday, November 25. The Canvas submission date and the end of the Canvas discussion group for this assignment have changed accordingly.
New! New erratum posted. Simplified notes on Chapter 8.
New! New erratum posted. An error noticed in the notes on model selection and model averaging has been fixed.
New! New reading assignment in Chapters 8 and 9 in Agresti.
Homework assignment 4 posted. Assignment upload link and discussion group have been made in Canvas.
Some formulas from office hours Monday and Wednesday.
The R expression gout$x extracts the model matrix from the result returned by R function glm (provided the optional argument x = TRUE was supplied) and the R expression gout$y extracts the response vector. They are used in R function loglin defined in the notes. The line eta <- drop(modmat %*% beta) in that function defines η = M β. The rest of the function in the notes is for the binomial rather than the Poisson distribution, so you have to figure out how to implement the formula above that calculates the log unnormalized density of the posterior distribution (what you want R function
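A minimal sketch of the extraction step described above, using made-up toy data (the vectors x and y and the fitted model are assumptions for illustration, not course data). It shows only how gout$x, gout$y, and the linear predictor fit together. It does not implement the log unnormalized posterior density.

```r
# Toy data, hypothetical and for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 3, 6, 7, 8, 9)

# x = TRUE asks glm to keep the model matrix in the result
gout <- glm(y ~ x, family = poisson, x = TRUE)

modmat <- gout$x   # model matrix M (one row per observation)
resp <- gout$y     # response vector

# Any coefficient vector of the right length works here; the fitted
# coefficients are used just to have something concrete.
beta <- coef(gout)

# Linear predictor eta = M beta, with drop() turning the one-column
# matrix into a plain vector
eta <- drop(modmat %*% beta)
```

With beta set to the fitted coefficients, eta should agree with gout$linear.predictors, which gives a quick sanity check on the matrix multiplication.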
As mentioned in class, for more on Markov chain Monte Carlo (MCMC), including an explanation of why the Metropolis algorithm works, one can turn to the introductory chapter of the Handbook of Markov Chain Monte Carlo, written by your humble instructor. But everything necessary to do the homework is in the lecture notes for this course. This is only for those who want more info on MCMC.
Reading assignment for Friday, Oct 16 cut. Just read Sections 4.1, 4.2, and 4.3 in Agresti.
Link for submitting homework assignment 1 late added to Canvas.
Broken link in hw 2 assignment now fixed. The assignment was wrong; the file is at
that is, hw2-4.txt rather than h2-4.txt.
New course notes on the Poisson distribution.
New erratum about R function logl in the notes on the binomial distribution.
On Friday, Sep 25, 2020 we depart from categorical data analysis to discuss the
reproducibility crisis in science and what statistics done badly
has to do with it. Our texts are
- a talk I gave to the Minnesota Center for Philosophy of Science and the University of Minnesota Program in History of Science, Technology and Medicine in January 2012,
- a paper published in the journal Science titled Estimating the Reproducibility of Psychological Science by the Open Science Collaboration (270 authors) in August 2015, and
- the Many Faces of Reproducibility Interdisciplinary Collaborative Workshop that has been running here at the U of M for the past 2 years and is continuing.
This course will use plain R rather than RStudio.
You can use RStudio if you want, but I don't need anything it does.
There are two R packages designed to be used in this course.
- R package CatDataAnalysis is found at Github.
- R package glmbb will not be used until the middle of the course. It is found on CRAN.
Install the package in R by executing the command
install.packages("glmbb") at the R command line or, if you prefer, by mousing around in the menus of the R app or RStudio.
New! This web site has no index, so in order to find stuff one needs to use a search engine. Here is how to do that. For example, if you want to find information on the beta distribution, then the search
"beta distribution" site:www.stat.umn.edu/geyer/5101
does that. This works either with Google or with DuckDuckGo. The quotation marks mean find the exact phrase. If they are left off, then the search engine will return results that have the word beta and the word distribution, not necessarily in the same page much less in the same sentence. The magic is the site: part, which tells the search engine only to look in that site. The site can be made more restrictive, for example,
"beta distribution" site:www.stat.umn.edu/geyer/5101/slides
says to look only in the slides (this seems to work only in Google, but not in DuckDuckGo even though it is supposed to work in DuckDuckGo).
New! This course is entirely on-line, regardless of what the U does with other courses. It is synchronous in the sense that, if you want to ask questions during lecture, then you must be in the Zoom session at the scheduled time. It is asynchronous in the sense that all Zoom sessions will be recorded and linked on the Canvas site for the course.