Undergraduate Research Opportunities Program (UROP): Aster Models


Faculty Advisors

Two University of Minnesota (Twin Cities) faculty members are looking for undergraduates interested in research on aster models. They are Ruth Shaw in the Department of Ecology, Evolution, and Behavior and Charles Geyer in the School of Statistics.

The aster project has its own home page, which links to the published papers.

Project: Aster Analysis of Life History Data

One project, suitable for biology students with an interest in evolutionary genetics or ecology, is aster analysis of life history data. The analyses would resemble those in the two published papers linked on the home page, though some details would differ.

Several data sets are available and awaiting analysis.

The very short introduction to aster is that it is a generalization of linear regression, binomial and Poisson regression, and survival analysis. It allows life history data to be used for estimating population growth, for estimating fitness landscapes, and for many other purposes.

For a somewhat longer introduction, read the introduction and discussion sections of the new paper, which is to appear in The American Naturalist.

Some knowledge of statistics is necessary.

Project: Coding the R Package Aster

Another project, suitable for computer science students or other skilled programmers, is work on the R package aster.

R (www.r-project.org) is to statistics as C is to general programming. R is an implementation of the S language, which, like C, was invented at Bell Labs. Both have the unix nature. R is the language of choice for research statistics.

Unlike C, the R language is interpreted, garbage collected, and dynamically typed (like Perl, Python, or Scheme). It is a Turing-complete programming language with some functional and object-oriented features.

R is free software distributed from the CRAN web site (similar to CTAN for TeX and CPAN for Perl). It supports extensions: hundreds of contributed packages that deal with specific statistical problems are available, and they are trivial for users to load into R and use.

The contributed package aster has existed for about three years and has sufficed to do the analyses in the two existing papers.

However, the first paper described a more general class of models than the package actually implements, and it is unclear how to refactor the existing code base to implement that more general class, so some serious programming is needed.

R is implemented in C. Extension packages often use C called from R. The current version of the aster package has about 1300 lines of R and about 2700 lines of C. Because C is much harder to learn than R, some prior knowledge of C is necessary; R is simple enough to learn during the project.