Chapter 20 Some Helpful R Discussion

20.1 Finding help

R has built-in help, but it is most useful when you already understand R in general and need the details of a specific function. If you’re not that far into R yet, you probably want to start with An Introduction to R, found on the R documentation page.

If you already know the name of the function you want help for, say linear.contrast(), then you can use the help command. Note that it won’t look everywhere, just in things you have loaded in via library().

> help(linear.contrast)   # not found: cfcdae is not loaded yet
> library(cfcdae)
> help(linear.contrast)   # found now that cfcdae is loaded

If you don’t know the name but only know a word that is related (contrast, for example), then you can use the ??xxx form, which will look through all the help you have for help topics that include xxx. Be aware, you can get lots of results this way.

> ??contrast

Try these yourselves. I have searched but I cannot figure out how to make the output of these examples show up in my document. The help() and ??xxx output is going directly to RStudio’s help window and not into my document. Grrr.
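Here is a small sketch of the same lookups using only base R, so it runs without cfcdae installed. The topic "contrasts" and the package "stats" are just stand-ins I picked for illustration. Results are stored in variables so you can inspect them without relying on the help window.

```r
# Look up a topic by exact name in one installed package,
# whether or not that package has been attached with library().
h <- help("contrasts", package = "stats")
length(h) > 0            # TRUE when a help page was found; print(h) shows it

# ??contrast is shorthand for help.search("contrast").
hits <- help.search("contrast")
nrow(hits$matches)       # number of matching help topics

# apropos() matches object names on the search path, not help text.
apropos("contrast")
```

The package argument to help() is handy when you know where a function lives but haven't loaded that package yet.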

20.2 You didn’t load cfcdae first.

If you use lm() before you do library(cfcdae), then your model will be fit using R’s default parameterization of \(\alpha_1 = 0\). Some functions in cfcdae will object to that, and the cure is to just refit after having loaded cfcdae.

How can you figure this out? If you do a summary() for your fit and there is no estimated effect for the first level of any factor, then you are using the R default. If you have estimated effects for all but the last level of each factor, then you are probably using the sum-to-zero parameterization. (There is also a parameterization where \(\alpha_g=0\), but you’ll need to do some R gymnastics to get that one.)
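To see the difference without needing cfcdae, here is a minimal sketch that requests each parameterization explicitly through the contrasts argument of lm(). The data frame d and its numbers are made up for illustration. I am assuming, from the description above, that loading cfcdae has the same effect as asking for contr.sum.

```r
# Toy balanced data: 3 groups with 4 observations each (made-up numbers).
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(10, 12, 11, 13,  20, 19, 21, 22,  30, 31, 29, 32)
)

# Which contrasts will lm() use by default right now?
getOption("contrasts")

# R default: the effect of the first level is set to zero.
fit_trt <- lm(y ~ g, data = d, contrasts = list(g = "contr.treatment"))

# Sum-to-zero: the effects are constrained to add to zero.
fit_sum <- lm(y ~ g, data = d, contrasts = list(g = "contr.sum"))

names(coef(fit_trt))   # "(Intercept)" "gb" "gc"  (no entry for level a)
names(coef(fit_sum))   # "(Intercept)" "g1" "g2"  (no entry for the last level)
```

The coefficient names are the quick giveaway: level names (gb, gc) mean the R default, while numbered effects (g1, g2) mean sum-to-zero.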

20.3 Some lm() tips

  1. The update() function lets you refit a slightly modified model to the same data. This is most useful when you have lots of terms in your model and want to delete one or two, or when you have a complicated model and want to add another term or two. The usefulness is that you don’t have to type in everything again to lm() (long model statement, data frame, subset, etc.).

    If you have model fit foo and want to add newterm, then you can say update(foo,~.+newterm). If you want to take out oldterm, then say update(foo,~.-oldterm).

  2. If you’re in the middle of typing in a complicated model and you realize that variable foo is not a factor and should be, you can just use as.factor(foo) in the model in place of foo.

    After you’ve fit your model, it might be best to either make foo into a factor or create a new variable that is a factor version.

  3. When you are building a functional model (we typically use a polynomial function), you can use sin(z) or log(z) or sqrt(z) as a predictor in the model, but you cannot use z^2. That is because R model formulas use ^k to mean “interactions up to order k”, so z^2 is read as z crossed with itself, which is just z, and no squared term is fit.

    You can accomplish the same thing by using I(z^2) as a predictor. The I(expression) form means “evaluate the expression and then use that result as a predictor”. Its label will be I(z^2) and the model parser never gets to see the caret.
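The three tips above can be tried out on the built-in mtcars data. This is only a sketch; the particular variables (mpg, wt, hp, qsec, cyl) are my choice for illustration and have nothing to do with the course data.

```r
# Starting model; mtcars ships with R, and cyl is stored as a number.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Tip 1: update() reuses the original call, so only the change is typed.
fit_add  <- update(fit, ~ . + qsec)       # add a term
fit_drop <- update(fit_add, ~ . - hp)     # remove a term
formula(fit_drop)                         # mpg ~ wt + qsec

# Tip 2: convert a numeric variable to a factor inside the formula.
fit_cyl <- update(fit, ~ . + as.factor(cyl))
names(coef(fit_cyl))                      # two extra coefficients for cyl

# Tip 3: a bare caret is formula syntax, not arithmetic.
fit_caret <- lm(mpg ~ wt + wt^2, data = mtcars)     # wt^2 collapses to wt
fit_quad  <- lm(mpg ~ wt + I(wt^2), data = mtcars)  # real quadratic term
names(coef(fit_caret))                    # "(Intercept)" "wt"
names(coef(fit_quad))                     # "(Intercept)" "wt" "I(wt^2)"
```

Note that the bare wt^2 version fits without any warning, which is why it is easy to miss.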

20.4 After lm()

We have a number of functions that we can use to examine model fits more closely. Their arguments are similar but not identical, and the near-match can lead to confusion.

  • cfcdae::linear.contrast(modelfit,factor,coefs) uses the model, then the factor. factor must be the plain old factor, not a quoted name of the factor (foo not "foo").

  • cfcdae::model.effects(modelfit,termname) requires that termname be a quoted name of a term, not the term itself ("foo" not foo).

  • effects::effect(termname,modelfit) uses the arguments in the reverse order, and termname must be a quoted name ("foo" not foo).

  • emmeans::emmeans(modelfit,termname) uses a quoted name for termname ("foo" not foo). But it can also be used as emmeans::emmeans(modelfit,~termname). In the second usage, termname is not quoted (foo not "foo").
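As a rough sketch of the argument-order differences, here are the emmeans and effects calls side by side on the built-in warpbreaks data (my choice of example, not course data). Each call is guarded so the code still runs if a package is not installed. I leave the two cfcdae functions out of the sketch so that it does not depend on having that package loaded first.

```r
# Additive two-factor model on a data set that ships with R.
fit <- lm(breaks ~ wool + tension, data = warpbreaks)

if (requireNamespace("emmeans", quietly = TRUE)) {
  # Model first, then the term: quoted name or one-sided formula.
  em_quoted  <- emmeans::emmeans(fit, "tension")
  em_formula <- emmeans::emmeans(fit, ~ tension)
  print(em_quoted)
}

if (requireNamespace("effects", quietly = TRUE)) {
  # Reverse order: quoted term name first, then the model.
  eff <- effects::effect("tension", fit)
  print(eff)
}
```

If you get an error about the wrong kind of argument from one of these functions, swapping the order or adding (or removing) the quotes is the first thing to try.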