Chapter 3 Working with Distributions
We will often need to work with distributions. The two most common
tasks are computing a p-value (the probability that some random variable
is greater or less than a given value) and computing a quantile
(percent point) of a distribution for use in confidence intervals.
Less commonly, we might want to generate a “random” sample from
a distribution or evaluate the density at a particular value.
R has functions to do all of these for most standard distributions.
In general in R, pFOO(x,params) gives you the cumulative probability
up to (and including) x for distribution FOO with the given parameters;
qFOO(p,params) gives you the quantile that has cumulative (to the left)
probability p for distribution FOO with the given parameters;
rFOO(n,params) gives you a random sample of size n from distribution FOO
with the given parameters; and dFOO(x,params) gives you the density at x
of distribution FOO with the given parameters.
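As a minimal sketch of the four prefixes, here they are applied to the standard normal (FOO = norm). The variable names p, x, d, and s are illustrative choices, and the seed is arbitrary.

```r
# p: cumulative probability P(X <= 1.96) for a standard normal
p <- pnorm(1.96, mean = 0, sd = 1)

# q: the quantile with cumulative probability p, so it undoes pnorm
x <- qnorm(p, mean = 0, sd = 1)

# d: the density evaluated at a single point
d <- dnorm(0, mean = 0, sd = 1)

# r: a random sample of size 5; set.seed() only makes the draw repeatable
set.seed(1)
s <- rnorm(5, mean = 0, sd = 1)

print(c(p = p, x = x, d = d))
print(s)
```

Because qnorm inverts pnorm, x comes back as 1.96 up to rounding error.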
df refers to degrees of freedom, and ncp refers to the non-centrality parameter.
Distribution | FOO | params |
---|---|---|
Normal | norm | mean=0,sd=1 |
Student’s t | t | df,ncp=0 |
F | f | df1,df2,ncp=0 |
Chi-Squared | chisq | df,ncp=0 |
Binomial | binom | size,prob |
Poisson | pois | lambda |
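The discrete rows of the table work the same way, with one difference worth seeing: for a discrete distribution, dFOO returns a probability mass rather than a density. The sketch below uses a hypothetical Binomial(10, 0.5) and Poisson(4) to show that "up to (and including) x" means the mass at x itself is counted.

```r
# Hypothetical setting: X ~ Binomial(size = 10, prob = 0.5)
size <- 10
prob <- 0.5

# Cumulative probability P(X <= 3), including the mass at 3
cum <- pbinom(3, size = size, prob = prob)

# The same value, built by summing the probability mass at 0, 1, 2, 3
summed <- sum(dbinom(0:3, size = size, prob = prob))

# Poisson with lambda = 4: smallest count whose cumulative
# probability reaches at least 0.95
q_pois <- qpois(0.95, lambda = 4)

print(c(cum = cum, summed = summed, q_pois = q_pois))
```

Here cum and summed agree (both are 176/1024), and for a discrete distribution the quantile function returns the smallest value whose cumulative probability is at least p.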
Thus, we could use rnorm(10,mean=3,sd=.1) to get 10 random normals with
mean 3 and standard deviation 0.1, or qt(.975,20) to get the 97.5th
percentile of a t-distribution with 20 degrees of freedom,
or pf(4.2,3,10) to get the probability that an F with 3 and 10
degrees of freedom would be less than or equal to 4.2.
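The three calls can be run directly. The sketch below stores their results and then, as one assumed use of a quantile, builds a 95% t-interval for the mean of the simulated sample. The names y, t_crit, f_prob, half_width, and ci are illustrative, and the seed is arbitrary.

```r
set.seed(42)                          # arbitrary seed, for repeatability only
y <- rnorm(10, mean = 3, sd = .1)     # 10 random normals

t_crit <- qt(.975, 20)                # about 2.086
f_prob <- pf(4.2, 3, 10)              # P(F <= 4.2) for an F with 3 and 10 df

# Assumed example of using a quantile: 95% t-interval for the mean of y
n <- length(y)
half_width <- qt(.975, df = n - 1) * sd(y) / sqrt(n)
ci <- c(lower = mean(y) - half_width, upper = mean(y) + half_width)

print(t_crit)
print(f_prob)
print(ci)
```

Note that the interval uses n - 1 = 9 degrees of freedom, not the 20 from the standalone qt call.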
The p and q forms also have two additional parameters:
lower.tail=TRUE,log.p=FALSE. By default, these functions
use the lower tail, or area to the left. If you set lower.tail=FALSE,
then you will get the upper tail or area to the right. Also, by default
the probabilities, either as an argument in qFOO(p) or a result
in pFOO(x), are ordinary probabilities between 0 and 1. If you
set log.p=TRUE, then the function uses or returns the logarithm of the
probability (always a non-positive number).
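A short sketch of both parameters follows. The variable names are illustrative, and the value -40 is just an assumed example of a tail extreme enough for the ordinary probability to underflow to zero.

```r
# Upper tail directly: P(F > 4.2) for an F with 3 and 10 df
upper <- pf(4.2, 3, 10, lower.tail = FALSE)
same_upper <- 1 - pf(4.2, 3, 10)      # equivalent for a moderate tail

# The value with upper-tail area 0.025 is the same as
# the value with lower-tail area 0.975
q_upper <- qt(.025, 20, lower.tail = FALSE)

# Log scale: the ordinary probability underflows to 0 this far out,
# so taking its log gives -Inf, while log.p = TRUE stays finite
naive <- log(pnorm(-40))
log_tail <- pnorm(-40, log.p = TRUE)

# qFOO accepts a log probability when log.p = TRUE
q_from_log <- qnorm(log(0.975), log.p = TRUE)

print(c(upper = upper, same_upper = same_upper, q_upper = q_upper))
print(c(naive = naive, log_tail = log_tail, q_from_log = q_from_log))
```

Asking for the upper tail directly, rather than subtracting from 1, avoids the rounding loss that occurs when the lower-tail probability is very close to 1.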