Chapter 3 Working with Distributions

We will often need to work with distributions. The two most common tasks are computing a p-value, which is the probability that some random variable is greater or less than a given value, and computing a quantile (percent point) of a distribution for use in a confidence interval. Less commonly, we might want to generate a “random” sample from a distribution or evaluate the density at a particular value. R has functions to do all of these for most standard distributions.

In general in R,

  • pFOO(x,params) gives you the cumulative probability up to (and including) x for distribution FOO with the given parameters,
  • qFOO(p,params) gives you the quantile that has cumulative (to the left) probability p for distribution FOO with the given parameters,
  • rFOO(n,params) gives you a random sample of size n from distribution FOO with the given parameters, and
  • dFOO(x,params) gives you the density at x of distribution FOO with the given parameters (for a discrete distribution, this is the probability of exactly x).
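
As a minimal sketch of the four forms, here they are for the normal distribution (FOO is norm). The particular arguments and the seed are arbitrary choices for illustration.

```r
# Standard normal used as the example distribution (FOO = norm).
p <- pnorm(1.96, mean = 0, sd = 1)   # cumulative probability P(X <= 1.96)
q <- qnorm(0.975, mean = 0, sd = 1)  # quantile with probability 0.975 to its left
d <- dnorm(0, mean = 0, sd = 1)      # density at 0, which is 1/sqrt(2*pi)

set.seed(1)                          # arbitrary seed so the sample is reproducible
r <- rnorm(5, mean = 0, sd = 1)      # random sample of size 5

# For a continuous distribution the p and q forms undo one another.
round_trip <- pnorm(qnorm(0.975))    # recovers 0.975
```

Note that p and q are inverses here: qnorm takes a probability and returns a value, and pnorm takes a value and returns a probability.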

Here is a table with some of the available distributions and their associated parameters. Values after an “=” indicate default values that will be used if you do not specify something different. In the table df refers to degrees of freedom, and ncp refers to the non-centrality parameter.

Distribution   FOO     params
Normal         norm    mean=0,sd=1
Student’s t    t       df,ncp=0
F              f       df1,df2,ncp=0
Chi-Squared    chisq   df,ncp=0
Binomial       binom   size,prob
Poisson        pois    lambda
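
The following sketch shows how the parameters and defaults in the table are used in practice. The numeric arguments are arbitrary values chosen for illustration.

```r
# Defaults: omitting mean and sd gives the standard normal.
same <- all.equal(pnorm(1.5), pnorm(1.5, mean = 0, sd = 1))

# Binomial: P(X <= 3) for 10 trials with success probability 0.5.
pb <- pbinom(3, size = 10, prob = 0.5)

# For a discrete distribution the d form returns a probability, here P(X = 2).
dp <- dpois(2, lambda = 3)

# Chi-squared has no default for df, so it must always be supplied.
qc <- qchisq(0.95, df = 4)
```

Parameters may be given by position, as in qchisq(0.95, 4), or by name, as above. Naming them is usually easier to read.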

Thus, we could use rnorm(10,mean=3,sd=.1) to get 10 random normals with mean 3 and standard deviation 0.1, or qt(.975,20) to get the 97.5th percentile of a t-distribution with 20 degrees of freedom, or pf(4.2,3,10) to get the probability that an F with 3 and 10 degrees of freedom would be less than or equal to 4.2.
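
Here are those three calls as they might be typed, followed by a rough sketch of how the results feed into a confidence interval and a p-value. The seed, the estimate xbar, and the standard error se are made-up values for illustration only.

```r
set.seed(42)                          # arbitrary seed, only for reproducibility
x <- rnorm(10, mean = 3, sd = 0.1)    # 10 random normals with mean 3, sd 0.1

tcrit <- qt(0.975, 20)                # 97.5th percentile of t with 20 df
pf_le <- pf(4.2, 3, 10)               # P(F <= 4.2) for F with 3 and 10 df

# Hypothetical estimate and standard error (made up for illustration).
xbar <- 5.0
se   <- 0.4
ci <- xbar + c(-1, 1) * tcrit * se    # 95% interval based on 20 df

# Upper-tail p-value for an observed F of 4.2.
p_value <- 1 - pf_le
```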

The p and q forms also take two additional arguments, lower.tail=TRUE and log.p=FALSE. By default, these functions use the lower tail, or area to the left. If you set lower.tail=FALSE, then you will get the upper tail, or area to the right. Also by default, the probabilities, whether supplied as an argument in qFOO(p) or returned as a result from pFOO(x), are ordinary probabilities between 0 and 1. If you set log.p=TRUE, then the function uses or returns the natural logarithm of the probability (always a non-positive number).
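
A short sketch of both arguments follows. The values 4.2, 20, and -40 are arbitrary; the last is chosen far enough into the tail that the ordinary probability is too small to represent.

```r
# lower.tail = FALSE gives the area to the right instead of the left.
lower <- pf(4.2, 3, 10)                        # P(F <= 4.2)
upper <- pf(4.2, 3, 10, lower.tail = FALSE)    # P(F > 4.2)

# For q forms, p is then read as an upper-tail probability.
q_upper <- qt(0.025, 20, lower.tail = FALSE)   # matches qt(0.975, 20)

# log.p = TRUE works on the natural log scale, which helps far in the tails.
plain <- pnorm(-40)                            # underflows to exactly 0
log_p <- pnorm(-40, log.p = TRUE)              # finite, large negative number

# q forms accept a log probability when log.p = TRUE.
q_log <- qnorm(log(0.975), log.p = TRUE)       # matches qnorm(0.975)
```

Asking for the upper tail directly is generally preferable to computing 1 - pFOO(x), since the subtraction loses accuracy when the tail probability is very small.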