factor(n1 [, n2, ...]) where n1, n2, ... are REAL scalars or vectors,
  all of whose elements are positive integers.

Keywords: glm, anova

factor(A), where A is a vector of positive integers or MISSING, creates
a vector with contents identical to A except that the new vector is
marked as a "factor" with number of factor levels = max(A).

The non-MISSING elements of A must be positive integers <= 32767.

Since the number of factor levels is the largest integer in A, both
factor(vector(1,2,4)) and factor(vector(1,2,3,4)) produce factors marked
as having four levels, although only three of the levels are present in
the first.

Argument A can also be a matrix or array when isvector(A) is True, that
is, when all dimensions beyond the first are 1.  In that case the
result has the same dimensions as A.

factor(a1, a2, ... ak) is equivalent to factor(vector(a1, a2, ... ak))
where a1, ..., ak are all scalars or vectors.

When A is a LOGICAL vector, factor(A) is equivalent to factor(A+1), that
is, False and True are translated to levels 1 and 2, respectively.  The
result is always marked as having 2 factor levels, even if every element
of A is False.
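
The rules above for counting levels can be modelled in a few lines.  The
following is a minimal Python sketch of those rules, not MacAnova code;
the names make_factor and MISSING are hypothetical, and raising an error
when every element is MISSING is an assumption of the sketch rather than
documented behavior.

```python
MISSING = None      # stand-in for MacAnova's MISSING value (assumption)
MAX_LEVEL = 32767   # upper bound on factor levels stated above


def make_factor(values):
    """Return (values, number_of_levels) following the rules described above."""
    # LOGICAL input: False -> level 1, True -> level 2, always two levels.
    if values and all(isinstance(v, bool) for v in values):
        return [int(v) + 1 for v in values], 2

    present = [v for v in values if v is not MISSING]
    if not present:
        # Not covered by the help text; the sketch simply refuses.
        raise ValueError("no non-MISSING values")
    for v in present:
        if v != int(v) or not 1 <= v <= MAX_LEVEL:
            raise ValueError("levels must be positive integers <= 32767")

    # The number of levels is the largest value, not the count of distinct values.
    return list(values), max(present)


print(make_factor([1, 2, 4]))       # ([1, 2, 4], 4)
print(make_factor([1, 2, 3, 4]))    # ([1, 2, 3, 4], 4)
print(make_factor([False, False]))  # ([1, 1], 2)
```

Both integer inputs report four levels, and the all-False LOGICAL input
still reports two.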

The purpose of marking a variable as a factor is to ensure that, when it
is a variable in a model for a non-regression GLM (generalized linear or
linear model) command such as anova() or poisson(), its values are
interpreted as specifying levels of a categorical (non-quantitative)
variable, that is, classes or categories.

A vector in a model that has not been marked as a factor using factor()
is called a "variate" and its values are taken to specify quantities,
even if they are all positive integers.  In regress() and screen(),
factors are treated the same as variates; that is, the levels are
viewed as quantitative.

In a model which includes both factors and variates, the variates are
often referred to as "covariates".

A common mistake in using GLM commands is to forget to use factor() to
turn vectors of factor levels into factors.  As a result, each such
vector is treated as a variate with a single degree of freedom.

When A is a factor with k levels and J is an appropriate subscript for A
(for example, J might be A != 3, vector(1,run(3,length(A))) or -2), A[J]
is also marked as a factor with k levels, even if max(A[J]) < k.

When A is a factor with k levels, A[j] <- newvalue is legal only if
newvalue is an integer between 1 and k.  The number of levels associated
with A will not change even if max(A) < k after the replacement.

In both of these situations, subscripting a factor or assigning to a
subscripted factor, it is possible to create a factor whose actual
maximum level is less than k.  However, the actual maximum factor level
will be used in any analysis.
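
The bookkeeping described above can be pictured with a small model.  The
Python class below is a sketch only, not MacAnova code: the Factor class
and its method names are hypothetical, subscripts are limited to lists
of booleans, and indices are 0-based as in Python rather than 1-based as
in MacAnova.

```python
class Factor:
    """Toy model of a factor: a list of levels plus a recorded level count."""

    def __init__(self, values, levels=None):
        self.values = list(values)
        self.levels = max(self.values) if levels is None else levels

    def subset(self, keep):
        """Select elements with a list of booleans; the level count carries over."""
        kept = [v for v, flag in zip(self.values, keep) if flag]
        return Factor(kept, self.levels)

    def assign(self, index, new_value):
        """Replace one element; only integers from 1 to the level count are legal."""
        if new_value != int(new_value) or not 1 <= new_value <= self.levels:
            raise ValueError("new value must be an integer between 1 and k")
        self.values[index] = new_value

    def levels_in_use(self):
        """Largest level actually present, the count an analysis would use."""
        return max(self.values)


a = Factor([1, 2, 3, 3])
b = a.subset([v != 3 for v in a.values])      # analogous to A[A != 3]
print(b.values, b.levels, b.levels_in_use())  # [1, 2] 3 2
```

Here b is still recorded as having three levels although only levels 1
and 2 remain, which is the situation the paragraph above describes.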

See also topics makefactor(), 'models'.

Gary Oehlert 2003-01-15