split()

Usage:

split(x,A [,compnames:CharVec ,silent:T]), x REAL, A a factor or vector
  of integers or LOGICAL vector, CharVec a CHARACTER scalar or vector
split(x,bycols:T or byrows:T [,compnames:CharVec, silent:T]), x a REAL
  matrix

Keywords: combining variables, structures

split(Data,Factor) creates a structure by splitting Data along its first
subscript into separate components according to the values of Factor,
all of whose elements must be positive integers.

If N = max(Factor), the result has N components, some of which may be
empty.  Thus, when the values in Factor are group or treatment numbers,
each component of the result consists of the data corresponding to a
particular group or treatment.  It is an error if N > 32767.

It is also acceptable for Factor to be a LOGICAL vector, in which case
False and True correspond to factor levels 1 and 2, respectively.  For
example, split(y, x <= 0) would create a structure with two components.
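
As a small hypothetical illustration (the values are invented and the
result is described rather than copied from a session), suppose
  Cmd> y <- vector(-2,3,0,5) # made-up data
  Cmd> s <- split(y, y <= 0) # LOGICAL Factor: False is 1, True is 2
By the rule just stated, the first component of s should contain 3 and 5
(where y <= 0 is False) and the second should contain -2 and 0 (where it
is True).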

Data must be REAL or LOGICAL and the components of the result are the
same type.  Each component of the result will be a vector, matrix or
array, depending on whether Data is a vector, matrix, or array.  A
warning is printed if any component of the result contains no elements.
If Factor[i] is MISSING, all the corresponding data are omitted.  It is
an error for all the elements of Factor to be MISSING.

split(Data,Factor,silent:T) does the same, except warning messages about
missing values in Factor or empty components in the result are
suppressed.
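
For instance, in the following sketch (illustrative only; the Factor
values are made up), max(Factor) is 3 but no element of Factor equals 2:
  Cmd> s <- split(run(4), vector(1,3,1,3), silent:T) # made-up Factor
From the description above, s should have three components, the second
of them empty, and silent:T should suppress the warning that would
otherwise be printed about the empty component.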

If Factor is a variable rather than an expression, say groups or
@groups, the components will be named 'groups1', 'groups2', etc.
Similarly if Factor is specified in a keyword phrase such as
dose:rep(run(4),5), components will be named 'dose1', 'dose2', etc.

split(Data,bycols:T) or simply split(Data) splits the data along the
last subscript, creating a structure with one component corresponding to
each value of the last subscript.  The most important case is when Data
is an m by n matrix, in which case each of the n components of the
result is a vector containing the data from a column of Data.
Components will be named 'col1', 'col2', ... .  If Data is a vector, the
result is a structure with a single component named 'col1'.

split(Data,byrows:T) splits the data along the first subscript, creating
a structure with one component corresponding to each value of the first
subscript.  For example, when Data is an m by n matrix, the result is a
structure with m components, each a row vector of size n (1 by n
matrix).  Components will be named 'row1', 'row2', ... .
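
As a sketch of the difference between the two keywords (illustrative
only; it assumes matrix(run(6),3) creates a 3 by 2 matrix, which is not
stated in this topic):
  Cmd> y <- matrix(run(6),3) # assumed to be 3 by 2
  Cmd> bycol <- split(y) # same as split(y,bycols:T)
  Cmd> byrow <- split(y,byrows:T)
Under that assumption, bycol should have 2 components, 'col1' and
'col2', each a vector of length 3, and byrow should have 3 components,
'row1', 'row2' and 'row3', each a 1 by 2 matrix.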

For all these usages, an additional argument of the form
compnames:CharVec is recognized, where CharVec is a CHARACTER vector.
The elements of CharVec are used as names for the components of the
result, truncated to 12 characters if necessary, overriding the naming
conventions just described.  If length(CharVec) = 1, say "group", it is
used as a "root" for forming names for the components of the form
"group1", "group2", ... .  Otherwise length(CharVec) must match the
number of components of the result.  It is an error if any element of
CharVec contains '$'.
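
As an illustrative sketch (the names "control" and "treated" are made
up), explicit names could be supplied for a two-level split:
  Cmd> nms <- vector("control","treated") # made-up names
  Cmd> s <- split(run(4),vector(1,2,1,2),compnames:nms)
Here length(nms) matches the number of components, so the components of
s should be named 'control' (containing 1 and 3) and 'treated'
(containing 2 and 4).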

An important use of split is in boxplot(split(y,groups)), where y is a
REAL vector and groups is a factor.  This produces parallel boxplots of
the data in y corresponding to each level of groups.  Similarly, when y
is a REAL matrix, boxplot(split(y,bycols:T)) or simply
boxplot(split(y)) produces parallel boxplots of the data in each column
of y.

Examples:
  Cmd> split(run(4),variety:vector(1,2,1,2))
  component: variety1
  (1)            1            3
  component: variety2
  (1)            2            4

An equivalent command would be
  Cmd> split(run(4),vector(1,2,1,2),compnames:"variety")

  Cmd> boxplot(split(y),ylab:"Column Number")
where y is a matrix, produces parallel boxplots of the columns of y.

See also topics boxplot(), 'structures'.

Gary Oehlert 2003-01-15