split(x,A [,compnames:CharVec, silent:T]), x REAL, A a factor or vector of integers or LOGICAL vector, CharVec a CHARACTER scalar or vector
split(x,bycols:T or byrows:T [,compnames:CharVec, silent:T]), x a REAL matrix

split(Data,Factor) creates a structure by splitting Data along its first subscript into separate components according to the values of Factor, all of whose elements must be positive integers. If N = max(Factor), the result has N components, some of which may be empty. Thus, when the values in Factor are group or treatment numbers, each component of the result consists of the data corresponding to a particular group or treatment. It is an error if N > 32767.

It is also acceptable for Factor to be a LOGICAL vector, in which case False and True correspond to factor levels 1 and 2, respectively. For example, split(y, x <= 0) would create a structure with two components.

Data must be REAL or LOGICAL and the components of the result are the same type. Each component of the result will be a vector, matrix or array, depending on whether Data is a vector, matrix, or array. A warning is printed if any component of the result contains no elements.

If Factor[i] is MISSING, all the corresponding data are omitted. It is an error for all the elements of Factor to be MISSING.

split(Data,Factor,silent:T) does the same, except that warning messages about missing values in Factor or empty components in the result are suppressed.

If Factor is a variable rather than an expression, say groups or @groups, the components will be named 'groups1', 'groups2', etc. Similarly, if Factor is specified in a keyword phrase such as dose:rep(run(4),5), components will be named 'dose1', 'dose2', etc.

split(Data,bycols:T) or simply split(Data) splits the data along the last subscript, creating a structure with one component corresponding to each value of the last subscript. The most important case is when Data is an m by n matrix, in which case each of the n components of the result is a vector containing the data from a column of Data. Components will be named 'col1', 'col2', ... . If Data is a vector, the result is a structure with a single component named 'col1'.
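
The factor and LOGICAL usages described above can be sketched as follows. This is only an illustration: the variable names y, groups and s and the data values are made up, and output is not shown. It assumes the standard functions vector() and ncomps() behave as documented in their own help topics.

  Cmd> y <- vector(5.1,4.9,6.2,5.8,7.0,6.6) # hypothetical data
  Cmd> groups <- vector(1,1,2,2,3,3) # hypothetical group numbers
  Cmd> s <- split(y,groups) # components groups1, groups2, groups3
  Cmd> ncomps(s) # should be max(groups), here 3
  Cmd> s$groups2 # the data for which groups is 2
  Cmd> split(y,y > 6,compnames:vector("low","high"))

In the last command, False is level 1 and True is level 2, so component low should hold the values of y that are not greater than 6 and component high the values that are.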

split(Data,byrows:T) splits the data along the first subscript, creating a structure with one component corresponding to each value of the first subscript. For example, when Data is an m by n matrix, the result is a structure with m components, each a row vector of size n (1 by n matrix). Components will be named 'row1', 'row2', ... .

For all these usages, an additional argument of the form compnames:CharVec is recognized, where CharVec is a CHARACTER vector. The elements of CharVec are used as names for the components of the result, truncated to 12 characters if necessary, overriding the naming conventions just described. If length(CharVec) = 1, say "group", it is used as a "root" for forming component names of the form "group1", "group2", ... . Otherwise length(CharVec) must match the number of components of the result. It is an error if any element of CharVec contains '$'.

An important use of split is in boxplot(split(y,groups)), where y is a REAL vector and groups is a factor. This produces parallel boxplots of the data in y corresponding to each level of groups. Similarly, when y is a REAL matrix, boxplot(split(y,bycols:T)), or simply boxplot(split(y)), produces parallel boxplots of the data in each column of y.

Examples:
  Cmd> split(run(4),variety:vector(1,2,1,2))
  component: variety1
  (1)           1           3
  component: variety2
  (1)           2           4

An equivalent command would be
  Cmd> split(run(4),vector(1,2,1,2),compnames:"variety")

  Cmd> boxplot(split(y),ylab:"Column Number")
where y is a matrix, produces parallel boxplots of the columns of y.

See also topics boxplot(), 'structures'.
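
As a further sketch, the byrows:T and compnames usages might be tried on a small matrix. The names m, byrow, bycol and named are made up for illustration, output is not shown, and matrix(run(6),2) is assumed to build a 2 by 3 matrix as described in the help for matrix().

  Cmd> m <- matrix(run(6),2) # hypothetical 2 by 3 matrix
  Cmd> byrow <- split(m,byrows:T) # 2 components, row1 and row2, each 1 by 3
  Cmd> bycol <- split(m,compnames:"trt") # 3 components, trt1, trt2, trt3
  Cmd> named <- split(m,compnames:vector("lo","mid","hi")) # one name per column

In the last command, length(CharVec) is 3, matching the number of columns of m, as the rule for compnames requires.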

Gary Oehlert 2003-01-15