kmeans(y [,means or classes] [,kmax:k1, kmin:k2, start:method, standard:F, weights:wts, quiet:T]), y a REAL matrix, means a REAL matrix with ncols(y) columns, classes a REAL vector with nrows(y) rows, k1 and k2 positive integers, k1 >= k2, method one of "random", "optimal", "means", or "classes", wts a REAL vector with nrows(wts) = nrows(y)

kmeans(y, kmax:k1 [, kmin:k2]) performs k-means clusterings of the rows of REAL matrix y, starting with k1 clusters and successively merging clusters until there are k2 clusters. By default the data are standardized and the initial clusters are selected randomly. At each stage, cases are reallocated among clusters in an attempt to minimize the sum of the within-cluster sums of squares. If kmin:k2 is omitted, k2 is taken to be k1. It is an error when k2 > k1.

kmeans() returns a structure with components 'classes' and 'criterion'. Component classes is an nrows(y) by k1-k2+1 matrix (a vector if k2 = k1) containing the cluster membership at each stage. Component criterion is a REAL vector of length k1-k2+1 containing the minimized criterion at each stage. By default, a brief history of the merging process is printed, including the values of the criterion being minimized.

kmeans(y, kmax:k1 [, kmin:k2], start:"random") is identical to kmeans(y, kmax:k1 [, kmin:k2]).

kmeans(y, kmax:k1 [, kmin:k2], start:"optimal") attempts to select the initial clusters so as to minimize the within-cluster sums of squares for column 1 of y.

kmeans(y, Means [, kmin:k2], start:"means"), where Means is a k1 by ncols(y) matrix, selects as initial cluster j those rows of y that are closer (in Euclidean distance) to row j of Means than to any other row of Means. If kmax:k1 is an argument with k1 != nrows(Means), a warning message is given and nrows(Means) is used. If there are duplicates among the rows of Means, a warning message is printed.

kmeans(y, Classes [, kmin:k2], start:"classes"), where Classes is a vector of nrows(y) positive integers <= 255, uses Classes to specify the initial clusters. If kmax:k1 is an argument with k1 != max(Classes), a warning message is given and max(Classes) is used. If there are empty classes (not all integers between 1 and max(Classes) are present), the empty classes are "squeezed out" and max(Classes) is reduced accordingly.
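The start:"means" assignment, the "squeezing out" of empty classes, and the criterion described above can be illustrated with a small sketch. This is plain Python, not MacAnova code, and it is not how kmeans() is implemented. The function names nearest_mean_classes, squeeze_classes and criterion are hypothetical. The sketch assumes no standardization, unit weights, and that ties go to the lowest-numbered mean (the help text does not say how ties are broken).

```python
# Illustrative sketch only; all names here are hypothetical, not MacAnova functions.

def nearest_mean_classes(y, means):
    """Assign each row of y the 1-based index of the closest row of means."""
    classes = []
    for row in y:
        # Squared Euclidean distance from this row to every candidate mean.
        d2 = [sum((a - b) ** 2 for a, b in zip(row, m)) for m in means]
        # Assumption: ties are broken in favor of the lowest-numbered mean.
        classes.append(d2.index(min(d2)) + 1)
    return classes

def squeeze_classes(classes):
    """Renumber labels as 1..k, dropping labels that never occur."""
    relabel = {old: new for new, old in enumerate(sorted(set(classes)), start=1)}
    return [relabel[c] for c in classes]

def criterion(y, classes):
    """Sum over clusters of the within-cluster sums of squares."""
    total = 0.0
    for k in set(classes):
        members = [row for row, c in zip(y, classes) if c == k]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((a - b) ** 2
                     for row in members
                     for a, b in zip(row, centroid))
    return total

y = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
# The second mean is far from every row, so its cluster starts out empty.
means = [[0.0, 1.0], [5.0, 50.0], [10.0, 1.0]]

classes = nearest_mean_classes(y, means)   # [1, 1, 3, 3]
squeezed = squeeze_classes(classes)        # [1, 1, 2, 2]
print(classes, squeezed, criterion(y, squeezed))
```

In this toy data set no row is closest to the second mean, so label 2 is empty and the remaining labels are renumbered 1 and 2, leaving two clusters with a criterion of 4.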
Additional keywords
  standard:F    Do not standardize before clustering.
  weights:wts   Use weighted means and sums of squares, with wts a REAL vector of length nrows(y) with wts[i] > 0.
  quiet:T       Suppress printing of clustering history.

See also cluster().
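As a rough illustration of what the standard:F and weights:wts keywords control, the following Python sketch (not MacAnova code) standardizes columns and computes a weighted within-cluster sum of squares. The names standardize and weighted_within_ss are hypothetical. The exact standardization used by kmeans() is not stated in this help text; centering each column and dividing by the sample standard deviation (divisor n-1) is an assumption, as is the form of the weighted criterion.

```python
import math

# Illustrative sketch only; names and formulas are assumptions, not MacAnova's.

def standardize(y):
    """Center each column and scale it to unit sample standard deviation.

    Assumption: divisor n-1; the help text does not specify the scaling.
    """
    n = len(y)
    cols = list(zip(*y))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in y]

def weighted_within_ss(y, classes, wts):
    """Sum over clusters of sum_i wts[i] * |y[i] - weighted cluster mean|^2."""
    ncols = len(y[0])
    total = 0.0
    for k in set(classes):
        idx = [i for i, c in enumerate(classes) if c == k]
        wsum = sum(wts[i] for i in idx)
        # Weighted mean of the rows belonging to cluster k.
        centroid = [sum(wts[i] * y[i][j] for i in idx) / wsum
                    for j in range(ncols)]
        total += sum(wts[i] * (y[i][j] - centroid[j]) ** 2
                     for i in idx for j in range(ncols))
    return total

z = standardize([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

y = [[0.0], [4.0], [10.0], [12.0]]
classes = [1, 1, 2, 2]
unweighted = weighted_within_ss(y, classes, [1.0, 1.0, 1.0, 1.0])
weighted = weighted_within_ss(y, classes, [1.0, 3.0, 1.0, 1.0])
print(z, unweighted, weighted)
```

With unit weights the function reduces to the ordinary within-cluster sum of squares. Giving the second case weight 3 pulls the first cluster's mean from 2 to 3 and changes the criterion from 10 to 14.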

Gary Oehlert 2003-01-15