Usage:
kmeans(y [,means or classes] [,kmax:k1,kmin:k2,start:method,standard:F,
weights:wts, quiet:T]), y a REAL matrix, means a REAL matrix with
ncols(y) columns, classes a REAL vector with nrows(y) rows, k1 and k2
positive integers, k1 >= k2, method one of "random", "optimal",
"means", or "classes", wts a REAL vector with nrows(wts) = nrows(y)
Keywords:
multivariate analysis
kmeans(y, kmax:k1 [, kmin:k2]) performs k-means clusterings of the rows
of REAL matrix y, starting with k1 clusters, and successively merging
clusters until there are k2 clusters. By default the data are
standardized and the initial clusters are selected randomly. At each
stage, cases are reallocated among clusters in an attempt to minimize
the sum of the within-cluster sums of squares. If kmin:k2 is omitted,
k2 is taken to be k1.
It is an error when k2 > k1.
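The criterion described above can be illustrated outside MacAnova. The following is a minimal Python sketch, not MacAnova code and not its implementation. The function names standardize and within_ss are hypothetical, and scaling each column to unit sample standard deviation is an assumption, since this page does not say which scaling is used.

```python
from statistics import mean, stdev

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation.

    Assumption: the exact scaling MacAnova applies is not stated here.
    Columns with zero spread would cause a division by zero.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def within_ss(rows, classes):
    """Sum over clusters of squared distances from each row to its cluster mean."""
    total = 0.0
    for k in set(classes):
        members = [r for r, c in zip(rows, classes) if c == k]
        centre = [mean(col) for col in zip(*members)]
        total += sum((x - m) ** 2 for r in members for x, m in zip(r, centre))
    return total

y = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5]]
good = [1, 1, 2, 2]   # nearby rows grouped together
bad = [1, 2, 1, 2]    # distant rows grouped together

print(within_ss(y, good))               # about 1.77
print(within_ss(y, bad))                # much larger
print(within_ss(standardize(y), good))  # same criterion on standardized data
```

Reallocating cases between clusters amounts to searching for the class vector that makes this quantity as small as possible.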
kmeans() returns a structure with components 'classes' and 'criterion'.
Component classes is an nrows(y) by k1-k2+1 matrix (vector if k2 = k1)
containing the cluster membership at each stage. Component criterion is
a REAL vector of length k1-k2+1 containing the minimized criterion at
each stage.
By default, a brief history of the merging process is printed, including
the values of the criterion being minimized.
kmeans(y, kmax:k1 [, kmin:k2], start:"random") is identical to kmeans(y,
kmax:k1 [, kmin:k2]).
kmeans(y, kmax:k1 [, kmin:k2], start:"optimal") attempts to select the
initial clusters so as to minimize the within-cluster sums of squares
for column 1 of y.
kmeans(y, Means [, kmin:k2], start:"means"), where Means is a k1 by
ncols(y) matrix, selects as initial cluster j those rows of y that are
closer to row j of Means than to any other row of Means (using Euclidean
distance). If kmax:k1 is an argument with k1 != nrows(Means), a warning
message is given and nrows(Means) is used. If there are duplicates
among the rows of Means, a warning message is printed.
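The nearest-row rule used with start:"means" can be sketched as follows. This is an illustrative Python fragment, not MacAnova code. The name initial_clusters is hypothetical, and breaking ties in favor of the lower-numbered row is an assumption of the sketch.

```python
def initial_clusters(rows, means):
    """Return, for each row, the 1-based index of the nearest row of means.

    Squared Euclidean distance gives the same ordering as Euclidean distance.
    Assumption: ties go to the lower-numbered row of means.
    """
    def sqdist(a, b):
        return sum((x - m) ** 2 for x, m in zip(a, b))

    return [
        min(range(len(means)), key=lambda j: sqdist(r, means[j])) + 1
        for r in rows
    ]

y = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5]]
Means = [[1.0, 2.0], [9.0, 9.0]]
print(initial_clusters(y, Means))  # [1, 1, 2, 2]
```

Duplicate rows in Means would leave the higher-numbered duplicate with no cases under this tie rule, which is one reason a warning about duplicates is useful.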
kmeans(y, Classes [, kmin:k2], start:"classes"), where Classes is a
vector of nrows(y) positive integers <= 255, uses Classes to specify
initial clusters. If kmax:k1 is an argument with k1 != max(Classes), a
warning message is given and max(Classes) is used. If there are empty
classes (not all integers between 1 and max(Classes) are present), the
empty classes are "squeezed out", and max(Classes) reduced accordingly.
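The effect of squeezing out empty classes can be sketched as a relabeling. This is an illustrative Python fragment, not MacAnova code. The name squeeze_classes is hypothetical, and keeping the remaining classes in their original numeric order is an assumption of the sketch.

```python
def squeeze_classes(classes):
    """Renumber class labels as 1, 2, ..., m with no gaps.

    Assumption: the surviving classes keep their original relative order.
    """
    labels = sorted(set(classes))
    relabel = {old: new for new, old in enumerate(labels, start=1)}
    return [relabel[c] for c in classes]

Classes = [1, 4, 4, 2, 1]           # class 3 is empty
squeezed = squeeze_classes(Classes)
print(squeezed)                     # [1, 3, 3, 2, 1]
print(max(squeezed))                # 3, reduced from 4
```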
Additional keywords
  standard:F     Do not standardize before clustering.
  weights:wts    Use weighted means and sums of squares, with wts a REAL
                 vector of length nrows(y) with wts[i] > 0.
  quiet:T        Suppress printing of the clustering history.
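One way to read "weighted means and sums of squares" is sketched below in Python. It is not MacAnova code. The name weighted_within_ss is hypothetical, and the exact weighted form (each case's squared deviation from the weighted cluster mean, multiplied by its weight) is an assumption that this page does not confirm.

```python
def weighted_within_ss(rows, classes, wts):
    """Weighted within-cluster sum of squares about weighted cluster means.

    Assumption: case i contributes wts[i] times its squared distance
    from the weighted mean of its cluster.
    """
    ncols = len(rows[0])
    total = 0.0
    for k in set(classes):
        idx = [i for i, c in enumerate(classes) if c == k]
        wsum = sum(wts[i] for i in idx)
        centre = [sum(wts[i] * rows[i][j] for i in idx) / wsum
                  for j in range(ncols)]
        total += sum(wts[i] * (rows[i][j] - centre[j]) ** 2
                     for i in idx for j in range(ncols))
    return total

y = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5]]
classes = [1, 1, 2, 2]
print(weighted_within_ss(y, classes, [1.0, 1.0, 1.0, 1.0]))  # about 1.77
print(weighted_within_ss(y, classes, [3.0, 1.0, 1.0, 1.0]))  # about 1.8425
```

With all weights equal to 1 this reduces to the unweighted criterion, and a larger weight pulls the cluster mean toward that case.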
See also cluster().
Gary Oehlert
2003-01-15