vecread_file

Keywords: variables, files, input, output
This topic discusses the format of files vecread(), readcols() and
readdata() can read.  Such files are plain text files which contain only
REAL or CHARACTER data in unstructured format.

Macro readcols() uses vecread() to read a file.  The only difference in
file format is that, when no variable names are provided to readcols()
or readdata(), the first line of the file is interpreted as containing
variable names.
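
As an illustration, a hypothetical file cols.dat (the file name and
variable names are made up for this sketch) might start with a line of
names followed by the data:

  height weight
  60 110
  65 ?
  70 150

If the description above applies, a call with no variable names should
take the names from the first line:

  Cmd> readcols("cols.dat") # expected to create height and weight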

See topic 'matread_file' for information on the format of files to be
read by matread() and read().

See topic 'vecread_keys' for information on vecread() keywords.

With the help of keyword 'bypass' you can read one of several sets of
data in the same file if the data sets are separated with lines
beginning with the "stop character" (default "!"; see 'vecread_keys').
You can read the third set, say, by including 'bypass:2' as an argument
to vecread() or readcols().  The rest of this topic describes the format
of the lines that follow any bypassed data.

Keyword 'bypass' cannot be used with readdata().
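
For example, suppose a hypothetical file sets.dat (the name and contents
are invented for illustration) holds three data sets separated by lines
starting with the default stop character:

  1 2 3
  !
  4 5 6
  !
  7 8 9

Under the rules above, the following should bypass the first two sets
and read the third:

  Cmd> y <- vecread("sets.dat",bypass:2) # y expected to be 7, 8, 9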

REAL data to be read by vecread(), readcols() and readdata() consist of
numbers and codes for MISSING, often on several lines.  When there are
several data items on a line, they are separated by spaces, tabs or
commas.  Any of '?', '.', '*' and 'NA' code for MISSING.  '??', '???',
... are treated as a single missing value.
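
A small sketch of such a file, here called real.dat (a made-up name),
mixing the separators and MISSING codes listed above:

  1.5 2,3
  ? 4 NA
  * ??? 7

Read with the default settings, this should give a vector of length 9
with MISSING in positions 4, 6, 7 and 8:

  Cmd> x <- vecread("real.dat")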

Interpretation of extra commas and non-numeric "fields" (sequences of
characters that are not commas, spaces or tabs) depends on whether
'byfields:T' is an argument to vecread() or readcols().  See topic
'vecread_keys' for details.

readdata() always uses 'byfields:T' in reading REAL data.

What is done with items that are not numbers or missing value codes
depends on whether 'badvalue:badv' is an argument to vecread(),
readcols() or readdata(), where badv is a REAL scalar or MISSING.  See
topic 'vecread_keys'.

CHARACTER data to be read by vecread() or readcols() have no special
format when 'bylines:T' or 'bychars:T' is an argument.  When 'bywords:T'
is an argument, the file is interpreted as consisting of "words"
separated by "white space" or commas.  A word is a sequence of visible
characters other than commas.  An empty word, read as "", is assumed
before a leading comma, after a trailing comma or between two commas
enclosing no visible characters.
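
To make the empty-word rule concrete, consider a hypothetical one-line
file words.txt (name and contents invented here):

  ,alpha beta,,gamma,

Applying the rules above, this line holds six words: "", "alpha",
"beta", "", "gamma" and "".  A call along the following lines should
read them; other keywords may be needed, see topic 'vecread_keys':

  Cmd> w <- vecread("words.txt",bywords:T)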

When all the data lines are at the start of the file and start with the
same character, for example '%', you can restrict reading to them by
including 'go:"%"' as an argument to vecread() or readcols().

Since keywords 'bywords', 'bychars' and 'bylines' are illegal for
readdata(), it cannot be used for reading CHARACTER data.

Data can be terminated by a "stop character" (default '!').  With
'bychars:T' or 'bylines:T' this is recognized only as the first
character in a line.

You can change the stopping character by, say, 'stop:"$"'.  See topic
'vecread_keys' for details.
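
As a sketch, a hypothetical file priced.dat (name and contents are
assumptions) might end its data with a line starting with '$' instead
of '!':

  10 20 30
  40 50
  $
  notes that should not be read

  Cmd> p <- vecread("priced.dat",stop:"$") # expected to stop after 50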

With readdata(), or with 'byfields:T' or 'bywords:T' on vecread() or
readcols(), the stopping character must be at the start of a field or
word.  In other situations, it can be anywhere in the file.

Lines starting with a "skip character" specified by an argument of the
form, say, 'skip:"#"' are skipped.  See topic 'vecread_keys' for more
information.

The default skip character is '#'.  If you don't want any skip
character, perhaps because there are lines in the file starting with '#'
that should be read, use 'skip:""' as an argument (not with readdata()).
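
For instance, a hypothetical file notes.txt (invented for this sketch)
might consist entirely of lines that start with '#' but are meant to be
read as CHARACTER data:

  #1 first note
  #2 second note

With the default skip character both lines would be skipped.  Turning
it off as described above should let them be read, one element per
line:

  Cmd> notes <- vecread("notes.txt",bylines:T,skip:"")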

You can write a file vecdata.txt of REAL data that vecread(),
readcols() and readdata() can read by

  Cmd> print(x,new:T,file:"vecdata.txt",header:F,labels:F,missing:"?")

where x is a REAL vector or matrix.  When x is a matrix, it is written
row by row as readcols() expects.  If you want it written column by
column, use x' as an argument.  You can specify the format or the number
of significant digits by keywords 'format' and 'nsig'.  See print().
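
A round trip can serve as a quick check.  The following sketch assumes
a writable working directory, uses made-up variable names, and assumes
that ? is accepted as a MISSING value inside vector().  It writes a
short vector containing a MISSING value and reads it back:

  Cmd> x <- vector(1.5,?,3)
  Cmd> print(x,new:T,file:"vecdata.txt",header:F,labels:F,missing:"?")
  Cmd> y <- vecread("vecdata.txt") # y should match x, MISSING included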


Gary Oehlert 2003-01-15