
vecread()

Usage:
vecread(FileName [,keyword phrases]), FileName a CHARACTER scalar (REAL
  data)
vecread(FileName, bywords:T [,keyword phrases]) (CHARACTER data)
vecread(FileName, bylines:T [,keyword phrases]) (CHARACTER data)
vecread(FileName, bychars:T [,keyword phrases]) (CHARACTER data)
Keyword phrases are: silent:T, quiet:F, echo:F or echo:T, prompt:F,
  printname:F, badkeyok:T, nofileok:T, stop:stopChar or go:goChar,
  skip:skipChar, skipthru:skipthruChar, n:N, startline:M, bypass:P,
  badvalue:val, byfields:T; stopChar, goChar, skipChar and skipthruChar
  CHARACTER scalars consisting of one character, N > 0, M > 0, P >= 0
  integers, val a REAL scalar or MISSING.  See topic 'vecread_keys'.
FileName can also be CONSOLE or have the form string:charVal where
charVal is a CHARACTER scalar or vector.



Keywords: input, files
This topic has sections on Reading REAL data, with examples, Reading
CHARACTER data, with examples, Reading a matrix with vecread(),
Controlling the lines to read, "Reading" from a CHARACTER variable,
Reading from the keyboard or batch file, and Other optional keyword
phrases that may always be used.

Topics 'vecread_file' and 'vecread_keys' provide additional information
on file format and keyword use for vecread().

vecread() reads data from a text file sequentially, row by row, starting
at the beginning of the file, interpreting items as numerical or
character data depending on keyword phrases.

The general usage of vecread is

  Cmd> Var <- vecread(FileName [,keyword phrases])

where FileName is a quoted string or a CHARACTER variable.  In windowed
versions (Macintosh, Windows, Motif), when FileName is "", you are
prompted to enter the file name using a dialog box.  Var becomes a REAL
or CHARACTER vector or possibly (with 'nofileok:T') NULL.

vecread(FileName [,keyword phrases], bypass:P), where P > 0 is an
integer, skips all lines until P lines starting with the "stop
character" (default '!'; see below) have been read.  bypass:P can be
used with any other keyword phrases.  Other keywords affecting which
lines are read have no effect until after the P-th line starting with
the stop character.  This allows you to have several data sets in the
same file, separated by lines starting with the stop character.

vecread(FileName [,keyword phrases], startline:M), where M > 0 is an
integer, completely ignores the first M-1 lines in the file (or after
the P-th line starting with the stop character with 'bypass:P').
startline:M can be used with any other keyword phrases.
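
The effect of bypass:P and startline:M on which lines are scanned can be
sketched in Python (not MacAnova).  This is only an illustration under
stated assumptions: the function name select_lines and its arguments are
hypothetical, the "file" is a list of lines, and the stop character is
tested only at the start of a line.  It also assumes that the M-1 lines
ignored by startline:M are not checked for the stop character.

```python
def select_lines(lines, bypass=0, startline=1, stop="!"):
    """Return the lines that would be scanned for data (line-level sketch)."""
    remaining = iter(lines)

    # bypass:P -- skip everything up to and including the P-th line that
    # starts with the stop character.
    passed = 0
    while passed < bypass:
        line = next(remaining, None)
        if line is None:
            return []              # fewer than P stop lines in the input
        if line.startswith(stop):
            passed += 1

    # startline:M -- ignore the next M-1 lines (assumption: no stop check).
    for _ in range(startline - 1):
        if next(remaining, None) is None:
            return []

    # Scan until the next line that starts with the stop character.
    selected = []
    for line in remaining:
        if line.startswith(stop):
            break
        selected.append(line)
    return selected


# Three data sets in one "file", separated by lines starting with "!".
lines = ["1 2", "3 4", "!", "5 6", "7 8", "!", "9 10", "11 12", "13 14", "!"]
first = select_lines(lines)                        # ["1 2", "3 4"]
third = select_lines(lines, bypass=2)              # ["9 10", "11 12", "13 14"]
tail = select_lines(lines, bypass=2, startline=2)  # ["11 12", "13 14"]
```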

                           Reading REAL data
vecread(FileName) and vecread(FileName, byfields:T) read numbers from
the file with name FileName and return a REAL vector containing the
data.

Data of the form '?', '??', '???', ... as well as an isolated 'NA',
period '.'  or asterisk '*' are read as MISSING.

A number whose magnitude is too large to be represented in the computer
(for example, -3.1e10000) is read as MISSING.

Reading REAL data without byfields:T or with byfields:F
 The file should contain a sequence of numbers or missing value codes,
 separated by tabs, spaces or single commas.

 Unreadable items are skipped and an informative message is printed
 once.  Numbers are extracted from "words" like '-1.2a5' which is
 interpreted as if it were '-1.2 a 5'.  Single commas between items are
 ignored; a sequence of m commas is treated as m-1 unreadable items.

Reading REAL data with byfields:T
 The file is interpreted as a sequence of possibly empty "fields"
 separated by commas, spaces, tabs and ends of lines, with each field
 becoming an element of the result.  A field that is not a number or
 missing value code is unreadable and is returned as MISSING.  This
 includes fields like '-1.2a5' that contain one or more digits.  Empty
 fields, before a leading comma, after a trailing comma and between two
 commas with no intervening visible characters, are returned as MISSING.

vecread(FileName, badvalue:BadVal [,byfields:T]), where BadVal is a REAL
scalar or MISSING (?), returns a REAL vector with BadVal substituted for
every unreadable item.  For example, when reading '-1.2a5 17 ?',
vecread(FileName, badvalue:-99) returns vector(-1.2,-99,5,17,?) and
vecread(FileName, byfields:T, badvalue:-99) returns vector(-99,17,?).
With byfields:T this enables you to distinguish between codes for
MISSING and non-numeric items.
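
The rules for reading REAL data without byfields:T can be approximated by
the following Python sketch (an emulation, not MacAnova code).  The names
read_real, MISSING and SKIP are hypothetical, None stands in for MISSING,
and the number pattern is an assumption, since the exact number syntax
accepted by vecread() is not given in this topic.  Leading commas are
simply ignored here, which is also an assumption.

```python
import math
import re

MISSING = None     # stand-in for MacAnova's MISSING value
SKIP = object()    # marker: no badvalue given, so unreadable items are dropped

# Assumed number syntax: optional sign, digits with optional decimal point,
# optional exponent.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")
# Missing value codes: '?', '??', ..., and an isolated 'NA', '.' or '*'.
MISSING_CODE = re.compile(r"\?+|NA|\.|\*")


def read_real(text, badvalue=SKIP):
    """Emulate vecread() without byfields:T on a string (sketch only)."""
    items = []

    def unreadable():
        if badvalue is not SKIP:
            items.append(badvalue)

    after_comma = False
    for token in re.findall(r",|[^,\s]+", text):
        if token == ",":
            if after_comma:        # m commas in a row give m-1 unreadable items
                unreadable()
            after_comma = True
            continue
        after_comma = False
        if MISSING_CODE.fullmatch(token):
            items.append(MISSING)
            continue
        pos = 0
        for match in NUMBER.finditer(token):
            if match.start() > pos:    # non-numeric text before this number
                unreadable()
            value = float(match.group())
            items.append(MISSING if math.isinf(value) else value)
            pos = match.end()
        if pos < len(token):           # non-numeric text after the last number
            unreadable()
    return items


with_bad = read_real("-1.2a5 17 ?", badvalue=-99)  # [-1.2, -99, 5.0, 17.0, None]
without_bad = read_real("-1.2a5 17 ?")             # [-1.2, 5.0, 17.0, None]
```

The first result matches vector(-1.2,-99,5,17,?) given above; in the second
the unreadable 'a' is dropped because no badvalue was supplied.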

                       Reading REAL data examples
File "data1.txt" looks like the following:
   Henry   Male   67.3,10.5
   Susan   Female 59.2,   ?

File "data2.txt" looks like the following (note the extra comma):
   Henry   Male   67.3,   10.5
   Susan   Female 59.2,  ,   ?

File "data3.txt" looks like the following (note digits in fields):
   Henry   Season_1 67.3,10.5
   Susan   Season_2 59.2,   ?

  vecread("data1.txt") returns vector(67.3,10.5,59.2,?).

  vecread("data1.txt",byfields:T) returns vector(?,?,67.3,10.5,?,?,59.2,
   ?).

  Both vecread("data1.txt", badvalue:-1) and vecread("data1.txt",
   badvalue:-1,byfields:T) return vector(-1,-1,67.3, 10.5,-1,-1,59.2,?).

  vecread("data2.txt", badvalue:-1) returns vector(-1,-1,67.3,10.5,-1,
   -1,59.2,-1,?).

  vecread("data2.txt", badvalue:-1,byfields:T) returns
   vector(-1,-1,67.3, 10.5,-1,-1, 59.2,?,?).

  vecread("data3.txt") returns vector(1,67.3,10.5,2,59.2,?), extracting
   1 and 2 from Season_1 and Season_2.

  vecread("data3.txt",badvalue:-1) returns vector(-1,-1,1,67.3,10.5,-1,
   -1,2,59.2,?).

  vecread("data3.txt",byfields:T) returns vector(?,?,67.3,10.5,?,?,59.2,
   ?), treating Season_1 and Season_2 as unreadable.

  vecread("data3.txt", byfields:T, badvalue:-1) returns
   vector(-1,-1,67.3, 10.5,-1,-1,59.2,?).

  vecread(string:",1,,2,3,",byfields:T) returns vector(?,1,?,2,3,?)
   (see topic 'vecread_keys' for the use of 'string').
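
The byfields:T results above follow from how fields are delimited.  The
Python sketch below (an emulation, not MacAnova code) reproduces them
under these assumptions: split_fields and read_fields are hypothetical
names, None stands in for MISSING, the number pattern is a guess at the
accepted syntax, and "leading" and "trailing" commas are taken relative
to the whole input rather than to each line.

```python
import math
import re

MISSING = None     # stand-in for MacAnova's MISSING value

# Assumed number syntax; the exact rules are not given in this topic.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")
MISSING_CODE = re.compile(r"\?+|NA|\.|\*")


def split_fields(text):
    """Split text into fields; a comma with nothing before it gives ''."""
    tokens = re.findall(r",|[^,\s]+", text)
    fields = []
    after_comma = True         # start of input behaves like a preceding comma
    for token in tokens:
        if token == ",":
            if after_comma:    # leading comma or two commas in a row
                fields.append("")
            after_comma = True
        else:
            fields.append(token)
            after_comma = False
    if tokens and tokens[-1] == ",":   # trailing comma
        fields.append("")
    return fields


def read_fields(text, badvalue=MISSING):
    """Emulate vecread(..., byfields:T [,badvalue:val]) on a string."""
    result = []
    for field in split_fields(text):
        if field == "" or MISSING_CODE.fullmatch(field):
            result.append(MISSING)     # empty fields stay MISSING
        elif NUMBER.fullmatch(field):
            value = float(field)
            result.append(MISSING if math.isinf(value) else value)
        else:
            result.append(badvalue)    # unreadable, e.g. 'Henry' or '-1.2a5'
    return result


fields = read_fields(",1,,2,3,")       # [None, 1.0, None, 2.0, 3.0, None]
```

Note that an empty field stays MISSING even when badvalue is given, which
is what the "data2.txt" example with badvalue:-1 shows.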

                         Reading CHARACTER data
vecread(FileName, bywords:T), vecread(FileName, bylines:T) and
vecread(FileName, bychars:T) read CHARACTER data from a file.  The
latter two can read data containing commas, spaces or tabs or other
"invisible" characters.

vecread(FileName, bywords:T) returns a CHARACTER vector, each element of
which is a "word" from the file.  For this usage, a word is a sequence
of printable non-blank characters, excluding commas.  Words are
separated by commas, spaces, tabs or other "invisible" characters.

Quotation marks (") are not special and are treated as any other visible
character that is not a comma.

An "empty" word, before a leading comma, after a trailing comma, or
between successive commas with no intervening visible characters, is
returned as the null string "".

vecread(FileName, bylines:T) returns a CHARACTER vector, each element of
which is an entire line read from file FileName.  The lines do not
include an end-of-line character but do include any other "invisible"
or non-printing characters such as tabs.

vecread(FileName, bychars:T) returns a CHARACTER vector, each element of
which is a single character read from file FileName, including any
end-of-line characters (returned as "\n") or other invisible characters.

                    Reading CHARACTER data examples
Here are more examples of reading the sample files used above to
illustrate reading REAL data:

  vecread("data1.txt", bywords:T) returns vector("Henry","Male",
   "67.3","10.5","Susan","Female","59.2","?").

  vecread("data2.txt", bywords:T) returns vector("Henry","Male",
   "67.3","10.5","Susan","Female","59.2","","?").

  vecread("data1.txt", bylines:T) returns
   vector("Henry   Male   67.3,10.5", "Susan   Female 59.2,   ?").

  vecread("data1.txt",bychars:T) returns vector("H","e","n","r","y"," ",
   " "," ","M","a","l","e"," "," "," ","6","7",".","3",",","1","0",".",
   "5","\n","S","u","s","a","n"," "," "," ","F","e","m","a","l","e"," ",
   "5","9",".","2",","," "," "," ","?","\n").

  vecread(string:",,a,b,", bywords:T) returns vector("","","a","b","")
    (see below for the use of 'string').
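
As a rough Python analogue (not MacAnova code) of the bylines:T and
bychars:T results above, the contents of "data1.txt" can be held in a
string and split in the two ways.  The names by_lines and by_chars are
hypothetical, and the file is assumed to end with a newline, as the
bychars:T example suggests.

```python
def by_lines(text):
    """One element per line, without the end-of-line character."""
    lines = text.split("\n")
    if lines and lines[-1] == "":   # text ended with a newline
        lines.pop()
    return lines


def by_chars(text):
    """One element per character, end-of-line characters included."""
    return list(text)


data1 = "Henry   Male   67.3,10.5\nSusan   Female 59.2,   ?\n"
lines = by_lines(data1)   # two elements, one per line
chars = by_chars(data1)   # 50 single characters, two of them "\n"
```

Splitting on "\n" alone keeps tabs and other invisible characters inside
each line, as described for bylines:T.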

The following creates macro isnumber that tests whether each "word" of
a CHARACTER scalar or vector represents a valid number:

  Cmd> isnumber <- macro("@tmp <-  paste(vecread(string:$1,bywords:T))
    !ismissing(vecread(string:@tmp,badvalue:?,byfields:T))",\
    dollars:T)

Then
  isnumber("3.45") returns True, isnumber("3b45") returns False, and
  isnumber("3.4 4.5 A") returns vector(T,T,F).

                    Reading a matrix with vecread()
When the file contains a data matrix consisting of n rows of data, each
of k items, you can read the data into an n by k matrix by
  Cmd> x <- matrix(vecread(FileName [,byfields:T]),k)'
or, for CHARACTER data,
  Cmd> x <- matrix(vecread(FileName, bywords:T),k)'

The transpose is needed because vecread() reads row by row, but matrices
are filled column by column.
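
The need for the transpose can be seen in a small Python sketch (not
MacAnova code).  The helper names fill_by_columns and transpose are
hypothetical; fill_by_columns mimics filling a matrix with a given number
of rows column by column, as described above.

```python
def fill_by_columns(values, nrows):
    """Build a matrix (list of rows) with nrows rows, filled by columns."""
    if len(values) % nrows != 0:
        raise ValueError("length of values must be a multiple of nrows")
    ncols = len(values) // nrows
    return [[values[c * nrows + r] for c in range(ncols)]
            for r in range(nrows)]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


# Two rows of k = 3 items each, in the row-by-row order of the file.
items = [11, 12, 13, 21, 22, 23]
k = 3
by_columns = fill_by_columns(items, k)  # 3 by 2: [[11, 21], [12, 22], [13, 23]]
x = transpose(by_columns)               # 2 by 3: [[11, 12, 13], [21, 22, 23]]
```

Each file row first lands in a column of the k-row matrix; transposing
turns those columns back into rows.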

If there are several matrices in a file, separated by lines starting
with the stop character (default "!"), you can read the third one, say,
by
  Cmd> x <- matrix(vecread(FileName, bypass:2), k)'

                     Controlling the lines to read
In addition to 'bypass:P' and 'startline:M', keyword phrases 'skip:C',
'skipthru:C', 'stop:C' and 'go:C' control which lines will be scanned
for data.  In each case C is a single character such as "#" or "!".
These are referred to as the "skip character", the "skipthru character",
the "stop character" and the "go character".  Except for the stop
character, these have no effect until after the lines skipped by
bypass:P and startline:M.

The default stop character is "!", whether reading numerical or
CHARACTER data.  That is, a '!' terminates scanning the file.  There are
no defaults for the skip, skipthru or go characters.

Briefly, lines starting with the skip character are skipped, as are all
lines up to and including the first line starting with the skipthru
character, and reading is terminated by the stop character or a line
that does not start with the go character.  When the skipthru character
is "\n", reading will start after the first completely empty line.  See
topic 'vecread_keys' for details.

If '!' appears in the file as other than a stop character, you should
use 'stop:C', where C is a character that does not occur in the file.
If the file consists solely of standard ASCII characters, 'stop:"\377"'
is a good choice.

With 'bylines:T' and 'bychars:T' and when skipping lines as controlled
by 'bypass:P', the stop character is recognized only as the first
character in a line.  With 'bywords:T' or 'byfields:T' it is recognized
only as the first character in a word or field.  Otherwise it is
recognized at any position in a line.

When you use keyword phrase n:N (see above), reading is terminated when
N items have been read.  When a stop character is found or a line that
does not start with the go character is found before N items have been
read, reading is stopped and a warning message is printed.

                  "Reading" from a CHARACTER variable
vecread(string:CharVar [, keywords]) where CharVar is a CHARACTER scalar
or vector, does not read from a file.  Instead, it "reads" CharVar as if
it were a file.  See topic 'vecread_keys' for details.

Here is a particularly useful way to use keyword 'string':

  Cmd> x <- vecread(string:CLIPBOARD [,byfields:T])
and
  Cmd> x <- vecread(string:CLIPBOARD,bywords:T)

would read data from the special variable CLIPBOARD.  In the Macintosh,
Windows and Motif versions this would be taken from the Clipboard.
Pre-defined macro fromclip() makes use of this feature to read data on
the Clipboard.  In the Motif version, you can use special variable
SELECTION in a similar way to read the current X selection.  See topic
'CLIPBOARD'.

                Reading from the keyboard or batch file
vecread(CONSOLE [,keywords]) reads what you type rather than a file.  If
a variable CONSOLE exists, its value is ignored.  In windowed versions
(Macintosh, Windows, Motif) a dialog box is displayed in which you enter
data; in other versions, you are prompted to type in the data.  The
prompt can be suppressed by 'prompt:F'.

Data should be typed in one of the formats just described.  To stop
input, type the stop character (default '!'), followed by RETURN, or, if
you provided a go character (see 'vecread_keys'), type a line starting
with any other character.  In windowed versions, clicking on the "Done"
or "Cancel" button in the dialog box also ends input.

In a batch file vecread(CONSOLE [,...]) reads the immediately following
lines as the data file.  For this usage it is essential that either a
stop character terminates the data or keyword phrase n:N limits the
number of items read.  You will probably want to use 'prompt:F' in this
case.  See also batch().

        Other optional keyword phrases that may always be used
You can also specify the file name by 'file:FileName', which need not be
the first argument.  In addition, you can replace FileName by
'string:CharVar', where CharVar is a CHARACTER scalar or vector which is
"read" as if it were a file.

You can use keyword phrases 'echo:T', 'quiet:T', 'quiet:F', 'silent:T'
and 'printname:F' to control what vecread() prints.

You can ignore duplicated or unrecognized keywords by 'badkeyok:T'.
This is useful in writing macros, since you can have a line like
  @x <- vecread(string:@lines, stop:"$", badkeyok:T, $K)
without checking to see whether 'string', 'stop' or a non-vecread
keyword is a macro argument and thus included when $K is expanded.  See
topics 'macros' and 'macro_syntax'.

Keyword phrase 'nofileok:T' instructs vecread() to return NULL instead
of aborting when it is unable to open a file.  This is useful in writing
robust macros that use vecread().

See topic 'vecread_keys' for details on these keywords.

See also topics readcols(), matread(), macroread(), batch(),
'vecread_file', 'vecread_keys', 'files', console(), 'vectors'.


Gary Oehlert 2003-01-15