Index
- NAME
- VERSION
- SYNOPSIS
- DESCRIPTION
- FILE FORMAT
- PUBLIC FUNCTIONS
- OBJECT METHODS
- TIE-ARRAY METHODS
- PRIVATE METHODS
- REQUIRES
- INSTALLATION
- SEE ALSO
- BUGS
- AUTHOR
- COPYRIGHT AND LICENCE
NAME
Tie::FieldVals - an array tie for a file of enhanced Field:Value data
VERSION
This describes version 0.6202 of Tie::FieldVals.
SYNOPSIS
    use Tie::FieldVals;
    use Tie::FieldVals::Row;

    # tie the array
    my @records;
    my $recs_obj = tie @records, 'Tie::FieldVals', datafile=>$datafile;

    # object methods
    my @field_names = $recs_obj->field_names();
DESCRIPTION
This is a Tie object that maps the records in an enhanced Field:Value data file onto an array. Each file holds multiple records, and each record defines its values as Field:Value pairs, with two enhancements: (a) the Value part can extend over more than one line (because the Field names are predefined), and (b) a Field can have multiple values, given by repeating the Field:Value part for that field.
Because it uses the Tie::File module, access to each record is reasonably fast. The Tie::File module also ensures that (a) the whole file doesn't have to be read into memory, (b) record changes are written to the file straight away, and (c) record changes don't require the whole file to be rewritten, just the part of the file after the change.
The advantage of this setup is that one can have useful data files which are plain text, human readable, and human editable, and which can still be accessed faster than XML (I know; I wrote a version of my reporting software using XML data, and even the fastest XML parsers weren't as fast as this setup once there were a reasonable number of records). This also has advantages over a simpler setup where values are given one per line with no indication of which value belongs to which field; the problems with that are that it is harder to fix corrupted data by hand, it is harder to add new fields, and one can't have multi-line data.
It is likewise better than a CSV (Comma-Separated Values) file, because again, with a CSV file, the data is positional and therefore harder to fix and harder to change, and again one can't have multi-line data.
This module is both better and worse than file-oriented databases like DB_File and its variants and extensions (such as MLDBM). This module does not require that each record have a unique key, and the fact that a DBM file is binary makes it not only less correctable, but also less portable. On the downside, this module isn't as fast.
Naturally, if one's data needs are more complex, it is probably better to use a fully-fledged database; this is oriented towards those who don't wish to have the overhead of setting up and maintaining a relational database server, and wish to use something more straightforward.
This comes bundled with other support modules, such as the Tie::FieldVals::Row module. The Tie::FieldVals::Select module is for selecting and sorting a sub-set from a Tie::FieldVals array, and the Tie::FieldVals::Join module is a very simple method of joining two files on a common field.
This distribution includes the fv2xml script, which converts a Tie::FieldVals data file into an XML file, and xml2fv which converts an XML file into a Tie::FieldVals data file.
FILE FORMAT
The data file is in the form of Field:Value pairs, with each record separated by a line with '=' on it. The first record is an "empty" record, which just contains the field names; this lets us know what the legal fields are. A line which doesn't start with a recognised field is considered to be part of the value of the most recent Field.
Example 1
    Name:
    Entry:
    =
    Name:fanzine
    Entry:Fanzines are amateur magazines produced by fans.
    =
    Name:fan fiction (fanfic)
    Entry:Original fiction written by fans of a particular TV Show/Movie
    set in the universe depicted by that work.
    =
The first record just contains Name: and Entry: fields to show that those are the legal fields for this file. The third record gives an example of a value that goes over more than one line.
Example 2
    Author:
    AuthorEmail:
    AuthorURL:
    AuthorURLName:
    =
    Author:Adele
    AuthorEmail:adele@example.com
    AuthorEmail:adele@example.tas.edu
    AuthorURL:
    AuthorURLName:
    =
    Author:Danzer,Brenda
    AuthorEmail:
    AuthorURL:http://www.example.com/~danzer
    AuthorURLName:Danzer Dancing
    AuthorURL:http://www.brendance.com/
    AuthorURLName:BrenDance
    =
This one gives examples of multi-valued fields.
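To make the repeated-field convention concrete, here is a minimal sketch in plain Perl that writes one record in this format. The helper name format_record and the hash-of-arrays layout are assumptions made for illustration; they are not part of the Tie::FieldVals distribution, and the module's own writer may order or handle values differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Tie::FieldVals): turn a record held as
# { field => [values] } into Field:Value lines followed by a separator.
sub format_record {
    my ($fields, $rec) = @_;
    my $out = '';
    for my $field (@$fields) {
        my @values = @{ $rec->{$field} || [] };
        @values = ('') unless @values;      # an empty field still gets a line
        $out .= "$field:$_\n" for @values;  # one line per value of the field
    }
    return $out . "=\n";                    # record separator on its own line
}

my @fields = qw(Author AuthorEmail AuthorURL AuthorURLName);
my $text = format_record(\@fields, {
    Author      => ['Adele'],
    AuthorEmail => ['adele@example.com', 'adele@example.tas.edu'],
});
print $text;
```

Run as written, this prints the same lines as the second record of Example 2: the AuthorEmail field appears twice, once per value, and the fields with no values appear once with nothing after the colon.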
Gotchas
Field names cannot have spaces in them; indeed, they must consist only of plain alphanumeric characters or underscores. They are case-sensitive.
The record separator (=) must be on a line by itself, and the last record in the file must also have a record-separator after it.
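The rules above can be sketched as a small parser in plain Perl. This is an illustration of the format only, under stated assumptions: parse_fieldvals is a hypothetical name, the record layout ({ field => [values] }) is invented for the example, and empty values are kept as empty strings. It is not how Tie::FieldVals itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical parser for illustration only (not the module's code).
# Returns (\@field_names, \@records); each record is { field => [values] }.
sub parse_fieldvals {
    my ($text) = @_;
    my (@fields, %legal, @records);
    my $rec         = {};   # record currently being built
    my $last;               # most recent field seen in this record
    my $seen_header = 0;    # true once the field-name record has ended

    for my $line (split /\n/, $text) {
        if ($line eq '=') {                        # separator on its own line
            if ($seen_header) { push @records, $rec }
            else              { $seen_header = 1 } # first record only names fields
            ($rec, $last) = ({}, undef);
        }
        elsif ($line =~ /^(\w+):(.*)$/ and (!$seen_header or $legal{$1})) {
            my ($field, $value) = ($1, $2);
            if (!$seen_header) {                   # still reading the legal fields
                push @fields, $field;
                $legal{$field} = 1;
            }
            else {
                push @{ $rec->{$field} }, $value;  # repeated field: extra value
                $last = $field;
            }
        }
        elsif (defined $last) {                    # not a recognised field:
            $rec->{$last}[-1] .= "\n$line";        # continue the previous value
        }
    }
    return (\@fields, \@records);
}

my $data = <<'END';
Name:
Entry:
=
Name:fanzine
Entry:Fanzines are amateur magazines produced by fans.
=
Name:fan fiction (fanfic)
Entry:Original fiction written by fans of a particular TV Show/Movie
set in the universe depicted by that work.
=
END

my ($fields, $records) = parse_fieldvals($data);
print "Fields: @$fields\n";
print "Records: ", scalar(@$records), "\n";
```

Note that a line such as "http://www.example.com/" inside a value is treated as a continuation, because "http" is not one of the legal field names declared in the first record.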
PUBLIC FUNCTIONS
find_field_names
my @field_names = Tie::FieldVals::find_field_names($datafile);
Read the field-name information from the file, if the file exists and is readable.
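As a rough picture of what reading the field-name record involves, the following plain-Perl sketch collects the names from the first record of a file. The function sketch_field_names is a hypothetical stand-in written for this example, and returning an empty list for a missing or unreadable file is an assumption; the real find_field_names may behave differently in that case.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for illustration; not the module's own code.
sub sketch_field_names {
    my ($datafile) = @_;
    return () unless -r $datafile;           # assumption: no file, no names
    open my $in, '<', $datafile or return ();
    my @names;
    while (my $line = <$in>) {
        chomp $line;
        last if $line eq '=';                # end of the field-name record
        push @names, $1 if $line =~ /^(\w+):/;
    }
    close $in;
    return @names;
}

# Write a small data file containing only the field-name record.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "Author:\nAuthorEmail:\nAuthorURL:\nAuthorURLName:\n=\n";
close $fh;

my @names = sketch_field_names($path);
print join(', ', @names), "\n";
```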
OBJECT METHODS
field_names
Get the field names of this data.
my @field_names = $recs_obj->field_names();
flock
$recs_obj->flock(MODE);
Locks the data file. "MODE" has the same meaning as the second argument to the Perl built-in "flock" function; for example "LOCK_SH" or "LOCK_EX | LOCK_NB". (These constants are provided by the "use Fcntl ':flock';" declaration.)
"MODE" is optional; the default is "LOCK_EX".
When you use "flock" to lock the file, "Tie::FieldVals" assumes that the record cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to "flock" discards the contents of the record cache.
The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.
See flock in Tie::File for more information (this calls the flock method of that module).
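Since this method delegates to Tie::File, the lock-then-untie pattern can be sketched with Tie::File alone, which is a core module. The temporary file and the lines written to it are made up for the example; with Tie::FieldVals the corresponding call would be $recs_obj->flock(LOCK_EX) on the object returned by tie.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use File::Temp qw(tempfile);
use Tie::File;

# A scratch file for the demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

my @lines;
my $obj = tie @lines, 'Tie::File', $path, mode => O_RDWR | O_CREAT
    or die "Cannot tie $path: $!";

# Take an exclusive lock before modifying the file.
$obj->flock(LOCK_EX) or die "Cannot lock $path: $!";
push @lines, 'Name:', 'Entry:', '=';

# Unlock by discarding the object and untying the array, as advised above.
undef $obj;      # drop our reference first so untie has nothing left over
untie @lines;    # pending writes are flushed and the lock is released
```

Dropping the object reference before calling untie avoids Perl's warning about untying while inner references still exist.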
TIE-ARRAY METHODS
TIEARRAY
Create a new instance of the object as tied to an array.
    tie @people, 'Tie::FieldVals', datafile=>$datafile;

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        mode=>O_RDONLY, cache_size=>1000, memory=>0;

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        fields=>[qw(Name Email)], mode=>(O_RDWR|O_CREAT);

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        mode=>O_RDWR, cache_all=>1;
Arguments:
- datafile
The file with the data in it. (required)
- fields
Field definitions for creating a new file. This is ignored if the file already exists.
- mode
The mode to open the file with. O_RDONLY means that the file is read-only. O_RDWR means that the file is read-write. (default: O_RDONLY)
- cache_all
If true, cache all the records in the file. This will speed things up, but consume more memory. (default: false)
Note that this merely sets the cache_size to the size of the file when the tie is initially made: if you add more records to the file, the cache size will not be increased.
- cache_size
The size of the cache (if we aren't caching all the records). (default: 100) As ever, there is a trade-off between space and time.
- memory
The upper limit on the memory consumed by Tie::File. (See Tie::File.) (default: 10,000,000)
Note that there are two caches: the cache of unparsed records maintained by Tie::File, and the cache of parsed records maintained by Tie::FieldVals. The memory option affects the Tie::File cache, and the cache_* options affect the Tie::FieldVals cache.
FETCH
Get a row from the array.
$val = $array[$ind];
Returns a reference to a Tie::FieldVals::Row hash, or undef.
STORE
Add a value to the array. Value must be a Tie::FieldVals::Row hash.
$array[$ind] = $val;
If $ind is beyond the end of the array, the value is simply pushed onto the end; the array is not extended with empty records to reach $ind.
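The difference from an ordinary Perl array can be shown with a small sketch. The function store_or_push below is a hypothetical imitation of the rule just described, operating on a plain array of strings; it is not the module's STORE method.

```perl
use strict;
use warnings;

# Hypothetical imitation of the "push, don't extend" rule on a plain array.
sub store_or_push {
    my ($array, $ind, $val) = @_;
    if ($ind >= @$array) { push @$array, $val }      # past the end: append
    else                 { $array->[$ind] = $val }   # in range: overwrite
    return scalar @$array;
}

my @plain = ('rec0');
$plain[3] = 'rec3';                  # ordinary array: grows to 4, with gaps

my @recs = ('rec0');
store_or_push(\@recs, 3, 'rec3');    # described rule: grows to 2, no gaps

print scalar(@plain), " vs ", scalar(@recs), "\n";   # prints "4 vs 2"
```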
FETCHSIZE
Get the size of the array.
STORESIZE
Set the size of the array, if the file is writeable.
EXISTS
exists $array[$ind];
DELETE
delete $array[$ind];
Delete the value at $ind if the file is writeable.
CLEAR
@array = ();
Clear the array if the file is writeable.
UNTIE
untie @array;
Untie the array.
PRIVATE METHODS
This documentation is for developer reference only.
debug
Set debugging on.
whowasi
For debugging: say who called this
set_field_names
Set the field names in the data-file to be the given field names. (Assumes the file didn't exist before).
REQUIRES
    Test::More
    Carp
    Tie::Array
    Tie::File
    Fcntl
    Data::Dumper
    Getopt::Long
    Pod::Usage
    Getopt::ArgvFile
    File::Basename
INSTALLATION
To install this module, run the following commands:
    perl Build.PL
    ./Build
    ./Build test
    ./Build install
Or, if you're on a platform (like DOS or Windows) that doesn't like the "./" notation, you can do this:
    perl Build.PL
    perl Build
    perl Build test
    perl Build install
To install somewhere other than the default location, such as a directory under your home directory like "/home/fred/perl", run
perl Build.PL --install_base /home/fred/perl
as the first step instead.
This will install the files underneath /home/fred/perl.
You will then need to make sure that you alter the PERL5LIB variable to find the modules, and the PATH variable to find the script.
Therefore you will need to change your path, to include /home/fred/perl/script (where the script will be):
    PATH=/home/fred/perl/script:${PATH}
and the PERL5LIB variable, to add /home/fred/perl/lib:
    PERL5LIB=/home/fred/perl/lib:${PERL5LIB}
SEE ALSO
    perl(1)
    Tie::FieldVals::Row
    Tie::FieldVals::Select
    Tie::FieldVals::Join
    Tie::FieldVals::Row::Join
BUGS
Please report any bugs or feature requests to the author.
AUTHOR
    Kathryn Andersen (RUBYKAT)
    perlkat AT katspace dot com
    http://www.katspace.com
COPYRIGHT AND LICENCE
Copyright (c) 2004-2008 by Kathryn Andersen
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.