Index

NAME
VERSION
SYNOPSIS
DESCRIPTION
FILE FORMAT
PUBLIC FUNCTIONS
- find_field_names
OBJECT METHODS
- field_names
- flock
TIE-ARRAY METHODS
- TIEARRAY
- FETCH
- STORE
- FETCHSIZE
- STORESIZE
- EXISTS
- DELETE
- CLEAR
- UNTIE
PRIVATE METHODS
REQUIRES
INSTALLATION
SEE ALSO
BUGS
AUTHOR
COPYRIGHT AND LICENCE

NAME

Tie::FieldVals - an array tie for a file of enhanced Field:Value data

VERSION

This describes version 0.6202 of Tie::FieldVals.

SYNOPSIS

    use Tie::FieldVals;
    use Tie::FieldVals::Row;

    # tie the array
    my @records;
    my $recs_obj = tie @records, 'Tie::FieldVals', datafile=>$datafile;

    # object methods
    my @field_names = $recs_obj->field_names();

DESCRIPTION

This is a Tie object to map the records in an enhanced Field:Value data file into an array. Each file has multiple records, and each record's values are defined by Field:Value pairs, with the enhancements that (a) the Value part can extend over more than one line (because the Field names are predefined) and (b) a Field can have multiple values, given by repeating the Field:Value part for that field.

Because of its use of the Tie::File module, access to each record is reasonably fast. The Tie::File module also ensures that (a) the whole file doesn't have to be read into memory, (b) record changes are written to the file straight away, and (c) record changes don't require the whole file to be rewritten, just the part of the file after the change.

The advantage of this setup is that one can have useful data files which are plain text, human readable, human editable, and at the same time able to be accessed faster than using XML (I know, I wrote a version of my reporting software using XML data, and even the fastest XML parsers weren't as fast as this setup, once there were a reasonable number of records). This also has advantages over a simpler setup where values are given one per line with no indication of what value belongs to what field; the problems with that setup are that it is harder to fix corrupted data by hand, it is harder to add new fields, and one can't have multi-line data.

It is likewise better than a CSV (Comma-Separated Values) file, because again, with a CSV file, the data is positional and therefore harder to fix and harder to change, and again one can't have multi-line data.

This module is both better and worse than file-oriented databases like DB_File and its variants and extensions (such as MLDBM). This module does not require that each record have a unique key, and the fact that a DBM file is binary makes it not only less correctable, but also less portable. On the downside, this module isn't as fast.

Naturally, if one's data needs are more complex, it is probably better to use a fully-fledged database; this is oriented towards those who don't wish to have the overhead of setting up and maintaining a relational database server, and wish to use something more straightforward.

This comes bundled with other support modules, such as the Tie::FieldVals::Row module. The Tie::FieldVals::Select module is for selecting and sorting a sub-set from a Tie::FieldVals array, and the Tie::FieldVals::Join module provides a very simple method of joining two files on a common field.

This distribution includes the fv2xml script, which converts a Tie::FieldVals data file into an XML file, and xml2fv which converts an XML file into a Tie::FieldVals data file.

FILE FORMAT

The data file is in the form of Field:Value pairs, with each record separated by a line with '=' on it. The first record is an "empty" record, which just contains the field names; this lets us know what the legal fields are. A line which doesn't start with a recognised field is considered to be part of the value of the most recent Field.

Example 1

    Name:
    Entry:
    =
    Name:fanzine
    Entry:Fanzines are amateur magazines produced by fans.
    =
    Name:fan fiction (fanfic)
    Entry:Original fiction written by fans of a particular
    TV Show/Movie set in the universe depicted by that work.
    =

The first record just contains Name: and Entry: fields to show that those are the legal fields for this file. The third record gives an example of a value that goes over more than one line.

Example 2

    Author:
    AuthorEmail:
    AuthorURL:
    AuthorURLName:
    =
    Author:Adele
    AuthorEmail:adele@example.com
    AuthorEmail:adele@example.tas.edu
    AuthorURL:
    AuthorURLName:
    =
    Author:Danzer,Brenda
    AuthorEmail:
    AuthorURL:http://www.example.com/~danzer
    AuthorURLName:Danzer Dancing
    AuthorURL:http://www.brendance.com/
    AuthorURLName:BrenDance
    =

This one gives examples of multi-valued fields.

Gotchas

Field names cannot have spaces in them; indeed, they must consist only of plain alphanumeric characters or underscores. They are case-sensitive.

The record separator (=) must be on a line by itself, and the last record in the file must also have a record-separator after it.
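
The rules above can be sketched as a small stand-alone parser. This is not the module's own code: parse_fieldvals is a hypothetical helper name, and the sketch assumes that continuation lines are joined to the value with newlines and that repeated fields are collected into an array of values.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Tie::FieldVals: parse the text of a
# Field:Value file into a list of field names and a list of records.
# Each record is a hash of field name => array ref of values.
sub parse_fieldvals {
    my ($text) = @_;
    my (@names, %legal, @records);
    my ($rec, $field, $seen_header);

    for my $line (split /\n/, $text) {
        if ($line eq '=') {
            # A separator ends the header record or a data record.
            push @records, $rec if $seen_header && $rec;
            $seen_header = 1;
            ($rec, $field) = (undef, undef);
        }
        elsif (!$seen_header) {
            # The first record only declares the legal field names.
            if ($line =~ /^(\w+):/) {
                push @names, $1;
                $legal{$1} = 1;
            }
        }
        elsif ($line =~ /^(\w+):(.*)$/ && $legal{$1}) {
            # A recognised field starts a new value (possibly a repeat).
            $field = $1;
            $rec ||= {};
            push @{ $rec->{$field} }, $2;
        }
        elsif (defined $field) {
            # Anything else continues the most recent field's value.
            $rec->{$field}[-1] .= "\n$line";
        }
    }
    return (\@names, \@records);
}

# The data from Example 1.
my $sample = <<'END';
Name:
Entry:
=
Name:fanzine
Entry:Fanzines are amateur magazines produced by fans.
=
Name:fan fiction (fanfic)
Entry:Original fiction written by fans of a particular
TV Show/Movie set in the universe depicted by that work.
=
END

my ($names, $records) = parse_fieldvals($sample);
print "Fields: @$names\n";
print scalar(@$records), " data records\n";
print $records->[1]{Entry}[0], "\n";
```

Because every value is held in an array, a multi-valued field such as AuthorEmail in Example 2 would simply yield an array with more than one element.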

PUBLIC FUNCTIONS

find_field_names

    my @field_names = Tie::FieldVals::find_field_names($datafile);

Read the field-name information from the file, if the file exists and is readable.

OBJECT METHODS

field_names

Get the field names of this data.

    my @field_names = $recs_obj->field_names();

flock

    $recs_obj->flock(MODE);

Locks the data file. "MODE" has the same meaning as the second argument to the Perl built-in "flock" function; for example "LOCK_SH" or "LOCK_EX | LOCK_NB". (These constants are provided by the "use Fcntl ':flock';" declaration.)

"MODE" is optional; the default is "LOCK_EX".

When you use "flock" to lock the file, "Tie::FieldVals" assumes that the record cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to "flock" discards the contents of the record cache.

The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.

See flock in Tie::File for more information (this calls the flock method of that module).
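
Since this method delegates to Tie::File, the lock-modify-untie pattern can be sketched with Tie::File alone. This is only an illustration under that assumption: the scratch file is made up for the example, and with Tie::FieldVals the call would be made on the object returned by tie, as in "$recs_obj->flock(LOCK_EX)".

```perl
use strict;
use warnings;
use Fcntl ':flock';          # LOCK_SH, LOCK_EX, LOCK_NB
use File::Temp 'tempfile';
use Tie::File;

# Illustrative scratch file; any writable path would do.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nsecond\n";
close $fh;

my @lines;
my $obj = tie @lines, 'Tie::File', $path
    or die "Cannot tie $path: $!";

# Take an exclusive lock before modifying the file.
$obj->flock(LOCK_EX) or die "Cannot lock $path: $!";
$lines[1] = 'changed';

# Unlock by discarding the object and untying the array, rather than
# by unlocking explicitly while the tie is still live.
undef $obj;
untie @lines;
```

Passing "LOCK_EX | LOCK_NB" instead makes the call return false immediately when another process holds the lock, rather than waiting for it.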

TIE-ARRAY METHODS

TIEARRAY

Create a new instance of the object as tied to an array.

    tie @people, 'Tie::FieldVals', datafile=>$datafile;

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        mode=>O_RDONLY, cache_size=>1000, memory=>0;

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        fields=>[qw(Name Email)], mode=>(O_RDWR|O_CREAT);

    tie @people, 'Tie::FieldVals', datafile=>$datafile,
        mode=>O_RDWR, cache_all=>1;

Arguments:

datafile

The file with the data in it. (required)

fields

Field definitions for creating a new file. This is ignored if the file already exists.

mode

The mode in which to open the file: O_RDONLY opens it read-only, O_RDWR opens it read-write. These constants are provided by the Fcntl module. (default: O_RDONLY)

cache_all

If true, cache all the records in the file. This will speed things up, but consume more memory. (default: false)

Note that this merely sets the cache_size to the size of the file when the tie is initially made: if you add more records to the file, the cache size will not be increased.

cache_size

The size of the cache (if we aren't caching all the records). (default: 100) As ever, there is a trade-off between space and time.

memory

The upper limit on the memory consumed by Tie::File. (See Tie::File). (default: 10,000,000)

Note that there are two caches: the cache of unparsed records maintained by Tie::File, and the cache of parsed records maintained by Tie::FieldVals. The memory option affects the Tie::File cache, and the cache_* options affect the Tie::FieldVals cache.

FETCH

Get a row from the array.

    $val = $array[$ind];

Returns a reference to a Tie::FieldVals::Row hash, or undef.

STORE

Add a value to the array. Value must be a Tie::FieldVals::Row hash.

    $array[$ind] = $val;

If $ind is beyond the end of the array, the value is simply pushed onto the end; the array is not extended to reach $ind.
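
The push-instead-of-extend behaviour can be illustrated with a toy tied array. My::PushOnlyArray is a hypothetical class invented for this sketch, not the module's implementation; it keeps its data in memory rather than in a file, and stores plain strings rather than Tie::FieldVals::Row hashes.

```perl
use strict;
use warnings;

package My::PushOnlyArray;   # hypothetical name, for illustration only
use Tie::Array;
our @ISA = ('Tie::StdArray');

# Tie::StdArray keeps the data in the tied object itself (an array ref).
sub STORE {
    my ($self, $ind, $val) = @_;
    if ($ind >= @$self) {
        # Index past the end: append, rather than padding with undef.
        push @$self, $val;
    }
    else {
        $self->[$ind] = $val;
    }
}

package main;

tie my @array, 'My::PushOnlyArray';
$array[0]  = 'a';
$array[10] = 'b';             # lands at index 1, not index 10
print scalar(@array), "\n";   # prints 2
```

An ordinary Perl array would grow to eleven elements here; a file-backed array avoids that, since each padding element would otherwise become an empty record in the file.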

FETCHSIZE

Get the size of the array.

STORESIZE

Set the size of the array, if the file is writeable.

EXISTS

    exists $array[$ind];

DELETE

    delete $array[$ind];

Delete the value at $ind if the file is writeable.

CLEAR

    @array = ();

Clear the array if the file is writeable.

UNTIE

    untie @array;

Untie the array.

PRIVATE METHODS

This documentation is for developer reference only.

debug

Set debugging on.

whowasi

For debugging: report who called this.

set_field_names

Set the field names in the data-file to be the given field names. (Assumes the file didn't exist before).

REQUIRES

    Test::More

    Carp
    Tie::Array
    Tie::File
    Fcntl
    Data::Dumper

    Getopt::Long
    Pod::Usage
    Getopt::ArgvFile
    File::Basename

INSTALLATION

To install this module, run the following commands:

    perl Build.PL
    ./Build
    ./Build test
    ./Build install

Or, if you're on a platform (like DOS or Windows) that doesn't like the "./" notation, you can do this:

    perl Build.PL
    perl Build
    perl Build test
    perl Build install

To install somewhere other than the default, such as a directory under your home directory like "/home/fred/perl", run

    perl Build.PL --install_base /home/fred/perl

as the first step instead.

This will install the files underneath /home/fred/perl.

You will then need to alter the PERL5LIB variable so that Perl can find the modules, and the PATH variable so that the shell can find the scripts.

That is, add /home/fred/perl/script (where the scripts will be) to your PATH:

    PATH=/home/fred/perl/script:${PATH}

and add /home/fred/perl/lib to the PERL5LIB variable:

    PERL5LIB=/home/fred/perl/lib:${PERL5LIB}

BUGS

Please report any bugs or feature requests to the author.

AUTHOR

    Kathryn Andersen (RUBYKAT)
    perlkat AT katspace dot com
    http://www.katspace.com

COPYRIGHT AND LICENCE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.