package Tree::M; use Carp; use DynaLoader; BEGIN { $VERSION = 0.02; @ISA = qw(DynaLoader); bootstrap Tree::M, $VERSION; } =head1 NAME Tree::M - implement M-trees for efficient "metric/multimedia-searches" =head1 SYNOPSIS use Tree::M; $M = new Tree::M =head1 DESCRIPTION (not yet) Ever had the problem of managing multi-dimensional (spatial) data but your database only had one-dimensional indices (b-tree etc.)? Queries like select data from table where latitude > 40 and latitude < 50 and longitude> 50 and longitude< 60; are quite inefficient, unless longitude and latitude are part of the same spatial index (e.g. an R-tree). An M-tree is an index tree that does not directly look at the stored keys but rather requires a I (a metric, e.g. a vector norm) function to be defined that sorts keys according to their distance. In the example above the distance function could be the maximum norm (C). The lookup above would then be something like this: my $res = $M->range([45,55], 5); This module implements an M-tree. Although the data structure and the distance function is arbitrary, the current version only implements n-dimensional discrete vectors and hardwires the distance function to the suared euclidean metric (i.e. C<(x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2 + ...>). Evolution towards more freedom is expected ;) =head2 THE Tree::M CLASS =over 4 =item $M = new Tree::M arg => value, ... Creates a new M-Tree. Before it can be used you have to call one of the C or C methods below. ndims => integer the number of dimensions each vector has range => [min, max, steps] min the lowest allowable scalar value in each dimension max the maximum allowable number steps the number of discrete steps (used when stored externally) pagesize => integer the size of one page on underlying storage. usually 4096, but large objects (ndims > 20 or so) might want to increase this Example: create an M-Tree that stores 8-bit rgb-values: $M = new Tree::M ndims => 3, range => [0, 255, 256]; Example: create an M-Tree that stores coordinates from -1..1 with 100 different steps: $M = new Tree::M ndims => 2, range => [-1, 1, 100]; =item $M->open(path) =item $M->create($path) Open or create the external storage file C<$path> and associate it with the tree. [this braindamaged API will go away ;)] =item $M->insert(\@v, $data) Insert a vector (given by an array reference) into the index and associate it with the value C<$data> (a 32-bit integer). =item $M->sync Synchronize the data file with memory. Useful after calling C to ensure the data actually reaches stable storage. =item $res = $M->range(\@v, $radius) Search all entries not farther away from C<@v> then C<$radius> and return an arrayref containing the searchresults. Each result is again anarrayref composed like this: [\@v, $data] e.g. the same as given to the C method. =item $res = $M->top(\@v, $n) Return the C<$n> "nearest neighbours". The results arrayref (see C) contains the C<$n> index values nearest to C<@v>, sorted for distance. =item $distance = $M->distance(\@v1, \@v2) Calculcate the distance between two vectors, just as they databse engine would do it. =item $depth = $M->maxlevel Return the maximum height of the tree (usually a small integer specifying the length of the path from the root to the farthest leaf) =cut sub new { my $class = shift; my %a = @_; $class->_new( $a{ndims}, $a{range}[0], $a{range}[1], $a{range}[2], $a{pagesize}, ); } =back =head1 BUGS Inserting too many duplicate keys into the tree cause the C++ library to die, so don't do that. =head1 AUTHOR Marc Lehmann . =head1 SEE ALSO perl(1), L. =cut 1;