--- Tree-M/M.pm 2001/05/06 00:45:51 1.1 +++ Tree-M/M.pm 2001/05/06 17:28:23 1.2 @@ -17,6 +17,8 @@ use Tree::M; + $M = new Tree::M + =head1 DESCRIPTION (not yet) @@ -25,32 +27,95 @@ database only had one-dimensional indices (b-tree etc.)? Queries like select data from table where latitude > 40 and latitude < 50 - and longitude> 50 and longitude< 70; + and longitude> 50 and longitude< 60; are quite inefficient, unless longitude and latitude are part of the same -spatial index (e.g. a r-tree). +spatial index (e.g. an R-tree). -An M-tree etc.. etc... +An M-tree is an index tree that does not directly look at the stored keys +but rather requires a I (a metric, e.g. a vector norm) function +to be defined that sorts keys according to their distance. In the example +above the distance function could be the maximum norm (C). The lookup above would then be something like this: + + my $res = $M->range([45,55], 5); + +This module implements an M-tree. Although the data structure and the +distance function is arbitrary, the current version only implements +n-dimensional discrete vectors and hardwires the distance function to the +suared euclidean metric (i.e. C<(x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2 + +...>). Evolution towards more freedom is expected ;) =head2 THE Tree::M CLASS =over 4 -=item $M = new Tree::M arg => value, ... +=item $M = new Tree::M ndims, min, max, steps + +Creates a new M-Tree. Before it can be used you have to call one of the +C or C methods. + + ndims the number of dimensions each vector has + min the lowest allowable scalar value in each dimension + max the maximum allowable number + steps the number of discrete steps (used when stored externally) + +Example: create an M-Tree that stores 8-bit rgb-values: + + $M = new Tree::M 3, 0, 255, 256; + +Example: create an M-Tree that stores coordinates from -1..1 with 100 different steps: + + $M = new Tree::M 2, -1, 1, 100; + +=item $M->open(path) + +=item $M->create($path) + +Open or create the external storage file C<$path> and associate it with the tree. + +[this braindamaged API will go away ;)] + +=item $M->insert(\@v, $data) -... +Insert a vector (given by an array reference) into the index and associate +it with the value C<$data> (a 32-bit integer). - distance => specify a distance function. the default distance - method uses this function in some way. - keyof => the keyof method uses this argument to determine - the key of an object - min => the minimum number of objects stored in a node - max => the maximum number of objects stored in a node +=item $res = $M->range(\@v, $radius) + +Search all entries not farther away from C<@v> then C<$radius> and return +an arrayref containing the searchresults. + +Each result is again anarrayref composed like this: + + [\@v, $data] + +e.g. the same as given to the C method. + +=item $res = $M->top(\@v, $n) + +Return the C<$n> "nearest neighbours". The results arrayref (see C) +contains the C<$n> index values nearest to C<@v>, sorted for distance. + +=item $distance = $M->distance(\@v1, \@v2) + +Calculcate the distance between two vectors, just as they databse engine +would do it. + +=item $depth = $M->maxlevel + +Return the maximum height of the tree (usually a small integer specifying +the length of the path from the root to the farthest leaf) =cut =back +=head1 BUGS + +Inserting too many duplicate keys into the tree cause the C++ library to +die, so don't do that. + =head1 AUTHOR Marc Lehmann .