Revision 1.1 by

Revision 1.2 by

… | … | ||
---|---|---|---|

15 | 15 | ||

16 | =head1 SYNOPSIS | 16 | =head1 SYNOPSIS |

17 | 17 | ||

18 | use Tree::M; | 18 | use Tree::M; |

19 | 19 | ||

20 | $M = new Tree::M | ||

21 | |||

20 | =head1 DESCRIPTION | 22 | =head1 DESCRIPTION |

21 | 23 | ||

22 | (not yet) | 24 | (not yet) |

23 | 25 | ||

24 | Ever had the problem of managing multi-dimensional (spatial) data but your | 26 | Ever had the problem of managing multi-dimensional (spatial) data but your |

25 | database only had one-dimensional indices (b-tree etc.)? Queries like | 27 | database only had one-dimensional indices (b-tree etc.)? Queries like |

26 | 28 | ||

27 | select data from table where latitude > 40 and latitude < 50 | 29 | select data from table where latitude > 40 and latitude < 50 |

28 | and longitude> 50 and longitude< 70; | 30 | and longitude> 50 and longitude< 60; |

29 | 31 | ||

30 | are quite inefficient, unless longitude and latitude are part of the same | 32 | are quite inefficient, unless longitude and latitude are part of the same |

31 | spatial index (e.g. a r-tree). | 33 | spatial index (e.g. an R-tree). |

32 | 34 | ||

33 | An M-tree etc.. etc... | 35 | An M-tree is an index tree that does not directly look at the stored keys |

36 | but rather requires a I<distance> (a metric, e.g. a vector norm) function | ||

37 | to be defined that sorts keys according to their distance. In the example | ||

38 | above the distance function could be the maximum norm (C<max(x1-x2, | ||

39 | y1-y2)>). The lookup above would then be something like this: | ||

40 | |||

41 | my $res = $M->range([45,55], 5); | ||

42 | |||

43 | This module implements an M-tree. Although the data structure and the | ||

44 | distance function is arbitrary, the current version only implements | ||

45 | n-dimensional discrete vectors and hardwires the distance function to the | ||

46 | suared euclidean metric (i.e. C<(x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2 + | ||

47 | ...>). Evolution towards more freedom is expected ;) | ||

34 | 48 | ||

35 | =head2 THE Tree::M CLASS | 49 | =head2 THE Tree::M CLASS |

36 | 50 | ||

37 | =over 4 | 51 | =over 4 |

38 | 52 | ||

39 | =item $M = new Tree::M arg => value, ... | 53 | =item $M = new Tree::M ndims, min, max, steps |

40 | 54 | ||

41 | ... | 55 | Creates a new M-Tree. Before it can be used you have to call one of the |

56 | C<create> or C<open> methods. | ||

42 | 57 | ||

43 | distance => specify a distance function. the default distance | 58 | ndims the number of dimensions each vector has |

44 | method uses this function in some way. | 59 | min the lowest allowable scalar value in each dimension |

45 | keyof => the keyof method uses this argument to determine | 60 | max the maximum allowable number |

46 | the key of an object | 61 | steps the number of discrete steps (used when stored externally) |

47 | min => the minimum number of objects stored in a node | 62 | |

48 | max => the maximum number of objects stored in a node | 63 | Example: create an M-Tree that stores 8-bit rgb-values: |

64 | |||

65 | $M = new Tree::M 3, 0, 255, 256; | ||

66 | |||

67 | Example: create an M-Tree that stores coordinates from -1..1 with 100 different steps: | ||

68 | |||

69 | $M = new Tree::M 2, -1, 1, 100; | ||

70 | |||

71 | =item $M->open(path) | ||

72 | |||

73 | =item $M->create($path) | ||

74 | |||

75 | Open or create the external storage file C<$path> and associate it with the tree. | ||

76 | |||

77 | [this braindamaged API will go away ;)] | ||

78 | |||

79 | =item $M->insert(\@v, $data) | ||

80 | |||

81 | Insert a vector (given by an array reference) into the index and associate | ||

82 | it with the value C<$data> (a 32-bit integer). | ||

83 | |||

84 | =item $res = $M->range(\@v, $radius) | ||

85 | |||

86 | Search all entries not farther away from C<@v> then C<$radius> and return | ||

87 | an arrayref containing the searchresults. | ||

88 | |||

89 | Each result is again anarrayref composed like this: | ||

90 | |||

91 | [\@v, $data] | ||

92 | |||

93 | e.g. the same as given to the C<insert> method. | ||

94 | |||

95 | =item $res = $M->top(\@v, $n) | ||

96 | |||

97 | Return the C<$n> "nearest neighbours". The results arrayref (see C<range>) | ||

98 | contains the C<$n> index values nearest to C<@v>, sorted for distance. | ||

99 | |||

100 | =item $distance = $M->distance(\@v1, \@v2) | ||

101 | |||

102 | Calculcate the distance between two vectors, just as they databse engine | ||

103 | would do it. | ||

104 | |||

105 | =item $depth = $M->maxlevel | ||

106 | |||

107 | Return the maximum height of the tree (usually a small integer specifying | ||

108 | the length of the path from the root to the farthest leaf) | ||

49 | 109 | ||

50 | =cut | 110 | =cut |

51 | 111 | ||

52 | =back | 112 | =back |

113 | |||

114 | =head1 BUGS | ||

115 | |||

116 | Inserting too many duplicate keys into the tree cause the C++ library to | ||

117 | die, so don't do that. | ||

53 | 118 | ||

54 | =head1 AUTHOR | 119 | =head1 AUTHOR |

55 | 120 | ||

56 | Marc Lehmann <pcg@goof.com>. | 121 | Marc Lehmann <pcg@goof.com>. |

57 | 122 |

– |
Removed lines |

+ |
Added lines |

< |
Changed lines |

> |
Changed lines |