/cvs/Geo-LatLon2Place/LatLon2Place.pm

Comparing Geo-LatLon2Place/LatLon2Place.pm (file contents):
Revision 1.1 by root, Mon Mar 14 02:41:51 2022 UTC vs.
Revision 1.2 by root, Mon Mar 14 03:14:40 2022 UTC

8 8
9 my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb"); 9 my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");
10 10
11=head1 DESCRIPTION 11=head1 DESCRIPTION
12 12
13This is a simple-purpose module that tries to do one job: find the nearest 13This is a single-purpose module that tries to do one job: find the nearest
14placename for a point on earth. It doesn't claim to do a perfect job, but 14placename for a point on earth. It doesn't claim to do a perfect job, but
15it tries to be simple to set up, simple to use and be fast. 15it tries to be simple to set up, simple to use and be fast. It doesn't
16attempt to provide many features or nifty algorithms, and is meant to be
17used in situations where you simply need a name for a coordinate without
18becoming a GIS expert first.
16 19
17=head2 BUILDING AND SETTING UP 20=head2 BUILDING, SETTING UP AND USAGE
18 21
19To build this module, you need tinycdb, a cdb implementation by Michael 22To build this module, you need tinycdb, a cdb implementation by Michael
20Tokarev, or a compatible library. On GNU/Debian-based systems you can get 23Tokarev, or a compatible library. On GNU/Debian-based systems you can get
21this by executing F<apt-get install libcdb-dev>. 24this by executing F<apt-get install libcdb-dev>.
22 25
27(L<https://www.geonames.org/export/>, note the license), for example, 30(L<https://www.geonames.org/export/>, note the license), for example,
28F<cities500.zip>, which lists all places with population 500 or more: 31F<cities500.zip>, which lists all places with population 500 or more:
29 32
30 wget https://download.geonames.org/export/dump/cities500.zip 33 wget https://download.geonames.org/export/dump/cities500.zip
31 unzip cities500.zip 34 unzip cities500.zip
32 geo-latlon2place-makedb --geonames-gazetteer cities500.txt ll2place.cdb 35 geo-latlon2place-makedb cities500.txt cities500.ll2p
33 36
34This will create a file F<ll2place.cdb> that you can use for lookups 37This will create a file F<cities500.ll2p> that you can use for lookups
35with this module. At the time of this writing, the F<cities500> database 38with this module. At the time of this writing, the F<cities500> database
36results in about a 10MB file while the F<allCountries> database results in 39results in about a 10MB file while the F<allCountries> database results in
37about 120MB. 40about 120MB.
38 41
42Lookups will return a string of the form C<placename, countrycode>.
43
44If you want to use the geonames postal code database (from
45L<https://www.geonames.org/zip/>), use these commands:
46
47 wget https://download.geonames.org/export/zip/allCountries.zip
48 unzip allCountries.zip
49 geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p
50
51You can then use the resulting database like this:
52
53 my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");
54
55 # and then do as many queries as you wish:
56 my $res = $lookup->lookup (49, 8.4);
57 if (defined $res) {
58 utf8::decode $res; # convert $res from utf-8 to unicode
59 print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
60 } else {
61 print "nothing found at 49, 8.4\n";
62 }
63
64=head1 THE Geo::LatLon2Place CLASS
65
39=over 4 66=over
40 67
41=cut 68=cut
42 69
43package Geo::LatLon2Place; 70package Geo::LatLon2Place;
44 71
48 75
49BEGIN { 76BEGIN {
50 our $VERSION = 0.01; 77 our $VERSION = 0.01;
51 78
52 require XSLoader; 79 require XSLoader;
53 XSLoader::load __PACKAGE__, $VERSION; 80 XSLoader::load (__PACKAGE__, $VERSION);
54 81
55 eval 'sub TORAD() { ' . ((atan2 1,0) / 180) . ' }'; 82 eval 'sub TORAD() { ' . ((atan2 1,0) / 180) . ' }';
56} 83}
84
85=item $lookup = Geo::LatLon2Place->new ($path)
86
87Opens a database created by F<geo-latlon2place-makedb> and returns an
88object that allows you to run queries against it.
89
90The database will be mmaped, so it will not be loaded into memory, but
91your operating system will cache it appropriately.
92
93=cut
57 94
58sub new { 95sub new {
59 my ($class, $path) = @_; 96 my ($class, $path) = @_;
60 97
61 open my $fh, "<", $path 98 open my $fh, "<", $path
80sub DESTROY { 117sub DESTROY {
81 my ($self) = @_; 118 my ($self) = @_;
82 119
83 cdb_free $self->[1]; 120 cdb_free $self->[1];
84} 121}
122
123=item $res = $lookup->lookup ($lat, $lon[, $radius])
124
125Looks up the point in the database that is "nearest" to C<$lat, $lon>,
126searching at least up to C<$radius> kilometres. The default for C<$radius> is
127the cell size the database is built with, and this usually works best, so
128you usually do not specify this parameter.
129
130If something is found, the associated data blob (always a binary string)
131is returned, otherwise you receive C<undef>.
132
133Unless you specify a custom format, the data blob is actually a UTF-8
134string, so you might want to call C<utf8::decode> on it to get a unicode
135string.
136
137At the moment, the implementation is in pure perl, but will eventually
138move to C.
139
140=cut
85 141
86sub lookup { 142sub lookup {
87 my ($self, $lat, $lon, $radius) = @_; 143 my ($self, $lat, $lon, $radius) = @_;
88 144
89 $radius ||= $self->[2]; 145 $radius ||= $self->[2];
115 } 171 }
116 172
117 $res 173 $res
118} 174}
119 175
176=back
177
178=head1 ALGORITHM
179
180The algorithm that this module implements consists of two parts: binning
181and weighting (done when writing the database) and then finding the
182nearest point.
183
184The first part bins all data points into a grid which has its minimum cell
185size at the equator and poles, with somewhat larger cells in between.
186
187The lookup part will then read the cell that the coordinate is in and some
188neighbouring cells (depending on the search radius, by default it will
189read the eight cells around it).
190
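The cell selection described above can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a uniform grid in degrees, whereas the actual database uses latitude-dependent cell sizes, and the names C<CELL_DEG>, C<cell_of> and C<cells_to_read> are hypothetical and not part of the module.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Assumed uniform cell size in degrees (hypothetical; the real grid
# varies its cell size with latitude).
use constant CELL_DEG => 0.25;

# Map a coordinate to its (row, column) grid cell.
sub cell_of {
   my ($lat, $lon) = @_;

   (floor (($lat + 90) / CELL_DEG), floor (($lon + 180) / CELL_DEG))
}

# Return the cell containing the coordinate plus its eight neighbours.
# Columns wrap around at +-180 degrees; rows are not clamped at the
# poles in this sketch.
sub cells_to_read {
   my ($lat, $lon) = @_;

   my ($y, $x) = cell_of ($lat, $lon);
   my $nx = int (360 / CELL_DEG + 0.5); # number of columns

   my @cells;
   for my $dy (-1 .. 1) {
      for my $dx (-1 .. 1) {
         push @cells, [$y + $dy, ($x + $dx) % $nx];
      }
   }

   @cells
}
```

A larger search radius would simply widen the C<-1 .. 1> ranges, which is why bigger radii make lookups read more cells.
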
191It will then calculate the (squared) distance to the search coordinate
192using an approximate euclidean distance on an equirectangular
193projection. The squared distance is multiplied with a weight (1..25 for
194the geonames database, based on population and administrative status,
195always 1 for postal codes), and the minimum distance wins.
196
197Binning should not introduce errors, but bigger bins can slow down lookup
198times due to having to look at more places. The lookup assumes a spherical
199shape for the earth, the equirectangular projection stretches distances
200unevenly and the euclidean distance calculation introduces further
201errors. For typical distances (<< 100km) and the intended usage, these
202errors should be considered negligible.
203
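The weighted distance comparison described in this section can be sketched like this. It is a minimal illustration and not the module's actual code: the earth radius constant, the candidate layout C<[lat, lon, weight, name]> and the function names C<weighted_dist2> and C<nearest> are assumptions made for the example.

```perl
use strict;
use warnings;

use constant TORAD           => atan2 (1, 1) * 4 / 180; # pi / 180
use constant EARTH_RADIUS_KM => 6371; # assumed mean radius, spherical earth

# Squared distance (km^2) on an equirectangular projection, multiplied
# by the weight of the candidate point.
sub weighted_dist2 {
   my ($lat, $lon, $plat, $plon, $weight) = @_;

   my $dy = ($plat - $lat) * TORAD;
   # shrink longitude differences by the cosine of the mean latitude
   my $dx = ($plon - $lon) * TORAD * cos (($lat + $plat) * 0.5 * TORAD);

   ($dx * $dx + $dy * $dy) * EARTH_RADIUS_KM ** 2 * $weight
}

# Return the name of the candidate with the smallest weighted distance,
# or undef if there are no candidates.
sub nearest {
   my ($lat, $lon, @candidates) = @_;

   my ($best, $best_d);
   for my $c (@candidates) {
      my $d = weighted_dist2 ($lat, $lon, @$c[0 .. 2]);
      ($best, $best_d) = ($c->[3], $d)
         if !defined $best_d || $d < $best_d;
   }

   $best
}
```

Because the weight multiplies the squared distance, a place with weight 1 that is three times as far away as a place with weight 25 still wins (9 versus 25), which is how larger or administratively more important places can be preferred over small nearby ones.
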
204=head1 SPEED
205
206The current implementation is written in pure perl, and on my machine,
207typically does 10000-200000 lookups per second. The goal for version 1.0
208is to move the lookup to C.
209
210=head1 TENTATIVE ROADMAP
211
212The database writer should be accessible via a module, so you can easily
213generate your own databases without having to run an external command.
214
215The api might be extended to allow for multiple returns, or nearest
216neighbour search.
217
218=head1 SEE ALSO
219
220L<geo-latlon2place-makedb> to create databases from common formats.
221
120=head1 AUTHOR 222=head1 AUTHOR
121 223
122 Marc Lehmann <schmorp@schmorp.de> 224 Marc Lehmann <schmorp@schmorp.de>
123 http://home.schmorp.de/ 225 http://home.schmorp.de/
124 226
