/cvs/Geo-LatLon2Place/README
Revision: 1.3
Committed: Thu Mar 17 00:32:54 2022 UTC (2 years, 2 months ago) by root
Branch: MAIN
CVS Tags: rel-1_0, rel-0_9, HEAD
Changes since 1.2: +25 -13 lines
Log Message:
0.9

NAME
    Geo::LatLon2Place - convert latitude and longitude to nearest place

SYNOPSIS
        use Geo::LatLon2Place;

        my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");

DESCRIPTION
    This is a single-purpose module that tries to do one job: find the
    nearest placename for a point on earth. It doesn't claim to do a perfect
    job, but it tries to be simple to set up, simple to use, and fast. It
    doesn't attempt to provide many features or nifty algorithms, and is
    meant to be used in situations where you simply need a name for a
    coordinate without becoming a GIS expert first.

BUILDING, SETTING UP AND USAGE
    To build this module, you need tinycdb, a cdb implementation by Michael
    Tokarev, or a compatible library. On Debian GNU/Linux-based systems you
    can get this by executing apt-get install libcdb-dev.

    After installing the module, you need to generate a database using the
    geo-latlon2place-makedb command.

    Currently, it accepts various databases from geonames
    (<https://www.geonames.org/export/>, note the license), for example,
    cities500.zip, which lists all places with population 500 or more:

        wget https://download.geonames.org/export/dump/cities500.zip
        unzip cities500.zip
        geo-latlon2place-makedb cities500.txt cities500.ll2p

    This will create a file cities500.ll2p that you can use for lookups with
    this module. At the time of this writing, the cities500 database results
    in a file of about 10MB, while the allCountries database results in
    about 120MB.

    Lookups will return a string of the form "placename, countrycode".
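
    If you need the two parts separately, the result string can be split
    again. The following is only a sketch: split_place is a hypothetical
    helper, not part of the module, and it assumes that the country code
    is a two-letter upper-case code following the last ", " in the string.

```perl
use strict;
use warnings;

# Split a lookup result of the form "placename, countrycode".
# Assumption: the country code is two upper-case letters after the LAST
# ", ", so place names that themselves contain commas stay intact.
sub split_place {
    my ($res) = @_;
    return unless defined $res;
    my ($place, $cc) = $res =~ /\A(.*), ([A-Z]{2})\z/s
        or return;
    return ($place, $cc);
}

my ($place, $cc) = split_place ("Karlsruhe, DE");
print "$place is in $cc\n";
```

    The helper returns an empty list when the string does not have the
    expected shape, so callers can test the result in list context.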
39    
40     If you want to use the geonames postal code database (from
41     <https://www.geonames.org/zip/>), use these commands:
42    
43     wget https://download.geonames.org/export/zip/allCountries.zip
44     unzip allCountries.zip
45     geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p
46    
47     You can then use the resulting database like this:
48    
49     my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");
50    
51     # and then do as many queries as you wish:
52     my $res = $lookup->(49, 8.4);
53     if (defined $res) {
54     utf8::decode $res; # convert $res from utf-8 to unicode
55     print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
56     } else {
57     print "nothing found at 49, 8.4\n";
58     }

THE Geo::LatLon2Place CLASS
    $lookup = Geo::LatLon2Place->new ($path)
        Opens a database created by geo-latlon2place-makedb and returns an
        object that allows you to run queries against it.

        The database will be mmapped, so it will not be loaded into memory,
        but your operating system will cache it appropriately.

    $res = $lookup->lookup ($lat, $lon[, $radius])
        Looks up the point in the database that is "nearest" to "$lat,
        $lon", searching at least up to $radius kilometres. The default for
        $radius is the cell size the database was built with, and this
        usually works best, so you usually do not need to specify this
        parameter.

        If something is found, the associated data blob (always a binary
        string) is returned, otherwise you receive "undef".

        Unless you specify a custom format/extractor when building your
        database, the data blob is actually a UTF-8 string, so you might
        want to call "utf8::decode" on it to get a unicode string:

            my $res = $db->lookup (47, 37); # near Mariupol, UA
            if (defined $res) {
                utf8::decode $res;
                # $res now contains the unicode result
            }
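
        To see what the decoding step changes, the following
        self-contained sketch uses a made-up byte string in place of a
        real lookup result and compares its length before and after
        "utf8::decode". It also checks the return value, which is false
        when the octets are not well-formed UTF-8.

```perl
use strict;
use warnings;

# A UTF-8 encoded byte string such as a lookup might return
# (hypothetical value, written as raw octets).
my $res = "Z\xc3\xbcrich, CH";

my $bytes = length $res;          # counts octets before decoding
if (utf8::decode ($res)) {
    # $res is now a character string
    my $chars = length $res;
    print "$bytes octets, $chars characters\n";
} else {
    # utf8::decode returns false for malformed UTF-8
    warn "result is not valid UTF-8\n";
}
```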

ALGORITHM
    The algorithm that this module implements consists of two parts: binning
    and weighting (done when writing the database) and then finding the
    nearest point.

    The first part bins all data points into a grid which has its minimum
    cell size at the equator and poles, with somewhat larger cells in
    between.

    The lookup part will then read the cell that the coordinate is in and
    some neighbouring cells (depending on the search radius, by default it
    will read the eight cells around it).
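
    The access pattern of "own cell plus eight neighbours" can be
    illustrated as follows. This is a sketch under simplifying
    assumptions, not the module's actual code: cells_to_read is a
    hypothetical function, and it uses a uniform grid measured in degrees
    instead of the latitude-dependent cell sizes described above.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical uniform grid: $cell degrees per cell in both directions.
# The real database uses latitude-dependent cell sizes; this only shows
# which cells a default lookup would touch.
sub cells_to_read {
    my ($lat, $lon, $cell) = @_;
    my $cols = int (360 / $cell);            # cells per row
    my $rows = int (180 / $cell);            # number of rows
    my $cy = floor (($lat +  90) / $cell);
    my $cx = floor (($lon + 180) / $cell);

    my @cells;
    for my $dy (-1 .. 1) {
        my $y = $cy + $dy;
        next if $y < 0 || $y >= $rows;       # no rows beyond the poles
        for my $dx (-1 .. 1) {
            my $x = ($cx + $dx) % $cols;     # wrap around the date line
            push @cells, [$x, $y];
        }
    }
    return @cells;
}

my @cells = cells_to_read (49, 8.4, 1);
printf "reading %d cells\n", scalar @cells;
```

    Away from the poles this yields nine cells. A larger search radius
    would widen the two loops accordingly.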

    It will then calculate the (squared) distance to the search coordinate
    using an approximate euclidean distance on an equirectangular
    projection. The squared distance is multiplied with a weight (1..25 for
    the geonames database, based on population and administrative status,
    always 1 for postal codes), and the minimum weighted distance wins.
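
    A minimal sketch of this selection step follows. The names dist2 and
    nearest, the candidate data and the use of the mean latitude for
    scaling longitudes are all assumptions made for illustration. It is
    also assumed that a smaller weight marks a more important place,
    which follows from the weight being a multiplier on the distance.

```perl
use strict;
use warnings;

my $DEG = atan2 (1, 1) / 45;   # radians per degree

# Squared distance on an equirectangular projection, in squared degrees:
# the longitude difference is scaled by the cosine of the mean latitude.
sub dist2 {
    my ($lat1, $lon1, $lat2, $lon2) = @_;
    my $dy = $lat2 - $lat1;
    my $dx = ($lon2 - $lon1) * cos (($lat1 + $lat2) / 2 * $DEG);
    return $dx * $dx + $dy * $dy;
}

# Candidates as [name, lat, lon, weight]; all values are made up.
my @candidates = (
    ["Hamlet",  49.01, 8.41, 25],
    ["Bigtown", 49.03, 8.43,  1],
);

# Return the candidate with the smallest weighted squared distance.
sub nearest {
    my ($lat, $lon, @places) = @_;
    my ($best, $best_score);
    for my $p (@places) {
        my $score = dist2 ($lat, $lon, $p->[1], $p->[2]) * $p->[3];
        ($best, $best_score) = ($p, $score)
            if !defined $best_score || $score < $best_score;
    }
    return $best;
}

my $hit = nearest (49, 8.4, @candidates);
print "$hit->[0]\n";
```

    With these made-up values, the more distant "Bigtown" wins because its
    weight of 1 outweighs the shorter distance to "Hamlet" multiplied by
    25. Comparing squared distances avoids a square root per candidate
    and does not change the ordering.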

    Binning should not introduce errors, but bigger bins can slow down
    lookup times due to having to look at more places. The lookup assumes a
    spherical shape for the earth, the equirectangular projection stretches
    distances unevenly, and the euclidean distance calculation introduces
    further errors. For typical distances (<< 100km) and the intended usage,
    these errors should be considered negligible.
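
    One way to get a feeling for the size of the approximation error is
    to compare the flat calculation against the great-circle distance on
    the same spherical model. The sketch below does this for one pair of
    made-up coordinates. The function names, the earth radius of 6371 km
    and the mean-latitude scaling are assumptions, not taken from the
    module.

```perl
use strict;
use warnings;

my $DEG = atan2 (1, 1) / 45;   # radians per degree
my $R   = 6371;                # assumed mean earth radius in km

# Great-circle distance on a sphere (haversine formula), in km.
sub haversine_km {
    my ($lat1, $lon1, $lat2, $lon2) = map $_ * $DEG, @_;
    my $s = sin (($lat2 - $lat1) / 2) ** 2
          + cos ($lat1) * cos ($lat2) * sin (($lon2 - $lon1) / 2) ** 2;
    return 2 * $R * atan2 (sqrt ($s), sqrt (1 - $s));
}

# Euclidean distance on an equirectangular projection, in km.
sub equirect_km {
    my ($lat1, $lon1, $lat2, $lon2) = map $_ * $DEG, @_;
    my $dx = ($lon2 - $lon1) * cos (($lat1 + $lat2) / 2);
    my $dy = $lat2 - $lat1;
    return $R * sqrt ($dx * $dx + $dy * $dy);
}

my @pair = (49, 8.4, 49.3, 8.9);   # two points roughly 50 km apart
my $h = haversine_km (@pair);
my $e = equirect_km  (@pair);
printf "haversine %.3f km, equirectangular %.3f km, relative error %.2e\n",
       $h, $e, abs ($h - $e) / $h;
```

    Both functions share the spherical assumption, so the comparison only
    isolates the error of the flat projection, not that of ignoring the
    earth's ellipsoidal shape.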

SPEED
    On my machine, "lookup" typically does more than a million lookups per
    second - performance varies depending on result density and number of
    indexed points.

TENTATIVE ROADMAP
    The database writer should be accessible via a module, so you can easily
    generate your own databases without having to run an external command.

    The API might be extended to allow for multiple lookups, multiple
    returns, or nearest neighbour search, or more return values (distance,
    coordinates).

    Longer lookups will take advantage of perlmulticore.

PERL MULTICORE SUPPORT
    This is not yet implemented:

    This module supports the perl multicore specification
    (<http://perlmulticore.schmorp.de/>) when doing lookups.

SEE ALSO
    geo-latlon2place-makedb to create databases from common formats.

AUTHOR
    Marc Lehmann <schmorp@schmorp.de>
    http://home.schmorp.de/