NAME
Geo::LatLon2Place - convert latitude and longitude to nearest place

SYNOPSIS
    use Geo::LatLon2Place;

    my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");

DESCRIPTION
This is a single-purpose module that tries to do one job: find the
nearest placename for a point on earth. It doesn't claim to do a perfect
job, but it tries to be simple to set up, simple to use and fast. It
doesn't attempt to provide many features or nifty algorithms, and is
meant to be used in situations where you simply need a name for a
coordinate without becoming a GIS expert first.

BUILDING, SETTING UP AND USAGE
To build this module, you need tinycdb, a cdb implementation by Michael
Tokarev, or a compatible library. On Debian GNU/Linux-based systems you
can get this by executing apt-get install libcdb-dev.

After installing the module, you need to generate a database using the
geo-latlon2place-makedb command.

Currently, it accepts various databases from geonames
(<https://www.geonames.org/export/>, note the license), for example,
cities500.zip, which lists all places with population 500 or more:

    wget https://download.geonames.org/export/dump/cities500.zip
    unzip cities500.zip
    geo-latlon2place-makedb cities500.txt cities500.ll2p

This will create a file cities500.ll2p that you can use for lookups with
this module. At the time of this writing, the cities500 database results
in a file of about 10MB, while the allCountries database results in about
120MB.

Lookups will return a string of the form "placename, countrycode".
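
As a rough sketch of how such a result might be taken apart, the
following decodes the UTF-8 octets and splits them at the last comma. The
helper name split_place and the sample octets are hypothetical and not
part of this module, and the split assumes that the country code itself
never contains a comma.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Geo::LatLon2Place: splits a lookup
# result of the form "placename, countrycode" into its two parts.
sub split_place {
   my ($res) = @_;

   return unless defined $res;

   utf8::decode ($res); # convert from UTF-8 octets to a unicode string

   # split at the last ", " so place names containing commas survive
   my ($place, $cc) = $res =~ /^(.*), ([^,]*)$/s
      or return;

   ($place, $cc)
}

# sample octets in the documented form ("Zürich, CH" encoded as UTF-8)
my ($place, $cc) = split_place ("Z\xc3\xbcrich, CH");
```

Splitting at the last comma rather than the first keeps place names that
themselves contain a comma in one piece.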

If you want to use the geonames postal code database (from
<https://www.geonames.org/zip/>), use these commands:

    wget https://download.geonames.org/export/zip/allCountries.zip
    unzip allCountries.zip
    geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p

You can then use the resulting database like this:

    my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");

    # and then do as many queries as you wish:
    my $res = $lookup->lookup (49, 8.4);
    if (defined $res) {
       utf8::decode $res; # convert $res from utf-8 to unicode
       print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
    } else {
       print "nothing found at 49, 8.4\n";
    }

THE Geo::LatLon2Place CLASS
$lookup = Geo::LatLon2Place->new ($path)
    Opens a database created by geo-latlon2place-makedb and returns an
    object that allows you to run queries against it.

    The database will be mmapped, so it will not be loaded into memory,
    but your operating system will cache it appropriately.

$res = $lookup->lookup ($lat, $lon[, $radius])
    Looks up the point in the database that is "nearest" to "$lat,
    $lon", searching at least up to $radius kilometres. The default for
    $radius is the cell size the database is built with, and this
    usually works best, so you usually do not specify this parameter.

    If something is found, the associated data blob (always a binary
    string) is returned, otherwise you receive "undef".

    Unless you specify a custom format, the data blob is actually a
    UTF-8 string, so you might want to call "utf8::decode" on it to get
    a unicode string.

    At the moment, the implementation is in pure Perl, but will
    eventually move to C.

ALGORITHM
The algorithm that this module implements consists of two parts: binning
and weighting (done when writing the database) and then finding the
nearest point.

The first part bins all data points into a grid which has its minimum
cell size at the equator and poles, with somewhat larger cells in
between.

The lookup part will then read the cell that the coordinate is in and
some neighbouring cells (depending on the search radius, by default it
will read the eight cells around it).
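
The following is only an illustration of the idea of reading a cell and
its eight neighbours, not the module's actual code. It assumes a uniform
grid in degrees (the real grid varies its cell size with latitude), and
the names $CELLS_PER_DEG, cell_of and cells_to_read are made up for this
sketch.

```perl
use strict;
use warnings;
use POSIX ();

# Assumption: a uniform grid in degrees. The real database uses cells
# whose size varies with latitude, which this sketch ignores.
my $CELLS_PER_DEG = 10;                   # hypothetical resolution
my $NX = 360 * $CELLS_PER_DEG;            # columns around the globe
my $NY = 180 * $CELLS_PER_DEG;            # rows from pole to pole

# maps a coordinate to the (column, row) index of its cell
sub cell_of {
   my ($lat, $lon) = @_;

   my $x = POSIX::floor (($lon + 180) * $CELLS_PER_DEG) % $NX;
   my $y = POSIX::floor (($lat +  90) * $CELLS_PER_DEG);
   $y = $NY - 1 if $y >= $NY;             # lat == 90 belongs to the top row

   ($x, $y)
}

# returns [column, row] pairs for the cell containing the point and
# for the cells directly around it
sub cells_to_read {
   my ($lat, $lon) = @_;

   my ($cx, $cy) = cell_of ($lat, $lon);
   my @cells;

   for my $dy (-1, 0, 1) {
      my $y = $cy + $dy;
      next if $y < 0 || $y >= $NY;        # no rows beyond the poles

      for my $dx (-1, 0, 1) {
         push @cells, [($cx + $dx) % $NX, $y]; # wrap around in longitude
      }
   }

   @cells
}

my @cells = cells_to_read (49.05, 8.45);
```

Away from the poles this yields nine cells. A larger search radius would
simply widen the ranges of the two loops.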

It will then calculate the (squared) distance to the search coordinate
using an approximate euclidean distance on an equirectangular
projection. The squared distance is multiplied with a weight (1..25 for
the geonames database, based on population and administrative status,
always 1 for postal codes), and the minimum weighted distance wins.
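
A minimal sketch of this selection step, under stated assumptions:
longitude differences are scaled by the cosine of the mean latitude
(one common way to do an equirectangular approximation, not necessarily
what the module does), the earth radius is taken as 6371km, and the two
candidate places with their coordinates and weights are invented for the
example. The function names dist2 and nearest are hypothetical.

```perl
use strict;
use warnings;

my $RAD             = atan2 (1, 1) / 45;  # pi / 180, degrees to radians
my $EARTH_RADIUS_KM = 6371;               # assumed mean radius of a spherical earth

# squared distance in km^2 on an equirectangular projection
sub dist2 {
   my ($lat1, $lon1, $lat2, $lon2) = @_;

   my $dy = ($lat2 - $lat1) * $RAD;
   my $dx = ($lon2 - $lon1) * $RAD * cos (($lat1 + $lat2) * 0.5 * $RAD);

   ($dx * $dx + $dy * $dy) * $EARTH_RADIUS_KM ** 2
}

# picks the candidate with the smallest weight * squared distance;
# each candidate is [lat, lon, weight, name]
sub nearest {
   my ($lat, $lon, @cand) = @_;

   my ($best, $best_score);

   for my $c (@cand) {
      my $score = $c->[2] * dist2 ($lat, $lon, $c->[0], $c->[1]);

      ($best, $best_score) = ($c, $score)
         if !defined $best_score || $score < $best_score;
   }

   $best
}

# invented candidates: a low-weight place a bit further away and a
# high-weight place that is geometrically closer
my @cand = (
   [49.0069, 8.4037,  1, "Bigtown"],
   [49.0020, 8.4020, 25, "Smallville"],
);

my $hit = nearest (49.0, 8.4, @cand);
print "nearest: $hit->[3]\n";
```

Because the weight multiplies the squared distance, a place with a small
weight can win over a geometrically closer place with a large weight, as
happens with the two invented candidates here.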

Binning should not introduce errors, but bigger bins can slow down
lookup times due to having to look at more places. The lookup assumes a
spherical shape for the earth, the equirectangular projection stretches
distances unevenly and the euclidean distance calculation introduces
further errors. For typical distances (<< 100km) and the intended usage,
these errors should be considered negligible.
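
To get a feeling for the size of the projection error, one could compare
the equirectangular approximation against the great-circle (haversine)
distance on the same sphere. This is a sketch and not a measurement of
the module itself: the formulas, the 6371km radius, the sample
coordinates and the names equirect_km and haversine_km are all
assumptions made for the illustration.

```perl
use strict;
use warnings;

my $RAD  = atan2 (1, 1) / 45;  # pi / 180, degrees to radians
my $R_KM = 6371;               # assumed mean radius of a spherical earth

# euclidean distance on an equirectangular projection, with longitude
# scaled by the cosine of the mean latitude
sub equirect_km {
   my ($lat1, $lon1, $lat2, $lon2) = @_;

   my $dy = ($lat2 - $lat1) * $RAD;
   my $dx = ($lon2 - $lon1) * $RAD * cos (($lat1 + $lat2) / 2 * $RAD);

   $R_KM * sqrt ($dx * $dx + $dy * $dy)
}

# great-circle distance on a sphere (haversine formula)
sub haversine_km {
   my ($lat1, $lon1, $lat2, $lon2) = @_;

   my $s1 = sin (($lat2 - $lat1) * $RAD / 2);
   my $s2 = sin (($lon2 - $lon1) * $RAD / 2);
   my $h  = $s1 * $s1 + cos ($lat1 * $RAD) * cos ($lat2 * $RAD) * $s2 * $s2;

   2 * $R_KM * atan2 (sqrt ($h), sqrt (1 - $h))
}

# two sample points a few kilometres apart
my $approx = equirect_km  (49, 8.4, 49.05, 8.5);
my $exact  = haversine_km (49, 8.4, 49.05, 8.5);
my $rel    = abs ($approx - $exact) / $exact;

printf "approx %.4f km, great circle %.4f km, relative error %.2e\n",
       $approx, $exact, $rel;
```

Note that this only compares two spherical models, so it says nothing
about the error from treating the earth as a sphere in the first place.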

SPEED
The current implementation is written in pure Perl, and on my machine,
typically does 10000-200000 lookups per second. The goal for version 1.0
is to move the lookup to C.

TENTATIVE ROADMAP
The database writer should be accessible via a module, so you can easily
generate your own databases without having to run an external command.

The API might be extended to allow for multiple returns, or nearest
neighbour search.

SEE ALSO
geo-latlon2place-makedb to create databases from common formats.

AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/