NAME
Geo::LatLon2Place - convert latitude and longitude to nearest place

SYNOPSIS
    use Geo::LatLon2Place;

    my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");

DESCRIPTION
This is a single-purpose module that tries to do one job: find the
nearest placename for a point on earth. It doesn't claim to do a perfect
job, but it tries to be simple to set up, simple to use and fast. It
doesn't attempt to provide many features or nifty algorithms, and is
meant to be used in situations where you simply need a name for a
coordinate without becoming a GIS expert first.

BUILDING, SETTING UP AND USAGE
To build this module, you need tinycdb, a cdb implementation by Michael
Tokarev, or a compatible library. On Debian GNU/Linux-based systems you
can get this by executing apt-get install libcdb-dev.

After installing the module, you need to generate a database using the
geo-latlon2place-makedb command.

Currently, it accepts various databases from geonames
(<https://www.geonames.org/export/>, note the license), for example,
cities500.zip, which lists all places with population 500 or more:

    wget https://download.geonames.org/export/dump/cities500.zip
    unzip cities500.zip
    geo-latlon2place-makedb cities500.txt cities500.ll2p

This will create a file cities500.ll2p that you can use for lookups with
this module. At the time of this writing, the cities500 database results
in about a 10MB file while the allCountries database results in about
120MB.

Lookups will return a string of the form "placename, countrycode".

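If you need the place name and the country code separately, the result
string can be split again. The following is a minimal sketch, assuming
the default "placename, countrycode" format and an already decoded
string; split_place is a hypothetical helper and not part of this
module. It splits on the last ", " because place names may themselves
contain commas.

```perl
use strict;
use warnings;

# Hypothetical helper: split "placename, countrycode" into its two parts.
# Returns the empty list if the string is undef or has no ", " separator.
sub split_place {
    my ($res) = @_;
    return unless defined $res;

    # greedy .* makes the match split at the LAST ", "
    my ($name, $cc) = $res =~ /\A(.*), ([^,]+)\z/s
        or return;

    return ($name, $cc);
}

my ($name, $cc) = split_place ("Karlsruhe, DE");
print "$name is in $cc\n";
```
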
If you want to use the geonames postal code database (from
<https://www.geonames.org/zip/>), use these commands:

    wget https://download.geonames.org/export/zip/allCountries.zip
    unzip allCountries.zip
    geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p

You can then use the resulting database like this:

    my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");

    # and then do as many queries as you wish:
    my $res = $lookup->lookup (49, 8.4);
    if (defined $res) {
        utf8::decode $res; # convert $res from utf-8 to unicode
        print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
    } else {
        print "nothing found at 49, 8.4\n";
    }

THE Geo::LatLon2Place CLASS
$lookup = Geo::LatLon2Place->new ($path)
    Opens a database created by geo-latlon2place-makedb and returns an
    object that allows you to run queries against it.

    The database will be mmapped, so it will not be loaded into memory,
    but your operating system will cache it appropriately.

$res = $lookup->lookup ($lat, $lon[, $radius])
    Looks up the point in the database that is "nearest" to "$lat,
    $lon", searching at least up to $radius kilometres. The default for
    $radius is the cell size the database is built with, and this
    usually works best, so you usually do not specify this parameter.

    If something is found, the associated data blob (always a binary
    string) is returned, otherwise you receive "undef".

    Unless you specify a custom format/extractor when building your
    database, the data blob is actually a UTF-8 string, so you might
    want to call "utf8::decode" on it to get a unicode string:

        my $res = $lookup->lookup (47, 37); # near Mariupol, UA
        if (defined $res) {
            utf8::decode $res;
            # $res now contains the unicode result
        }

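The lookup-then-decode pattern can be wrapped in a small function. This
is only a sketch: place_or_default and the My::FakeLookup stand-in class
are hypothetical names invented for this example, and the stand-in
exists solely so the code runs without a database. With a real database
you would pass the object returned by Geo::LatLon2Place->new instead.

```perl
use strict;
use warnings;

# Stand-in for a Geo::LatLon2Place object (an assumption for this
# example): it "finds" a UTF-8 encoded place for northern latitudes only.
package My::FakeLookup {
    sub new    { bless {}, shift }
    sub lookup { $_[1] > 0 ? "M\xc3\xbcnchen, DE" : undef }
}

# Hypothetical helper: return the decoded place string, or $default when
# nothing is found. $radius is optional and only passed on when given.
sub place_or_default {
    my ($lookup, $lat, $lon, $default, $radius) = @_;

    my $res = defined $radius
        ? $lookup->lookup ($lat, $lon, $radius)
        : $lookup->lookup ($lat, $lon);

    return $default unless defined $res;

    utf8::decode ($res); # only appropriate for the default UTF-8 blobs
    return $res;
}

binmode STDOUT, ":encoding(UTF-8)";

my $fake  = My::FakeLookup->new;
my $north = place_or_default ($fake,  48.1, 11.6, "unknown");
my $south = place_or_default ($fake, -48.1, 11.6, "unknown");
print "$north\n$south\n";
```

If you build your database with a custom extractor, the blob may not be
UTF-8, in which case the decode step should be dropped.
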
ALGORITHM
The algorithm that this module implements consists of two parts: binning
and weighting (done when writing the database) and then finding the
nearest point.

The first part bins all data points into a grid which has its minimum
cell size at the equator and poles, with somewhat larger cells in
between.

The lookup part will then read the cell that the coordinate is in and
some neighbouring cells (depending on the search radius, by default it
will read the eight cells around it).

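To illustrate the idea of reading a cell and its eight neighbours, here
is a simplified sketch. It assumes a uniform grid of $cell degrees per
cell, which is NOT the grid this module uses (the real cell size varies
with latitude, and the on-disk layout is not described here). The
function names cell_of and cells_to_read are made up for this example.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

# ASSUMPTION: uniform grid, $cell degrees per cell in both directions.
# Returns (row, column, number of rows, number of columns).
sub cell_of {
    my ($lat, $lon, $cell) = @_;

    my $rows = ceil (180 / $cell);
    my $cols = ceil (360 / $cell);

    my $row = floor (($lat + 90) / $cell);
    $row = $rows - 1 if $row >= $rows;              # lat == 90 stays in the top row

    my $col = floor (($lon + 180) / $cell) % $cols; # lon == 180 wraps to column 0

    return ($row, $col, $rows, $cols);
}

# The cell containing the point plus its neighbours: longitude wraps
# around at the date line, rows beyond the poles are dropped.
sub cells_to_read {
    my ($lat, $lon, $cell) = @_;
    my ($row, $col, $rows, $cols) = cell_of ($lat, $lon, $cell);

    my @cells;
    for my $dr (-1 .. 1) {
        my $r = $row + $dr;
        next if $r < 0 || $r >= $rows;
        push @cells, [$r, ($col + $_) % $cols] for -1 .. 1;
    }

    return @cells;
}

my @cells = cells_to_read (49, 8.4, 1);
printf "reading %d cells\n", scalar @cells;
```

A larger search radius would extend the loops beyond -1 .. 1, which is
why bigger radii cost more time per lookup.
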
It will then calculate the (squared) distance to the search coordinate
using an approximate euclidean distance on an equirectangular
projection. The squared distance is multiplied with a weight (1..25 for
the geonames database, based on population and administrative status,
always 1 for postal codes), and the minimum distance wins.

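The weighted comparison described above can be sketched as follows. This
is an illustration only and not the module's actual code: the earth
radius of 6371 km, the function names dist2_km and nearest, and the
candidate places with their weights are all assumptions made for this
example.

```perl
use strict;
use warnings;

use constant DEG      => atan2 (1, 1) / 45; # radians per degree (pi / 180)
use constant EARTH_KM => 6371;              # assumed mean radius, spherical earth

# Squared approximate distance in km^2 on an equirectangular projection:
# the longitude difference is scaled by the cosine of the mean latitude.
sub dist2_km {
    my ($lat1, $lon1, $lat2, $lon2) = @_;

    my $dy = ($lat2 - $lat1) * DEG;
    my $dx = ($lon2 - $lon1) * DEG * cos (($lat1 + $lat2) / 2 * DEG);

    return EARTH_KM ** 2 * ($dx * $dx + $dy * $dy);
}

# Return the candidate with the smallest weight * squared distance.
# Each candidate is [$lat, $lon, $weight, $name].
sub nearest {
    my ($lat, $lon, @candidates) = @_;

    my ($best, $best_d);
    for my $c (@candidates) {
        my $d = $c->[2] * dist2_km ($lat, $lon, $c->[0], $c->[1]);
        ($best, $best_d) = ($c, $d)
            if !defined $best_d || $d < $best_d;
    }

    return $best;
}

# hypothetical candidates: the village is closer, but its weight is higher
my @candidates = (
    [49.010, 8.40,  1, "Big town"],
    [49.003, 8.40, 25, "Small village"],
);

my $hit = nearest (49, 8.4, @candidates);
print "nearest: $hit->[3]\n";
```

In this example the village is geometrically closer, but its larger
weight inflates the squared distance, so the town is chosen. With equal
weights the village would win.
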
Binning should not introduce errors, but bigger bins can slow down
lookup times due to having to look at more places. The lookup assumes a
spherical shape for the earth, the equirectangular projection stretches
distances unevenly and the euclidean distance calculation introduces
further errors. For typical distances (<< 100km) and the intended usage,
these errors should be considered negligible.

SPEED
On my machine, "lookup" typically does more than a million lookups per
second - performance varies depending on result density and number of
indexed points.

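To get a rough figure for your own machine and database, something like
the following sketch could be used. The function lookups_per_second is a
hypothetical helper, and the workload shown is a stand-in so the code
runs without a database. To measure real lookups, pass
sub { $lookup->lookup (@_) } instead.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Call $code->($lat, $lon) for $n pseudo-random coordinates and return
# the achieved number of calls per second.
sub lookups_per_second {
    my ($code, $n) = @_;

    srand 42; # fixed seed, so repeated runs use the same points
    my @points = map { [rand (180) - 90, rand (360) - 180] } 1 .. $n;

    my $start = time;
    $code->(@$_) for @points;
    my $elapsed = time - $start;

    return $n / ($elapsed || 1e-9); # guard against a zero timer difference
}

# stand-in workload (an assumption, NOT a real lookup)
my $rate = lookups_per_second (sub { $_[0] + $_[1] }, 100_000);
printf "%.0f calls per second\n", $rate;
```

Uniformly random coordinates mostly hit oceans, so a realistic benchmark
should use points that resemble your actual queries.
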
TENTATIVE ROADMAP
The database writer should be accessible via a module, so you can easily
generate your own databases without having to run an external command.

The API might be extended to allow for multiple lookups, multiple
returns, or nearest neighbour search, or more return values (distance,
coordinates).

Longer lookups will take advantage of perlmulticore.

PERL MULTICORE SUPPORT
This is not yet implemented:

This module supports the perl multicore specification
(<http://perlmulticore.schmorp.de/>) when doing lookups.

SEE ALSO
geo-latlon2place-makedb to create databases from common formats.

AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/