NAME
Geo::LatLon2Place - convert latitude and longitude to nearest place

SYNOPSIS
    use Geo::LatLon2Place;

    my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");

DESCRIPTION
This is a single-purpose module that tries to do one job: find the
nearest placename for a point on earth. It doesn't claim to do a perfect
job, but it tries to be simple to set up, simple to use and fast. It
doesn't attempt to provide many features or nifty algorithms, and is
meant to be used in situations where you simply need a name for a
coordinate without becoming a GIS expert first.

BUILDING, SETTING UP AND USAGE
To build this module, you need tinycdb, a cdb implementation by Michael
Tokarev, or a compatible library. On Debian-based GNU/Linux systems you
can get this by executing "apt-get install libcdb-dev".

After installing the module, you need to generate a database using the
geo-latlon2place-makedb command.

Currently, it accepts various databases from geonames
(<https://www.geonames.org/export/>, note the license), for example,
cities500.zip, which lists all places with a population of 500 or more:

    wget https://download.geonames.org/export/dump/cities500.zip
    unzip cities500.zip
    geo-latlon2place-makedb cities500.txt cities500.ll2p

This will create a file cities500.ll2p that you can use for lookups with
this module. At the time of this writing, the cities500 database results
in a file of about 10MB, while the allCountries database results in
about 120MB.

Lookups will return a string of the form "placename, countrycode".

If you want to use the geonames postal code database (from
<https://www.geonames.org/zip/>), use these commands:

    wget https://download.geonames.org/export/zip/allCountries.zip
    unzip allCountries.zip
    geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p

You can then use the resulting database like this:

    my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");

    # and then do as many queries as you wish:
    my $res = $lookup->lookup (49, 8.4);
    if (defined $res) {
       utf8::decode $res; # convert $res from utf-8 to unicode
       print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
    } else {
       print "nothing found at 49, 8.4\n";
    }

THE Geo::LatLon2Place CLASS
$lookup = Geo::LatLon2Place->new ($path)
    Opens a database created by geo-latlon2place-makedb and returns an
    object that allows you to run queries against it.

    The database will be mmapped, so it will not be loaded into memory
    as a whole, but your operating system will cache it appropriately.

$res = $lookup->lookup ($lat, $lon[, $radius])
    Looks up the point in the database that is "nearest" to "$lat,
    $lon", searching at least up to $radius kilometres. The default for
    $radius is the cell size the database was built with, and this
    usually works best, so you usually do not specify this parameter.

    If something is found, the associated data blob (always a binary
    string) is returned, otherwise you receive "undef".

    Unless you specify a custom format/extractor when building your
    database, the data blob is actually a UTF-8 string, so you might
    want to call "utf8::decode" on it to get a unicode string:

        my $res = $lookup->lookup (47, 37); # near Mariupol, UA
        if (defined $res) {
           utf8::decode $res;
           # $res now contains the unicode result
        }

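When the default radius finds nothing (for example, far out at sea), one option is to retry with larger radii. The following is a minimal sketch that uses only the documented "new" and "lookup" methods. The helper name "lookup_widening" and the radii in the usage line are made up for illustration and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Geo::LatLon2Place): try the default
# radius first, then retry with each radius in @radii (kilometres).
sub lookup_widening {
   my ($db, $lat, $lon, @radii) = @_;

   # Omitting the radius uses the cell size the database was built with.
   my $res = $db->lookup ($lat, $lon);

   for my $radius (@radii) {
      last if defined $res;
      $res = $db->lookup ($lat, $lon, $radius);
   }

   return undef unless defined $res;

   utf8::decode ($res); # assumes the default UTF-8 data format
   return $res;
}
```

It could be called as "lookup_widening ($lookup, 49, 8.4, 50, 200)". Larger radii mean more cells to scan, so it is probably best to keep the list short.
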
ALGORITHM
The algorithm that this module implements consists of two parts: binning
and weighting (done when writing the database) and then finding the
nearest point.

The first part bins all data points into a grid which has its minimum
cell size at the equator and poles, with somewhat larger cells in
between.

The lookup part will then read the cell that the coordinate is in and
some neighbouring cells (depending on the search radius; by default it
will read the eight cells around it).

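The idea of reading a cell plus its eight neighbours can be sketched as follows. This is a deliberately simplified illustration and not the module's actual grid: it assumes a uniform grid in degrees, whereas the real cell size varies with latitude. The cell size and the function names are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $CELL_DEG = 0.5;                   # hypothetical cell size in degrees
my $COLS     = int (360 / $CELL_DEG); # cells around the globe
my $ROWS     = int (180 / $CELL_DEG); # cells from pole to pole

# Map a coordinate to integer cell indices (column, row).
sub cell_of {
   my ($lat, $lon) = @_;
   my $col = floor (($lon + 180) / $CELL_DEG) % $COLS;
   my $row = floor (($lat +  90) / $CELL_DEG);
   $row = $ROWS - 1 if $row >= $ROWS; # latitude 90 belongs to the top row
   return ($col, $row);
}

# Return the cell containing the point plus its neighbours, wrapping
# columns around the antimeridian and dropping rows beyond the poles.
sub cells_to_scan {
   my ($lat, $lon) = @_;
   my ($cx, $cy) = cell_of ($lat, $lon);

   my @cells;
   for my $dy (-1 .. 1) {
      my $y = $cy + $dy;
      next if $y < 0 || $y >= $ROWS;
      push @cells, [($cx + $_) % $COLS, $y] for -1 .. 1;
   }
   return @cells;
}
```

Away from the poles this yields nine cells per lookup, which is why bigger cells mean more candidate places to compare.
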
It will then calculate the (squared) distance to the search coordinate
using an approximate euclidean distance on an equirectangular
projection. The squared distance is multiplied with a weight (1..25 for
the geonames database, based on population and administrative status,
always 1 for postal codes), and the minimum weighted distance wins.

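To make the weighting concrete, here is a small sketch of picking a winner by weighted squared distance. It illustrates the idea described above and is not taken from the module's source: the cos(latitude) scaling, the candidate layout and the function names are assumptions, and longitude wrap-around is ignored for brevity.

```perl
use strict;
use warnings;
use Math::Trig qw(deg2rad);

# Squared equirectangular distance in squared degrees. Longitude
# differences are scaled by cos(latitude) so that east-west and
# north-south degrees become comparable.
sub dist2 {
   my ($lat1, $lon1, $lat2, $lon2) = @_;
   my $dy = $lat2 - $lat1;
   my $dx = ($lon2 - $lon1) * cos (deg2rad (($lat1 + $lat2) / 2));
   return $dx * $dx + $dy * $dy;
}

# Return the name of the candidate with the smallest weighted squared
# distance. Each candidate is [$lat, $lon, $weight, $name] (a layout
# invented for this sketch); a larger weight makes a candidate lose
# against an equally distant one.
sub nearest {
   my ($lat, $lon, @candidates) = @_;
   my ($best, $best_score);
   for my $c (@candidates) {
      my $score = dist2 ($lat, $lon, $c->[0], $c->[1]) * $c->[2];
      ($best, $best_score) = ($c, $score)
         if !defined $best_score || $score < $best_score;
   }
   return $best ? $best->[3] : undef;
}
```

Because only the ordering matters, the square root can be skipped and squared distances compared directly.
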
Binning should not introduce errors, but bigger bins can slow down
lookup times due to having to look at more places. The lookup assumes a
spherical shape for the earth, the equirectangular projection stretches
distances unevenly and the euclidean distance calculation introduces
further errors. For typical distances (<< 100km) and the intended usage,
these errors should be considered negligible.

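One way to get a feeling for the size of these errors is to compare an equirectangular estimate with the great-circle distance on the same sphere. The sketch below does this with Math::Trig from the Perl core. The earth radius, the function names and the particular form of the approximation (scaling by the cosine of the mean latitude) are assumptions for illustration and may differ from what the module does internally.

```perl
use strict;
use warnings;
use Math::Trig qw(deg2rad great_circle_distance);

my $EARTH_RADIUS_KM = 6371; # mean radius, spherical earth assumed

# Euclidean distance on an equirectangular projection, in kilometres.
sub equirect_km {
   my ($lat1, $lon1, $lat2, $lon2) = @_;
   my $dy = deg2rad ($lat2 - $lat1);
   my $dx = deg2rad ($lon2 - $lon1) * cos (deg2rad (($lat1 + $lat2) / 2));
   return $EARTH_RADIUS_KM * sqrt ($dx * $dx + $dy * $dy);
}

# Great-circle distance on the same sphere, in kilometres.
sub great_circle_km {
   my ($lat1, $lon1, $lat2, $lon2) = @_;
   # Math::Trig measures phi from the north pole, hence 90 - latitude.
   return great_circle_distance (
      deg2rad ($lon1), deg2rad (90 - $lat1),
      deg2rad ($lon2), deg2rad (90 - $lat2),
      $EARTH_RADIUS_KM,
   );
}
```

Printing both values for two nearby points, for example with "printf "%.3f %.3f\n", equirect_km (49, 8.4, 49.3, 8.9), great_circle_km (49, 8.4, 49.3, 8.9)", shows how closely the two agree at these distances.
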
SPEED
On my machine, "lookup" typically does more than a million lookups per
second; performance varies depending on result density and number of
indexed points.

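To check the rate on your own machine and database, a rough measurement along the following lines may be enough. This is only a sketch: the helper name is hypothetical, and uniformly random coordinates mostly fall into oceans, so the result will probably not match the rate you see for realistic queries.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: run $count lookups at pseudo-random coordinates
# and return the observed rate in lookups per second.
sub lookups_per_second {
   my ($db, $count) = @_;

   # Generate the coordinates up front so only the lookups are timed.
   my @points = map { [rand (180) - 90, rand (360) - 180] } 1 .. $count;

   my $start = Time::HiRes::time ();
   $db->lookup (@$_) for @points;
   my $elapsed = Time::HiRes::time () - $start;

   return $elapsed > 0 ? $count / $elapsed : undef;
}
```

It could be called as "lookups_per_second ($lookup, 1_000_000)" with an object returned by "Geo::LatLon2Place->new".
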
TENTATIVE ROADMAP
The database writer should be accessible via a module, so you can easily
generate your own databases without having to run an external command.

The API might be extended to allow for multiple lookups, multiple
returns, or nearest neighbour search, or more return values (distance,
coordinates).

Longer lookups will take advantage of perlmulticore.

PERL MULTICORE SUPPORT
This is not yet implemented:

This module supports the perl multicore specification
(<http://perlmulticore.schmorp.de/>) when doing lookups.

SEE ALSO
geo-latlon2place-makedb to create databases from common formats.

AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/