/cvs/Geo-LatLon2Place/README
Revision: 1.2
Committed: Mon Mar 14 03:14:41 2022 UTC (2 years, 2 months ago) by root
Branch: MAIN
CVS Tags: rel-0_01
Changes since 1.1: +118 -177 lines
Log Message:
0.01

NAME
Geo::LatLon2Place - convert latitude and longitude to nearest place

SYNOPSIS
    use Geo::LatLon2Place;

    my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");

DESCRIPTION
This is a single-purpose module that tries to do one job: find the
nearest placename for a point on earth. It doesn't claim to do a perfect
job, but it tries to be simple to set up, simple to use and fast. It
doesn't attempt to provide many features or nifty algorithms, and is
meant to be used in situations where you simply need a name for a
coordinate without becoming a GIS expert first.

BUILDING, SETTING UP AND USAGE
To build this module, you need tinycdb, a cdb implementation by Michael
Tokarev, or a compatible library. On Debian-based systems you can
get this by executing apt-get install libcdb-dev.

After installing the module, you need to generate a database using the
geo-latlon2place-makedb command.

Currently, it accepts various databases from geonames
(<https://www.geonames.org/export/>, note the license), for example,
cities500.zip, which lists all places with population 500 or more:

    wget https://download.geonames.org/export/dump/cities500.zip
    unzip cities500.zip
    geo-latlon2place-makedb cities500.txt cities500.ll2p

This will create a file cities500.ll2p that you can use for lookups with
this module. At the time of this writing, the cities500 database results
in a file of about 10MB, while the allCountries database results in about
120MB.

Lookups will return a string of the form "placename, countrycode".

If you want to use the geonames postal code database (from
<https://www.geonames.org/zip/>), use these commands:

    wget https://download.geonames.org/export/zip/allCountries.zip
    unzip allCountries.zip
    geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p

You can then use the resulting database like this:

    my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");

    # and then do as many queries as you wish:
    my $res = $lookup->lookup (49, 8.4);
    if (defined $res) {
        utf8::decode $res; # convert $res from utf-8 to unicode
        print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
    } else {
        print "nothing found at 49, 8.4\n";
    }

THE Geo::LatLon2Place CLASS
$lookup = Geo::LatLon2Place->new ($path)
    Opens a database created by geo-latlon2place-makedb and returns an
    object that allows you to run queries against it.

    The database will be mmapped, so it will not be loaded into memory,
    but your operating system will cache it appropriately.

$res = $lookup->lookup ($lat, $lon[, $radius])
    Looks up the point in the database that is "nearest" to "$lat,
    $lon", searching at least up to $radius kilometres. The default for
    $radius is the cell size the database is built with, and this
    usually works best, so you usually do not need to specify this
    parameter.

    If something is found, the associated data blob (always a binary
    string) is returned, otherwise you receive "undef".

    Unless you specify a custom format, the data blob is actually a
    UTF-8 string, so you might want to call "utf8::decode" on it to get
    a unicode string.

    At the moment, the implementation is in pure perl, but will
    eventually move to C.
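
To illustrate the decoding step described above, here is a minimal sketch that turns a result blob into a place name and a country code. The helper parse_place is hypothetical and not part of Geo::LatLon2Place, the sample blob is made up, and the sketch assumes the default "placename, countrycode" form that the geonames databases produce.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Geo::LatLon2Place): decode a result blob
# and split it into place name and country code.
sub parse_place {
    my ($blob) = @_;

    return unless defined $blob;   # the lookup found nothing

    my $text = $blob;              # work on a copy, leave the caller's blob alone
    utf8::decode ($text);          # UTF-8 octets -> perl character string

    # split at the last ", " so that names containing commas stay intact
    my ($name, $cc) = $text =~ /^(.*), ([^,]*)$/s;

    return ($name, $cc);
}

# made-up sample blob standing in for the return value of ->lookup
my ($name, $cc) = parse_place ("Z\xc3\xbcrich, CH");
printf "%d characters, country %s\n", length $name, $cc;
```

Without the decode step, length would count the octets of the UTF-8 encoding rather than the characters of the name.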

ALGORITHM
The algorithm that this module implements consists of two parts: binning
and weighting (done when writing the database) and then finding the
nearest point.

The first part bins all data points into a grid which has its minimum
cell size at the equator and poles, with somewhat larger cells in
between.

The lookup part will then read the cell that the coordinate is in and
some neighbouring cells (depending on the search radius, by default it
will read the eight cells around it).
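
The idea of reading a cell and its eight neighbours can be sketched as follows. This is a deliberately simplified illustration and not the module's code: it assumes a grid with a fixed cell size in degrees, whereas the real grid varies its cell size with latitude. The names $CELL_DEG, cell_of and cells_to_read are made up for this example.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Simplified illustration only: a grid with a fixed cell size in degrees.
# The module's real grid varies the cell size with latitude.
my $CELL_DEG = 0.5; # hypothetical cell size

# map a coordinate to integer (row, column) cell indices
sub cell_of {
    my ($lat, $lon) = @_;
    return (floor (($lat + 90) / $CELL_DEG), floor (($lon + 180) / $CELL_DEG));
}

# the cell containing the coordinate plus its neighbours
sub cells_to_read {
    my ($lat, $lon) = @_;
    my $rows = 180 / $CELL_DEG;
    my $cols = 360 / $CELL_DEG;

    my ($row, $col) = cell_of ($lat, $lon);
    $row = $rows - 1 if $row >= $rows;      # lat = +90 belongs to the top row

    my @cells;
    for my $dr (-1 .. 1) {
        my $r = $row + $dr;
        next if $r < 0 || $r >= $rows;      # there are no rows beyond the poles
        for my $dc (-1 .. 1) {
            my $c = ($col + $dc) % $cols;   # wrap around at +-180 degrees
            push @cells, [$r, $c];
        }
    }
    return @cells;
}

my @cells = cells_to_read (49, 8.4);
print scalar (@cells), " cells, centre at ", join (",", @{ $cells[4] }), "\n";
```

A larger search radius would widen the -1 .. 1 ranges, which is why bigger radii mean more cells to read.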

It will then calculate the (squared) distance to the search coordinate
using an approximate euclidean distance on an equirectangular
projection. The squared distance is multiplied with a weight (1..25 for
the geonames database, based on population and administrative status,
always 1 for postal codes), and the minimum weighted distance wins.
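
The weighting step can be sketched like this. It is an illustration under assumptions, not the module's implementation: the earth radius, the candidate list and the function names dist2 and nearest are made up, and longitude wrap-around at +-180 degrees is ignored.

```perl
use strict;
use warnings;

my $PI           = 4 * atan2 (1, 1);
my $EARTH_RADIUS = 6371; # mean earth radius in km (assumed value)

# squared distance in km^2 on an equirectangular projection
sub dist2 {
    my ($lat1, $lon1, $lat2, $lon2) = @_;
    my $rad = $PI / 180;
    my $dy  = ($lat2 - $lat1) * $rad;
    my $dx  = ($lon2 - $lon1) * $rad * cos (($lat1 + $lat2) / 2 * $rad);
    return ($dx * $dx + $dy * $dy) * $EARTH_RADIUS ** 2;
}

# pick the candidate with the smallest weighted squared distance
sub nearest {
    my ($lat, $lon, @candidates) = @_;
    my ($best, $best_score);
    for my $c (@candidates) {
        my $score = $c->{weight} * dist2 ($lat, $lon, $c->{lat}, $c->{lon});
        ($best, $best_score) = ($c, $score)
            if !defined $best_score || $score < $best_score;
    }
    return $best;
}

# made-up candidates with hypothetical weights
my @candidates = (
    { name => "place A", lat => 49.00, lon => 8.40, weight => 1  },
    { name => "place B", lat => 49.04, lon => 8.45, weight => 25 },
);

# place B is physically closer, but its weight of 25 makes place A win
print nearest (49.03, 8.44, @candidates)->{name}, "\n";
```

Because the weight multiplies the squared distance, a weight of 25 means a place has to be five times closer than a place with weight 1 to be preferred.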

Binning should not introduce errors, but bigger bins can slow down
lookups due to having to look at more places. The lookup assumes a
spherical shape for the earth, the equirectangular projection stretches
distances unevenly and the euclidean distance calculation introduces
further errors. For typical distances (<< 100km) and the intended usage,
these errors should be considered negligible.
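
To get a feeling for the size of the projection error, the following sketch compares the equirectangular approximation with the great-circle (haversine) distance for two points roughly 50km apart. Both formulas assume a spherical earth, so this only measures the error of the projection and the euclidean shortcut, not the error of the spherical model itself. The earth radius and the sample coordinates are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $PI = 4 * atan2 (1, 1);
my $R  = 6371; # mean earth radius in km (assumed value)

sub deg2rad { $_[0] * $PI / 180 }

# euclidean distance on an equirectangular projection
sub equirect_km {
    my ($lat1, $lon1, $lat2, $lon2) = map deg2rad ($_), @_;
    my $dx = ($lon2 - $lon1) * cos (($lat1 + $lat2) / 2);
    my $dy = $lat2 - $lat1;
    return $R * sqrt ($dx * $dx + $dy * $dy);
}

# great-circle distance on a sphere (haversine formula)
sub haversine_km {
    my ($lat1, $lon1, $lat2, $lon2) = map deg2rad ($_), @_;
    my $h = sin (($lat2 - $lat1) / 2) ** 2
          + cos ($lat1) * cos ($lat2) * sin (($lon2 - $lon1) / 2) ** 2;
    return 2 * $R * atan2 (sqrt ($h), sqrt (1 - $h));
}

my @pair = (49.0, 8.4, 49.3, 8.9); # two points roughly 50km apart
my $e = equirect_km (@pair);
my $g = haversine_km (@pair);
printf "equirectangular %.3f km, haversine %.3f km, relative error %.5f%%\n",
       $e, $g, 100 * abs ($e - $g) / $g;
```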

SPEED
The current implementation is written in pure perl, and on my machine,
typically does 10000-200000 lookups per second. The goal for version 1.0
is to move the lookup to C.

TENTATIVE ROADMAP
The database writer should be accessible via a module, so you can easily
generate your own databases without having to run an external command.

The api might be extended to allow for multiple returns, or nearest
neighbour search.

SEE ALSO
geo-latlon2place-makedb to create databases from common formats.

AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/