--- Geo-LatLon2Place/README 2022/03/14 02:41:52 1.1
+++ Geo-LatLon2Place/README 2022/03/14 03:14:41 1.2
@@ -1,185 +1,126 @@
NAME
- Convert::Scalar - convert between different representations of perl
- scalars
+ Geo::LatLon2Place - convert latitude and longitude to nearest place
SYNOPSIS
- use Convert::Scalar;
+ use Geo::LatLon2Place;
+
+ my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");
DESCRIPTION
- This module exports various internal perl methods that change the
- internal representation or state of a perl scalar. All of these work
- in-place, that is, they modify their scalar argument. No functions are
- exported by default.
-
- The following export tags exist:
-
- :utf8 all functions with utf8 in their name
- :taint all functions with taint in their name
- :refcnt all functions with refcnt in their name
- :ok all *ok-functions.
-
- utf8 scalar[, mode]
- Returns true when the given scalar is marked as utf8, false
- otherwise. If the optional mode argument is given, also forces the
- interpretation of the string to utf8 (mode true) or plain bytes
- (mode false). The actual (byte-) content is not changed. The return
- value always reflects the state before any modification is done.
-
- This function is useful when you "import" utf8-data into perl, or
- when some external function (e.g. storing/retrieving from a
- database) removes the utf8-flag.
-
- utf8_on scalar
- Similar to "utf8 scalar, 1", but additionally returns the scalar
- (the argument is still modified in-place).
-
- utf8_off scalar
- Similar to "utf8 scalar, 0", but additionally returns the scalar
- (the argument is still modified in-place).
-
- utf8_valid scalar [Perl 5.7]
- Returns true if the bytes inside the scalar form a valid utf8
- string, false otherwise (the check is independent of the actual
- encoding perl thinks the string is in).
-
- utf8_upgrade scalar
- Convert the string content of the scalar in-place to its
- UTF8-encoded form (and also returns it).
-
- utf8_downgrade scalar[, fail_ok=0]
- Attempt to convert the string content of the scalar from
- UTF8-encoded to ISO-8859-1. This may not be possible if the string
- contains characters that cannot be represented in a single byte; if
- this is the case, it leaves the scalar unchanged and either returns
- false or, if "fail_ok" is not true (the default), croaks.
-
- utf8_encode scalar
- Convert the string value of the scalar to UTF8-encoded, but then
- turn off the "SvUTF8" flag so that it looks like bytes to perl
- again. (Might be removed in future versions).
-
- utf8_length scalar
- Returns the number of characters in the string, counting wide UTF8
- characters as a single character, independent of wether the scalar
- is marked as containing bytes or mulitbyte characters.
-
- $old = readonly scalar[, $new]
- Returns whether the scalar is currently readonly, and sets or clears
- the readonly status if a new status is given.
-
- readonly_on scalar
- Sets the readonly flag on the scalar.
-
- readonly_off scalar
- Clears the readonly flag on the scalar.
-
- unmagic scalar, type
- Remove the specified magic from the scalar (DANGEROUS!).
-
- weaken scalar
- Weaken a reference. (See also WeakRef).
-
- taint scalar
- Taint the scalar.
-
- tainted scalar
- returns true when the scalar is tainted, false otherwise.
-
- untaint scalar
- Remove the tainted flag from the specified scalar.
-
- length = len scalar
- Returns SvLEN (scalar), that is, the actual number of bytes
- allocated to the string value, or "undef", is the scalar has no
- string value.
-
- scalar = grow scalar, newlen
- Sets the memory area used for the scalar to the given length, if the
- current length is less than the new value. This does not affect the
- contents of the scalar, but is only useful to "pre-allocate" memory
- space if you know the scalar will grow. The return value is the
- modified scalar (the scalar is modified in-place).
-
- scalar = extend scalar, addlen=64
- Reserves enough space in the scalar so that addlen bytes can be
- appended without reallocating it. The actual contents of the scalar
- will not be affected. The modified scalar will also be returned.
-
- This function is meant to make append workloads efficient - if you
- append a short string to a scalar many times (millions of times),
- then perl will have to reallocate and copy the scalar basically
- every time.
-
- If you instead use "extend $scalar, length $shortstring", then
- Convert::Scalar will use a "size to next power of two, roughly"
- algorithm, so as the scalar grows, perl will have to resize and copy
- it less and less often.
-
- nread = extend_read fh, scalar, addlen=64
- Calls "extend scalar, addlen" to ensure some space is available,
- then do the equivalent of "sysread" to the end, to try to fill the
- extra space. Returns how many bytes have been read, 0 on EOF or
- undef> on eror, just like "sysread".
-
- This function is useful to implement many protocols where you read
- some data, see if it is enough to decode, and if not, read some
- more, where the naive or easy way of doing this would result in bad
- performance.
-
- nread = read_all fh, scalar, length
- Tries to read "length" bytes into "scalar". Unlike "read" or
- "sysread", it will try to read more bytes if not all bytes could be
- read in one go (this is often called "xread" in C).
-
- Returns the total nunmber of bytes read (normally "length", unless
- an error or EOF occured), 0 on EOF and "undef" on errors.
-
- nwritten = write_all fh, scalar
- Like "readall", but for writes - the equivalent of the "xwrite"
- function often seen in C.
-
- refcnt scalar[, newrefcnt]
- Returns the current reference count of the given scalar and
- optionally sets it to the given reference count.
-
- refcnt_inc scalar
- Increments the reference count of the given scalar inplace.
-
- refcnt_dec scalar
- Decrements the reference count of the given scalar inplace. Use
- "weaken" instead if you understand what this function is fore.
- Better yet: don't use this module in this case.
-
- refcnt_rv scalar[, newrefcnt]
- Works like "refcnt", but dereferences the given reference first.
- This is useful to find the reference count of arrays or hashes,
- which cannot be passed directly. Remember that taking a reference of
- some object increases it's reference count, so the reference count
- used by the *_rv-functions tend to be one higher.
-
- refcnt_inc_rv scalar
- Works like "refcnt_inc", but dereferences the given reference first.
-
- refcnt_dec_rv scalar
- Works like "refcnt_dec", but dereferences the given reference first.
-
- ok scalar
- uok scalar
- rok scalar
- pok scalar
- nok scalar
- niok scalar
- Calls SvOK, SvUOK, SvROK, SvPOK, SvNOK or SvNIOK on the given
- scalar, respectively.
-
- CANDIDATES FOR FUTURE RELEASES
- The following API functions (perlapi) are considered for future
- inclusion in this module If you want them, write me.
-
- sv_upgrade
- sv_pvn_force
- sv_pvutf8n_force
- the sv2xx family
+ This is a single-purpose module that tries to do one job: find the
+ nearest placename for a point on earth. It doesn't claim to do a perfect
+ job, but it tries to be simple to set up, simple to use and fast. It
+ doesn't attempt to provide many features or nifty algorithms, and is
+ meant to be used in situations where you simply need a name for a
+ coordinate without becoming a GIS expert first.
+
+ BUILDING, SETTING UP AND USAGE
+ To build this module, you need tinycdb, a cdb implementation by Michael
+ Tokarev, or a compatible library. On GNU/Debian-based systems you can
+ get this by executing apt-get install libcdb-dev.
+
+ After installing the module, you need to generate a database using the
+ geo-latlon2place-makedb command.
+
+ Currently, it accepts various databases from geonames (note the
+ license), for example cities500.zip, which lists all places with a
+ population of 500 or more:
+
+ wget https://download.geonames.org/export/dump/cities500.zip
+ unzip cities500.zip
+ geo-latlon2place-makedb cities500.txt cities500.ll2p
+
+ This will create a file cities500.ll2p that you can use for lookups with this
+ module. At the time of this writing, the cities500 database results in
+ about a 10MB file while the allCountries database results in about
+ 120MB.
+
+ Lookups will return a string of the form "placename, countrycode".
+
+ If you want to use the geonames postal code database, use these
+ commands:
+
+ wget https://download.geonames.org/export/zip/allCountries.zip
+ unzip allCountries.zip
+ geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p
+
+ You can then use the resulting database like this:
+
+ my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");
+
+ # and then do as many queries as you wish:
+ my $res = $lookup->lookup (49, 8.4);
+ if (defined $res) {
+ utf8::decode $res; # convert $res from utf-8 to unicode
+ print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
+ } else {
+ print "nothing found at 49, 8.4\n";
+ }
+
+THE Geo::LatLon2Place CLASS
+ $lookup = Geo::LatLon2Place->new ($path)
+ Opens a database created by geo-latlon2place-makedb and returns an
+ object that allows you to run queries against it.
+
+ The database will be mmapped, so it will not be loaded into memory,
+ but your operating system will cache it appropriately.
+
+ $res = $lookup->lookup ($lat, $lon[, $radius])
+ Looks up the point in the database that is "nearest" to "$lat,
+ $lon", searching at least up to $radius kilometres. The default for
+ $radius is the cell size the database is built with, and this
+ usually works best, so you usually do not specify this parameter.
+
+ If something is found, the associated data blob (always a binary
+ string) is returned, otherwise you receive "undef".
+
+ Unless you specify a custom format, the data blob is actually a
+ UTF-8 string, so you might want to call "utf8::decode" on it to get
+ a unicode string.
+
+ At the moment, the implementation is in pure perl, but will
+ eventually move to C.
+
+ALGORITHM
+ The algorithm that this module implements consists of two parts: binning
+ and weighting (done when writing the database) and then finding the
+ nearest point.
+
+ The first part bins all data points into a grid which has its minimum
+ cell size at the equator and poles, with somewhat larger cells in
+ between.
+
+ The lookup part will then read the cell that the coordinate is in and
+ some neighbouring cells (depending on the search radius, by default it
+ will read the eight cells around it).
+
+ It will then calculate the (squared) distance to the search coordinate
+ using an approximate euclidean distance on an equirectangular
+ projection. The squared distance is multiplied by a weight (1..25 for
+ the geonames database, based on population and administrative status,
+ always 1 for postal codes), and the minimum distance wins.
+
+ Binning should not introduce errors, but bigger bins can slow down
+ lookup times due to having to look at more places. The lookup assumes a
+ spherical shape for the earth, the equirectangular projection stretches
+ distances unevenly and the euclidean distance calculation introduces
+ further errors. For typical distances (<< 100km) and the intended usage,
+ these errors should be considered negligible.
+
+SPEED
+ The current implementation is written in pure perl, and on my machine,
+ typically does 10000-200000 lookups per second. The goal for version 1.0
+ is to move the lookup to C.
+
+TENTATIVE ROADMAP
+ The database writer should be accessible via a module, so you can easily
+ generate your own databases without having to run an external command.
+
+ The API might be extended to allow for multiple returns, or nearest
+ neighbour search.
+
+SEE ALSO
+ geo-latlon2place-makedb to create databases from common formats.
AUTHOR
Marc Lehmann