/cvs/Geo-LatLon2Place/README

Comparing Geo-LatLon2Place/README (file contents):
Revision 1.1 by root, Mon Mar 14 02:41:52 2022 UTC vs.
Revision 1.2 by root, Mon Mar 14 03:14:41 2022 UTC

1NAME 1NAME
2 Convert::Scalar - convert between different representations of perl 2 Geo::LatLon2Place - convert latitude and longitude to nearest place
3 scalars
4 3
5SYNOPSIS 4SYNOPSIS
6 use Convert::Scalar; 5 use Geo::LatLon2Place;
6
7 my $db = Geo::LatLon2Place->new ("/var/lib/mydb.cdb");
7 8
8DESCRIPTION 9DESCRIPTION
9 This module exports various internal perl methods that change the 10 This is a single-purpose module that tries to do one job: find the
10 internal representation or state of a perl scalar. All of these work 11 nearest placename for a point on earth. It doesn't claim to do a perfect
11 in-place, that is, they modify their scalar argument. No functions are 12 job, but it tries to be simple to set up, simple to use and be fast. It
12 exported by default. 13 doesn't attempt to provide many features or nifty algorithms, and is
14 meant to be used in situations where you simply need a name for a
15 coordinate without becoming a GIS expert first.
13 16
14 The following export tags exist: 17 BUILDING, SETTING UP AND USAGE
18 To build this module, you need tinycdb, a cdb implementation by Michael
19 Tokarev, or a compatible library. On GNU/Debian-based systems you can
20 get this by executing apt-get install libcdb-dev.
15 21
16 :utf8 all functions with utf8 in their name 22 After installing the module, you need to generate a database using the
17 :taint all functions with taint in their name 23 geo-latlon2place-makedb command.
18 :refcnt all functions with refcnt in their name
19 :ok all *ok-functions.
20 24
21 utf8 scalar[, mode] 25 Currently, it accepts various databases from geonames
22 Returns true when the given scalar is marked as utf8, false 26 (<https://www.geonames.org/export/>, note the license), for example,
23 otherwise. If the optional mode argument is given, also forces the 27 cities500.zip, which lists all places with population 500 or more:
24 interpretation of the string to utf8 (mode true) or plain bytes
25 (mode false). The actual (byte-) content is not changed. The return
26 value always reflects the state before any modification is done.
27 28
28 This function is useful when you "import" utf8-data into perl, or 29 wget https://download.geonames.org/export/dump/cities500.zip
29 when some external function (e.g. storing/retrieving from a 30 unzip cities500.zip
30 database) removes the utf8-flag. 31 geo-latlon2place-makedb cities500.txt cities500.ll2p
31 32
32 utf8_on scalar 33 This will create a file cities500.ll2p that you can use for lookups with this
33 Similar to "utf8 scalar, 1", but additionally returns the scalar 34 module. At the time of this writing, the cities500 database results in
34 (the argument is still modified in-place). 35 about a 10MB file while the allCountries database results in about
36 120MB.
35 37
36 utf8_off scalar 38 Lookups will return a string of the form "placename, countrycode".
37 Similar to "utf8 scalar, 0", but additionally returns the scalar
38 (the argument is still modified in-place).
39 39
40 utf8_valid scalar [Perl 5.7] 40 If you want to use the geonames postal code database (from
41 Returns true if the bytes inside the scalar form a valid utf8 41 <https://www.geonames.org/zip/>), use these commands:
42 string, false otherwise (the check is independent of the actual
43 encoding perl thinks the string is in).
44 42
45 utf8_upgrade scalar 43 wget https://download.geonames.org/export/zip/allCountries.zip
46 Convert the string content of the scalar in-place to its 44 unzip allCountries.zip
47 UTF8-encoded form (and also returns it). 45 geo-latlon2place-makedb --extract geonames-postalcodes allCountries.txt allCountries.ll2p
48 46
49 utf8_downgrade scalar[, fail_ok=0] 47 You can then use the resulting database like this:
50 Attempt to convert the string content of the scalar from
51 UTF8-encoded to ISO-8859-1. This may not be possible if the string
52 contains characters that cannot be represented in a single byte; if
53 this is the case, it leaves the scalar unchanged and either returns
54 false or, if "fail_ok" is not true (the default), croaks.
55 48
56 utf8_encode scalar 49 my $lookup = Geo::LatLon2Place->new ("allCountries.ll2p");
57 Convert the string value of the scalar to UTF8-encoded, but then
58 turn off the "SvUTF8" flag so that it looks like bytes to perl
59 again. (Might be removed in future versions).
60 50
61 utf8_length scalar 51 # and then do as many queries as you wish:
62 Returns the number of characters in the string, counting wide UTF8 52 my $res = $lookup->lookup (49, 8.4);
63 characters as a single character, independent of whether the scalar 53 if (defined $res) {
64 is marked as containing bytes or multibyte characters. 54 utf8::decode $res; # convert $res from utf-8 to unicode
55 print "49, 8.4 found $res\n"; # should be Karlsruhe, DE for geonames
56 } else {
57 print "nothing found at 49, 8.4\n";
58 }
65 59
66 $old = readonly scalar[, $new] 60THE Geo::LatLon2Place CLASS
67 Returns whether the scalar is currently readonly, and sets or clears 61 $lookup = Geo::LatLon2Place->new ($path)
68 the readonly status if a new status is given. 62 Opens a database created by geo-latlon2place-makedb and returns an
63 object that allows you to run queries against it.
69 64
70 readonly_on scalar 65 The database will be mmaped, so it will not be loaded into memory,
71 Sets the readonly flag on the scalar. 66 but your operating system will cache it appropriately.
72 67
73 readonly_off scalar 68 $res = $lookup->lookup ($lat, $lon[, $radius])
74 Clears the readonly flag on the scalar. 69 Looks up the point in the database that is "nearest" to "$lat,
70 $lon", searching at least up to $radius kilometres. The default for
71 $radius is the cell size the database is built with, and this
72 usually works best, so you usually do not specify this parameter.
75 73
76 unmagic scalar, type 74 If something is found, the associated data blob (always a binary
77 Remove the specified magic from the scalar (DANGEROUS!). 75 string) is returned, otherwise you receive "undef".
78 76
79 weaken scalar 77 Unless you specify a custom format, the data blob is actually a
80 Weaken a reference. (See also WeakRef). 78 UTF-8 string, so you might want to call "utf8::decode" on it to get
79 a unicode string.
81 80
82 taint scalar 81 At the moment, the implementation is in pure perl, but will
83 Taint the scalar. 82 eventually move to C.
84 83
85 tainted scalar 84ALGORITHM
86 returns true when the scalar is tainted, false otherwise. 85 The algorithm that this module implements consists of two parts: binning
86 and weighting (done when writing the database) and then finding the
87 nearest point.
87 88
88 untaint scalar 89 The first part bins all data points into a grid which has its minimum
89 Remove the tainted flag from the specified scalar. 90 cell size at the equator and poles, with somewhat larger cells in
91 between.
90 92
91 length = len scalar 93 The lookup part will then read the cell that the coordinate is in and
92 Returns SvLEN (scalar), that is, the actual number of bytes 94 some neighbouring cells (depending on the search radius, by default it
93 allocated to the string value, or "undef", if the scalar has no 95 will read the eight cells around it).
94 string value.
95 96
96 scalar = grow scalar, newlen 97 It will then calculate the (squared) distance to the search coordinate
97 Sets the memory area used for the scalar to the given length, if the 98 using an approximate euclidean distance on an equirectangular
98 current length is less than the new value. This does not affect the 99 projection. The squared distance is multiplied with a weight (1..25 for
99 contents of the scalar, but is only useful to "pre-allocate" memory 100 the geonames database, based on population and administrative status,
100 space if you know the scalar will grow. The return value is the 101 always 1 for postal codes), and the minimum weighted distance wins.
101 modified scalar (the scalar is modified in-place).
102 102
103 scalar = extend scalar, addlen=64 103 Binning should not introduce errors, but bigger bins can slow down
104 Reserves enough space in the scalar so that addlen bytes can be 104 lookup times due to having to look at more places. The lookup assumes a
105 appended without reallocating it. The actual contents of the scalar 105 spherical shape for the earth, the equirectangular projection stretches
106 will not be affected. The modified scalar will also be returned. 106 distances unevenly and the euclidean distance calculation introduces
107 further errors. For typical distances (<< 100km) and the intended usage,
108 these errors should be considered negligible.
107 109
108 This function is meant to make append workloads efficient - if you 110SPEED
109 append a short string to a scalar many times (millions of times), 111 The current implementation is written in pure perl, and on my machine,
110 then perl will have to reallocate and copy the scalar basically 112 typically does 10000-200000 lookups per second. The goal for version 1.0
111 every time. 113 is to move the lookup to C.
112 114
113 If you instead use "extend $scalar, length $shortstring", then 115TENTATIVE ROADMAP
114 Convert::Scalar will use a "size to next power of two, roughly" 116 The database writer should be accessible via a module, so you can easily
115 algorithm, so as the scalar grows, perl will have to resize and copy 117 generate your own databases without having to run an external command.
116 it less and less often.
117 118
118 nread = extend_read fh, scalar, addlen=64 119 The api might be extended to allow for multiple returns, or nearest
119 Calls "extend scalar, addlen" to ensure some space is available, 120 neighbour search.
120 then do the equivalent of "sysread" to the end, to try to fill the
121 extra space. Returns how many bytes have been read, 0 on EOF or
122 "undef" on error, just like "sysread".
123 121
124 This function is useful to implement many protocols where you read 122SEE ALSO
125 some data, see if it is enough to decode, and if not, read some 123 geo-latlon2place-makedb to create databases from common formats.
126 more, where the naive or easy way of doing this would result in bad
127 performance.
128
129 nread = read_all fh, scalar, length
130 Tries to read "length" bytes into "scalar". Unlike "read" or
131 "sysread", it will try to read more bytes if not all bytes could be
132 read in one go (this is often called "xread" in C).
133
134 Returns the total number of bytes read (normally "length", unless
135 an error or EOF occurred), 0 on EOF and "undef" on errors.
136
137 nwritten = write_all fh, scalar
138 Like "read_all", but for writes - the equivalent of the "xwrite"
139 function often seen in C.
140
141 refcnt scalar[, newrefcnt]
142 Returns the current reference count of the given scalar and
143 optionally sets it to the given reference count.
144
145 refcnt_inc scalar
146 Increments the reference count of the given scalar inplace.
147
148 refcnt_dec scalar
149 Decrements the reference count of the given scalar inplace. Use
150 "weaken" instead if you understand what this function is for.
151 Better yet: don't use this module in this case.
152
153 refcnt_rv scalar[, newrefcnt]
154 Works like "refcnt", but dereferences the given reference first.
155 This is useful to find the reference count of arrays or hashes,
156 which cannot be passed directly. Remember that taking a reference of
157 some object increases its reference count, so the reference count
158 used by the *_rv-functions tends to be one higher.
159
160 refcnt_inc_rv scalar
161 Works like "refcnt_inc", but dereferences the given reference first.
162
163 refcnt_dec_rv scalar
164 Works like "refcnt_dec", but dereferences the given reference first.
165
166 ok scalar
167 uok scalar
168 rok scalar
169 pok scalar
170 nok scalar
171 niok scalar
172 Calls SvOK, SvUOK, SvROK, SvPOK, SvNOK or SvNIOK on the given
173 scalar, respectively.
174
175 CANDIDATES FOR FUTURE RELEASES
176 The following API functions (perlapi) are considered for future
177 inclusion in this module. If you want them, write me.
178
179 sv_upgrade
180 sv_pvn_force
181 sv_pvutf8n_force
182 the sv2xx family
183 124
184AUTHOR 125AUTHOR
185 Marc Lehmann <schmorp@schmorp.de> 126 Marc Lehmann <schmorp@schmorp.de>
186 http://home.schmorp.de/ 127 http://home.schmorp.de/
187 128
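
The ALGORITHM section in revision 1.2 describes scoring candidate places by a weighted squared distance on an equirectangular projection. The following is a minimal Perl sketch of that scoring step only. It is not the module's actual code: the sub names dist2_km and nearest, the candidate hash layout, the earth radius constant, the place "Exampledorf" and the weights are all assumptions made for illustration, and binning into grid cells is left out.

```perl
use strict;
use warnings;

my $PI              = 4 * atan2 (1, 1);
my $EARTH_RADIUS_KM = 6371; # assumption: mean radius of a spherical earth

# Approximate squared distance (km^2) between two points given in degrees,
# using plain euclidean distance on an equirectangular projection.
sub dist2_km {
   my ($lat1, $lon1, $lat2, $lon2) = @_;

   # take the short way around the date line
   my $dlon = $lon2 - $lon1;
   $dlon -= 360 if $dlon >  180;
   $dlon += 360 if $dlon < -180;

   # scale the longitude difference by the cosine of the mean latitude
   my $x = $dlon * $PI / 180 * cos (($lat1 + $lat2) / 2 * $PI / 180);
   my $y = ($lat2 - $lat1) * $PI / 180;

   return ($x * $x + $y * $y) * $EARTH_RADIUS_KM ** 2;
}

# Return the candidate with the smallest weight * squared distance,
# or undef if there are no candidates.
sub nearest {
   my ($lat, $lon, @candidates) = @_;

   my ($best, $best_score);
   for my $c (@candidates) {
      my $score = $c->{weight} * dist2_km ($lat, $lon, $c->{lat}, $c->{lon});
      ($best, $best_score) = ($c, $score)
         if !defined $best_score || $score < $best_score;
   }

   return $best;
}

# hypothetical candidates; the weights are made up (lower = more important)
my @places = (
   { name => "Karlsruhe, DE",   lat => 49.0069, lon => 8.4037, weight => 1  },
   { name => "Exampledorf, DE", lat => 49.0000, lon => 8.4500, weight => 25 },
);

my $hit = nearest (49.0, 8.44, @places);
print "nearest: $hit->{name}\n";
```

In this sketch the query point is geometrically closer to the made-up village, but the village's higher weight makes the city win. This shows how weighting favours larger places over slightly nearer small ones.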
