/cvs/Geo-LatLon2Place/bin/geo-latlon2place-makedb

Comparing Geo-LatLon2Place/bin/geo-latlon2place-makedb (file contents):
Revision 1.1 by root, Mon Mar 14 02:41:52 2022 UTC vs.
Revision 1.2 by root, Mon Mar 14 03:26:20 2022 UTC

37The extraction method: the default is C<geonames>, which expects a 37The extraction method: the default is C<geonames>, which expects a
38geonames database (L<https://download.geonames.org/export/dump/>, for 38geonames database (L<https://download.geonames.org/export/dump/>, for
39example F<DE.txt>, F<cities500.txt> or F<allCountries.txt>) and extracts 39example F<DE.txt>, F<cities500.txt> or F<allCountries.txt>) and extracts
40I<placename, countrycode> strings from it. 40I<placename, countrycode> strings from it.
41 41
42The method C<geonames-postalcodes> (not yet implemented) 42The method C<geonames-postalcodes> does the same, but for a geonames
43does the same, but for a geonames postal code database
44L<https://download.geonames.org/export/zip>. 43postal code database L<https://download.geonames.org/export/zip>, and
44extracts C<zip name, countrycode> strings.
45 45
46Lastly, you can specify a perl fragment that implements your own filtering 46Lastly, you can specify a perl fragment that implements your own filtering
47and extraction. 47and extraction.
48 48
49=back 49=back
59in C<$_>. The file is opened using the C<:perlio> layer, so if your input 59in C<$_>. The file is opened using the C<:perlio> layer, so if your input
60file is in UTF-8, so will be C<$_>. 60file is in UTF-8, so will be C<$_>.
61 61
62For example, the following would expect an input file with space separated 62For example, the following would expect an input file with space separated
63latitude, longitude, weight and name, where name can contain spaces, which 63latitude, longitude, weight and name, where name can contain spaces, which
64is useful when you wat to provide your own input data: 64is useful when you want to provide your own input data:
65 65
66 geo-latlon2place-makedb --extract 'chomp; split / /, 4' input output 66 geo-latlon2place-makedb --extract 'chomp; split / /, 4' input output
67 67
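As a minimal sketch of what the one-line extractor above does to a single record: the sample input line below is made up for illustration, and only the C<chomp>/C<split> behaviour is shown, not the database writing done by the tool itself.

```perl
use strict;
use warnings;

# Hypothetical input line: latitude, longitude, weight, then a name with spaces.
$_ = "48.1372 11.5755 1 Munich City Centre\n";

chomp;
# The limit of 4 stops splitting after three separators,
# so the fourth field keeps its embedded spaces.
my @fields = split / /, 4;

print join ("|", @fields), "\n";   # prints: 48.1372|11.5755|1|Munich City Centre
```

Without the limit, the name would be broken into three separate fields.
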
68A slightly more verbose example expecting only latitude, longitude and a 68A slightly more verbose example expecting only latitude, longitude and a
69name would be: 69name would be:
90weight, these should be self-explanatory. The weight is used during search 90weight, these should be self-explanatory. The weight is used during search
91and will be multiplied by the square of the distance, and is used to make 91and will be multiplied by the square of the distance, and is used to make
92larger cities win over small ones when the coordinate is somewhere between 92larger cities win over small ones when the coordinate is somewhere between
93them. 93them.
94 94
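To illustrate how such a weight can let a larger city win, here is a small sketch. It is not the module's search code: the place names, distances and weights are invented, distances are taken as plain kilometres, and the rule "the smallest weight times squared distance wins" is an assumption based on the description above.

```perl
use strict;
use warnings;

# Hypothetical candidates: [name, distance to the query point in km, weight].
my @candidates = (
   [ "Smallville", 4, 10 ],   # closer, but a high weight (small place)
   [ "Big City",   6,  2 ],   # farther away, but a low weight (large place)
);

# score = weight * distance^2; assumed here: the smallest score wins
my @scored = sort { $a->[1] <=> $b->[1] }
             map  { [ $_->[0], $_->[2] * $_->[1] ** 2 ] }
             @candidates;

print "$scored[0][0] wins with score $scored[0][1]\n";   # Big City wins with score 72
```

Smallville scores 10 * 16 = 160 and Big City 2 * 36 = 72, so the larger city is chosen although it is farther away.
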
95The standard extractors (C<geonames> and C<geonames-postalcodes>) provide
96a UTF-8-encoded string as blob, but any binary data will do. For example,
97if you want to associate your coordinate pairs with some short-ish
98integer codes, you could do this:
99
100 geo-latlon2place-makedb --extract '
101 chomp;
102 my ($lat, $lon, $id) = split / /, 4;
103 ($lat, $lon, 1, pack "w", $id)
104 ' input output
105
106And later use C<unpack "w"> on the data returned by C<lookup>.
107
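A short round-trip sketch of the C<pack "w"> encoding used above, which stores an unsigned integer in BER-compressed form (seven bits per byte). The id value is made up, and the C<lookup> call itself is not shown here.

```perl
use strict;
use warnings;

my $id   = 123456;             # hypothetical integer code for a coordinate pair
my $blob = pack "w", $id;      # BER-compressed integer, 7 bits per byte
my $back = unpack "w", $blob;  # what you would do with the data from a lookup

printf "%d bytes, id %d\n", length ($blob), $back;   # prints: 3 bytes, id 123456
```

Small ids therefore take only one to three bytes in the database instead of a decimal string.
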
95The C<geonames> filter looks similar to this fragment, which shows off 108The C<geonames> filter looks similar to the following fragment, which
96more possibilities: 109shows off some more filtering possibilities:
97 110
98 my ($id, $name, undef, undef, $lat, $lon, $t1, $t2, $cc, undef, $a1, $s2, $a3, $a4, $pop, undef) = split /\t/; 111 my ($id, $name, undef, undef, $lat, $lon, $t1, $t2, $cc, undef, $a1, $s2, $a3, $a4, $pop, undef) = split /\t/;
99 112
100 return if $t1 ne "P"; # only places 113 return if $t1 ne "P"; # only places
101 114
114 # actually place names, so ignore very long names 127 # actually place names, so ignore very long names
115 60 > length $name 128 60 > length $name
116 or return; 129 or return;
117 130
118 # we estimate a weight by dividing 25 by the radius of the place, 131 # we estimate a weight by dividing 25 by the radius of the place,
119 # which we get by assuming a fixed population density of 5000 people/km², 132 # which we get by assuming a fixed population density of 5000 # people
120 # which is almost always a considerable over-estimate. 133 # per square km, # which is almost always a considerable over-estimate.
121 # 25 and 5000 are pretty much made-up, feel free to improve and 134 # 25 and 5000 are pretty much made-up, feel free to improve and
122 # send me the results. 135 # send me the results.
123 my $w = 25 / (1 + sqrt $pop / 5000); 136 my $w = 25 / (1 + sqrt $pop / 5000);
124 137
125 # administrative centers get a fixed low weight 138 # administrative centers get a fixed low weight