NAME
Perl::LibExtractor - determine perl library subsets for building
distributions

SYNOPSIS
    use Perl::LibExtractor;

DESCRIPTION
The purpose of this module is to determine subsets of your perl library,
that is, a set of files needed to satisfy certain dependencies (e.g. of
a program).

The goal is to extract a part of your perl installation including
dependencies. A typical use case for this module would be to find out
which files are needed to build a PAR distribution, to link into an
App::Staticperl binary, or to pack with Urlader, to create stand-alone
distributions tailor-made to run your app.

METHODS
To use this module, first call the "new" constructor and then as many
other methods as you want to generate a set of files. Then query the
set of files and do whatever you want with them.

The command-line utility perl-libextract can be a convenient alternative
to using this module directly, and offers a few extra options, such as
copying the files out into a new directory, stripping them and/or
manipulating them in other ways.

CREATION
$extractor = new Perl::LibExtractor [key => value...]
    Creates a new extractor object. Each extractor object stores some
    configuration options and a subset of files that can be queried at
    any time.

    Binary executables (such as the perl interpreter) are stored inside
    bin/, perl scripts are stored under script/, perl library files are
    stored under lib/ and shared libraries are stored under dll/.

    The following key-value pairs exist, with default values as
    specified.

    inc => \@INC without "."
        An arrayref with paths to perl library directories. The default
        is "\@INC", with "." removed.

        To prepend custom dirs just do this:

            inc => ["mydir", @INC],

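As a minimal sketch of how such a list might be built by hand, the following snippet drops "." from @INC and prepends a custom directory. The directory name "mydir" and the variable names are illustrative assumptions; the resulting array reference is the kind of value the "inc" key is documented to take.

```perl
use strict;
use warnings;

# Hypothetical custom directory; replace with your own library path.
my $custom_dir = "mydir";

# Start from @INC without the current directory (the documented default)
# and put the custom directory in front so it is searched first.
my @inc = ($custom_dir, grep { $_ ne "." } @INC);

# The list would then be passed as: new Perl::LibExtractor inc => \@inc
print "$_\n" for @inc;
```

Filtering "." explicitly keeps the list consistent with the documented default even on perls that still have the current directory in @INC.
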
    use_packlist => 1
        Enable (if true) or disable the use of ".packlist" files. If
        enabled, then each time a file is traced, the complete
        distribution that contains it is included (but not traced).

        If disabled, only shared objects and autoload files will be
        added.

        Debian GNU/Linux doesn't completely package perl or any perl
        modules, so this option will fail there. Other perls should be
        fine.

    extra_deps => { file => [files...] }
        Some dependencies (mainly runtime dependencies in the perl core
        library) cannot be detected automatically by this module,
        especially if you don't use packlists and "add_core".

        This module comes with a set of default dependencies (such as
        Carp requiring Carp::Heavy), which you can override with this
        parameter.

        To see the default set of dependencies that come with this
        module, use this:

            perl -MPerl::LibExtractor -MData::Dumper -e 'print Dumper $Perl::LibExtractor::EXTRA_DEPS'

TRACE/PACKLIST BASED ADDING
The following methods add various things to the set of files.

Each time a perl file is added, it is scanned by tracing its loading,
execution or compilation, and seeing which other perl modules and
libraries have been loaded.

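The idea of tracing can be illustrated with a small stand-alone sketch: load a module in a fresh perl process and look at what ended up in %INC. This is only an approximation for illustration, not the module's actual implementation; the helper name loaded_files is hypothetical.

```perl
use strict;
use warnings;

# Load a module in a separate perl process and return the keys of %INC,
# i.e. the library files that were pulled in as a side effect.
sub loaded_files {
    my ($module) = @_;
    my $code = 'print "$_\n" for sort keys %INC';

    # List-form pipe open avoids the shell; $^X is the running perl.
    open my $fh, "-|", $^X, "-M$module", "-e", $code
        or die "cannot run $^X: $!";
    chomp (my @files = <$fh>);
    close $fh or die "child perl failed for $module";

    return @files;
}

print "$_\n" for loaded_files ("List::Util");
```

Running the load in a child process keeps the parent's own %INC out of the result, which is why the sketch does not simply call "require" in place.
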
For each library file found this way, additional dependencies are added:
if packlists are enabled, then all files of the distribution that
contains the file will be added. If packlists are disabled, then only
shared objects and autoload files for modules will be added.

Only files from perl library directories will be added automatically.
Any other files (such as manpages or scripts installed in the bin
directory) are skipped.

If there is an error, such as a module not being found, then this module
croaks (as opposed to silently skipping). If you want to add something
that you are not sure exists, you can wrap the call in "eval {}". In
some cases, you can avoid this by executing the code you want to work
later using "add_eval" - see "add_core_support" for an actual example of
this technique.

Note that packlists are meant to add files not covered by other
mechanisms, such as resource files and other data files loaded directly
by a module - they are not meant to add dependencies that are missed
because they only happen at runtime.

For example, with packlists, when using AnyEvent, all event loop
backends are automatically added as well, but *not* any event loops
(i.e. AnyEvent::Impl::POE is added, but POE itself is not). Without
packlists, only the backend that is being used is added (i.e. normally
none, as loading AnyEvent does not instantly load any backend).

To catch the extra event loop dependencies, you can either initialise
AnyEvent so it picks a suitable backend:

    $extractor->add_eval ("use AnyEvent; AnyEvent::detect");

Or you can directly load the backend modules you plan to use:

    $extractor->add_mod ("AnyEvent::Impl::EV", "AnyEvent::Impl::Perl");

An example of a program (or module) that has extra resource files is
Deliantra::Client - the normal tracing (without packlist usage) will
correctly add all submodules, but miss the fonts and textures. By using
the packlist, those files are added correctly.

$extractor->add_mod ($module[, $module...])
    Adds the given module(s) to the file set - the module name must be
    specified as in "use", i.e. with "::" as separators and without .pm.

    The module will be loaded with the default import list; any
    dependent files, such as the shared object implementing XS
    functions, or autoload files, will also be added.

    If you want to use a different import list (for those rare modules
    where import lists trigger different backend modules to be loaded,
    for example), you can use "add_eval" instead:

        $extractor->add_eval ("use Module qw(a b c)");

    Example: add Coro.pm and AnyEvent/AIO.pm, and all relevant files
    from the distributions they are part of.

        $extractor->add_mod ("Coro", "AnyEvent::AIO");

$extractor->add_require ($name[, $name...])
    Works like "add_mod", but uses "require $name" to load the module,
    i.e. the name must be a filename.

    Example: load Coro and AnyEvent::AIO, but using "add_require"
    instead of "add_mod".

        $extractor->add_require ("Coro.pm", "AnyEvent/AIO.pm");

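The relationship between the two naming styles can be sketched in plain perl: a module name as given to "add_mod" maps to the filename expected by "add_require" by replacing "::" with "/" and appending ".pm". The helper name mod_to_file is made up for this illustration.

```perl
use strict;
use warnings;

# Convert a module name as written after "use" into the relative
# file name that "require" (and thus add_require) expects.
sub mod_to_file {
    my ($mod) = @_;
    (my $file = "$mod.pm") =~ s{::}{/}g;
    return $file;
}

print mod_to_file ($_), "\n" for "Coro", "AnyEvent::AIO";
```
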
$extractor->add_bin ($name[, $name...])
    Adds the given (perl) program(s) to the file set, that is, a program
    installed by some perl module, written in perl (an example would be
    the perl-libextract program that is part of the "Perl::LibExtractor"
    distribution).

    Example: add the deliantra client program installed by the
    Deliantra::Client module and put it under bin/deliantra.

        $extractor->add_bin ("deliantra");

$extractor->add_eval ($string)
    Evaluates the string as perl code and adds all modules that are
    loaded by it. For example, this would add AnyEvent and the default
    backend implementation module and event loop module:

        $extractor->add_eval ("use AnyEvent; AnyEvent::detect");

    Each code snippet will be executed in its own package and under "use
    strict".

OTHER METHODS FOR ADDING FILES
The following methods add files that are either not covered by other
methods or are commonly-used dependencies.

$extractor->add_perl
    Adds the perl binary itself to the file set, including the libperl
    dll, if needed.

    For example, on UNIX systems, this usually adds an exe/perl and
    possibly some dll/libperl.so.XXX.

$extractor->add_core_support
    Tries to add modules and files needed to support commonly-used
    builtin language features. For example, to open a scalar for I/O you
    need the PerlIO::scalar module:

        open $fh, "<", \$scalar

    A number of regex and string features (e.g. "ucfirst") need some
    unicore files, e.g.:

        'my $x = chr 1234; "\u$x\U$x\l$x\L$x"; $x =~ /\d|\w|\s|\b|$x/i';

    This call adds these files (simply by executing code similar to the
    above code fragments).

    Notable things that are missing are other PerlIO layers, such as
    PerlIO::encoding, and named character and character class matches.

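To see why such support files matter, the following stand-alone sketch opens a scalar for reading and then checks %INC for PerlIO::scalar. Whether the module shows up as a separately loaded library file can differ between perl builds, so the snippet only reports what it finds; the variable names are illustrative.

```perl
use strict;
use warnings;

my $scalar = "line one\nline two\n";

# Opening a scalar reference for reading uses the in-memory I/O layer.
open my $fh, "<", \$scalar or die "cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

print "read ", scalar @lines, " lines\n";

# Report whether perl pulled PerlIO::scalar in from the library;
# if so, a stand-alone distribution has to ship it as well.
print "PerlIO::scalar loaded from library: ",
      (exists $INC{"PerlIO/scalar.pm"} ? "yes" : "no"), "\n";
```
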
$extractor->add_unicore
    Adds (hopefully) all files from the unicore database that will ever
    be needed.

    If you are not sure which unicode character classes and similar
    unicore databases you need, and you do not care about an extra one
    thousand(!) files comprising 4MB of data, then you can just call
    this method, which adds basically all files from perl's unicode
    database.

    Note that "add_core_support" also adds some unicore files, but it's
    not a subset of "add_unicore" - the former adds all files necessary
    to support core builtins (which includes some unicore files and
    other things), while the latter adds all unicore files (but nothing
    else).

    When in doubt, use both.

$extractor->add_core
    This adds all files from the perl core distribution, that is, all
    library files that come with perl.

    This is a superset of "add_core_support" and "add_unicore".

    This is quite a lot, but on the plus side, you can be sure nothing
    is missing.

    This requires a full perl installation - Debian GNU/Linux doesn't
    package the full perl library, so this function will not work there.

GLOB-BASED ADDING AND FILTERING
These methods add or manipulate files by using glob-based patterns.

These glob patterns work similarly to glob patterns in the shell:

/   A / at the start of the pattern interprets the pattern as a file
    path inside the file set, almost the same as in the shell. For
    example, /bin/perl* would match all files whose names start with
    perl inside the bin directory in the set.

    If the / is missing, then the pattern is interpreted as a module
    name (a .pm file). For example, Coro matches the file lib/Coro.pm,
    while Coro::* would match lib/Coro/*.pm.

*   A single star matches anything inside a single directory component.
    For example, /lib/Coro/*.pm would match all .pm files inside the
    lib/Coro/ directory, but not any files deeper in the hierarchy.

    Another way to look at it is that a single star matches anything but
    a slash (/).

**  A double star matches any number of characters in the path,
    including /.

    For example, AnyEvent::** would match all modules whose names start
    with "AnyEvent::", no matter how deep in the hierarchy they are.

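The pattern rules above can be sketched as a translation from glob to regular expression. This is an illustration of the documented rules only, written under the assumption that module patterns map to lib/....pm paths as described; the module's real matching code may differ, and the helper name glob_to_regex is hypothetical.

```perl
use strict;
use warnings;

# Translate a glob pattern into a regex that matches paths in the set.
sub glob_to_regex {
    my ($glob) = @_;

    # A leading slash means "path inside the file set"; anything else
    # is a module name, so Foo::Bar becomes lib/Foo/Bar.pm.
    unless ($glob =~ s{^/}{}) {
        $glob =~ s{::}{/}g;
        $glob = "lib/$glob.pm";
    }

    # "**" may cross directory boundaries, "*" stays inside one
    # component, everything else is matched literally.
    my $re = "";
    for my $part (split /(\*\*|\*)/, $glob) {
        $re .= $part eq "**" ? ".*"
             : $part eq "*"  ? "[^/]*"
             :                 quotemeta $part;
    }

    return qr/^$re\z/;
}

for my $glob ("Coro", "Coro::*", "AnyEvent::**", "/bin/perl*") {
    print "$glob => ", glob_to_regex ($glob), "\n";
}
```

With this translation, Coro::* matches lib/Coro/State.pm but not a file one directory deeper, while AnyEvent::** matches lib/AnyEvent/Impl/EV.pm.
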
$extractor->add_glob ($modglob[, $modglob...])
    Adds all files from the perl library that match the given glob
    pattern.

    For example, you could implement "add_unicore" yourself like this:

        $extractor->add_glob ("/unicore/**.pl");

$extractor->filter ($pattern[, $pattern...])
    Applies a series of include/exclude filters. Each filter must start
    with either "+" or "-", to designate the pattern as an *include* or
    *exclude* pattern. The rest of the pattern is a normal glob pattern.

    An exclude pattern ("-") instantly removes all matching files from
    the set. An include pattern ("+") protects matching files from later
    removals.

    That is, if you have an include pattern then all files that were
    matched by it will be included in the set, regardless of any further
    exclude patterns matching the same files.

    Likewise, any file excluded by a pattern will not be included in the
    set, even if matched by later include patterns.

    Any files not matched by any expression will simply stay in the set.

    For example, to remove most of the useless autoload functions
    provided by the POSIX module (they either do the same thing as a
    builtin or always raise an error), you would use this:

        $extractor->filter ("-/lib/auto/POSIX/*.al");

    This does not remove all autoload files, only the ones not defined
    by a subclass (e.g. it leaves "POSIX::SigRt::xxx" alone).

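The ordering rules for include and exclude patterns can be sketched with a small stand-alone function. To keep it short, the sketch takes precompiled regexes instead of glob patterns, and the sample paths are made up; it models the documented semantics and is not the module's own code.

```perl
use strict;
use warnings;

# Apply ["+", regex] / ["-", regex] filters in order to a hash whose
# keys are set paths. Includes protect files, excludes remove them.
sub apply_filters {
    my ($set, @filters) = @_;
    my %protected;

    for my $filter (@filters) {
        my ($type, $re) = @$filter;
        for my $path (keys %$set) {
            next unless $path =~ $re;
            if ($type eq "+") {
                $protected{$path} = 1;    # survives later excludes
            } elsif (!$protected{$path}) {
                delete $set->{$path};     # removed immediately
            }
        }
    }

    return $set;
}

# Made-up sample set for illustration.
my %set = map { $_ => 1 } qw(
    lib/POSIX.pm
    lib/auto/POSIX/autosplit.ix
    lib/auto/POSIX/atexit.al
    lib/auto/POSIX/abs.al
);

apply_filters (\%set,
    ["+", qr{^lib/auto/POSIX/atexit\.al\z}],
    ["-", qr{^lib/auto/POSIX/[^/]*\.al\z}],
);

print "$_\n" for sort keys %set;
```

Because the include pattern comes first, atexit.al stays in the set while the other .al file is removed; files matched by neither pattern are left untouched.
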
$extractor->runtime_only
    This removes all files that are not needed at runtime, such as
    static archives, header and other files needed only for compilation
    of modules, and pod and html files (which are unlikely to be needed
    at runtime).

    This is quite useful when you want to have only files actually
    needed to execute a program.

RESULT SET
$set = $extractor->set
    Returns a hash reference that represents the result set. The hash is
    the actual internal storage hash and can only be modified as
    described below.

    Each key in the hash is the path inside the set, without a leading
    slash, e.g.:

        bin/perl
        lib/unicore/lib/Blk/Superscr.pl
        lib/AnyEvent/Impl/EV.pm

    The value is an array reference with mostly unspecified contents,
    except the first element, which is the file system path where the
    actual file can be found.

    This code snippet lists all files inside the set:

        print "$_\n"
           for sort keys %{ $extractor->set };

    This code fragment prints "filesystem_path => set_path" pairs for
    all files in the set:

        my $set = $extractor->set;
        while (my ($path, $info) = each %$set) {
           # the first array element is the file system path
           print "$info->[0] => $path\n";
        }

    You can implement your own filtering by asking for the result set
    with "$extractor->set", and then deleting keys from the referenced
    hash - since you can ask for the result set at any time, you can add
    things, filter them out this way, and add additional things.

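Custom filtering by deleting keys might look like the following sketch. It uses a hand-built stand-in for the hash returned by "$extractor->set" so that it runs without the module; the file system paths are invented, and the .pod entry is added purely for the example.

```perl
use strict;
use warnings;

# Stand-in for $extractor->set: set path => [file system path, ...].
# All file system paths here are made up for illustration.
my $set = {
    "bin/perl"                 => ["/usr/bin/perl"],
    "lib/AnyEvent/Impl/EV.pm"  => ["/usr/lib/perl5/AnyEvent/Impl/EV.pm"],
    "lib/AnyEvent/Impl/EV.pod" => ["/usr/lib/perl5/AnyEvent/Impl/EV.pod"],
};

# Custom filter: drop every documentation file by deleting its key.
delete @$set{ grep { /\.pod\z/ } keys %$set };

# What remains, as "filesystem_path => set_path" pairs.
for my $path (sort keys %$set) {
    print "$set->{$path}[0] => $path\n";
}
```
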
EXAMPLE
To package the deliantra client (Deliantra::Client), finding all (perl)
files needed to run it is a first step. This can be done by using
something like the following code snippet:

    my $ex = new Perl::LibExtractor;

    $ex->add_perl;
    $ex->add_core_support;
    $ex->add_bin ("deliantra");
    $ex->add_mod ("AnyEvent::Impl::EV");
    $ex->add_mod ("AnyEvent::Impl::Perl");
    $ex->add_mod ("Urlader");
    $ex->filter ("-/*/auto/POSIX/**.al");
    $ex->runtime_only;

The wrapper that runs the packaged client first sets the perl library
directory to pm and . (the latter to work around some AutoLoader bugs),
so perl uses only the perl library files that came with the binary
package.

Then it sets some environment variable to override the system default
(which might be incompatible).

Then it runs the client itself, using "require". Since "require" only
looks in the perl library directory, this is the reason why the scripts
were put there (of course, since . is also included it doesn't matter,
but I refuse to yield to bugs).

Finally it exits with a clean status to signal "ok" to Urlader.

Back to the original "Perl::LibExtractor" script: after initialising a
new set, the script simply adds the perl interpreter and core support
files (just in case: not all are needed, but some are, and I am too lazy
to find out which ones exactly).

Then it adds the deliantra executable itself, which in turn adds most of
the required modules. After that, the AnyEvent implementation modules
are added because these dependencies are not picked up automatically.

The Urlader module is added because the client itself does not depend on
it at all, but the wrapper does.

At this point, all required files are present, and it's time to slim
down: most of the useless POSIX autoloaded functions are removed, not
because they are so big, but because creating files is a costly
operation in itself, so even small files have considerable overhead when
unpacking. Then files not required for running the client are removed.

And that concludes it, the set is now ready.

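One possible next step is to copy the files of the set into a staging directory. The sketch below is an assumption about how this could be done by hand, not something the module provides: it uses a stand-in set built from two files that exist on any perl, and a temporary directory as the hypothetical destination.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Stand-in for $ex->set, built from files that exist on this system.
my $set = {
    "lib/strict.pm"   => [$INC{"strict.pm"}],
    "lib/warnings.pm" => [$INC{"warnings.pm"}],
};

# Hypothetical staging directory; a temporary one keeps this harmless.
my $dest = tempdir (CLEANUP => 1);

for my $path (sort keys %$set) {
    my $src = $set->{$path}[0];    # first element: file system path
    my $dst = "$dest/$path";

    make_path (dirname $dst);      # create lib/, bin/ etc. as needed
    copy ($src, $dst) or die "copy $src => $dst failed: $!";
}

print "copied ", scalar (keys %$set), " files to $dest\n";
```

Note that File::Copy's copy does not preserve permission bits, so executables copied this way would need their mode set afterwards; the perl-libextract utility mentioned above may be the more convenient route for copying.
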
SEE ALSO
The utility program that comes with this module: perl-libextract.

App::Staticperl, Urlader, Perl::Squish.

LICENSE
This software package is licensed under the GPL version 3 or any later
version, see COPYING for details.

This license does not, of course, apply to any output generated by this
software.

AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/