/cvs/App-Staticperl/staticperl.pod
Revision: 1.38
Committed: Fri Mar 18 19:49:04 2011 UTC (13 years, 2 months ago) by root
Branch: MAIN
CVS Tags: rel-1_21
Changes since 1.37: +13 -4 lines
Log Message:
1.21

1 =head1 NAME
2
3 staticperl - perl, libc, 100 modules, all in one 500kb file
4
5 =head1 SYNOPSIS
6
7 staticperl help # print the embedded documentation
8 staticperl fetch # fetch and unpack perl sources
9 staticperl configure # fetch and then configure perl
10 staticperl build # configure and then build perl
11 staticperl install # build and then install perl
12 staticperl clean # clean most intermediate files (restart at configure)
13 staticperl distclean # delete everything installed by this script
14 staticperl cpan # invoke CPAN shell
15 staticperl instsrc path... # install unpacked modules
16 staticperl instcpan modulename... # install modules from CPAN
17 staticperl mkbundle <bundle-args...> # see documentation
18 staticperl mkperl <bundle-args...> # see documentation
19 staticperl mkapp appname <bundle-args...> # see documentation
20
21 Typical Examples:
22
23 staticperl install # fetch, configure, build and install perl
24 staticperl cpan # run interactive cpan shell
25 staticperl mkperl -MConfig_heavy.pl # build a perl that supports -V
26 staticperl mkperl -MAnyEvent::Impl::Perl -MAnyEvent::HTTPD -MURI -MURI::http
27 # build a perl with the above modules linked in
28 staticperl mkapp myapp --boot mainprog mymodules
29 # build a binary "myapp" from mainprog and mymodules
30
31 =head1 DESCRIPTION
32
33 This script helps you to create single-file perl interpreters
34 or applications, or to embed a perl interpreter in your
35 applications. Single-file means that it is fully self-contained - no
36 separate shared objects, no autoload fragments, no .pm or .pl files are
37 needed. And when linking statically, you can create (or embed) a single
38 file that contains perl interpreter, libc, all the modules you need, all
39 the libraries you need and of course your actual program.
40
41 With F<uClibc> and F<upx> on x86, you can create a single 500kb binary
42 that contains perl and 100 modules such as POSIX, AnyEvent, EV, IO::AIO,
43 Coro and so on. Or any other choice of modules.
44
45 To see how this turns out, you can try out smallperl and bigperl, two
46 pre-built static and compressed perl binaries with many and even more
47 modules: just follow the links at L<http://staticperl.schmorp.de/>.
48
49 The created files do not need write access to the file system (like PAR
50 does). In fact, since this script is in many ways similar to PAR::Packer,
51 here are the differences:
52
53 =over 4
54
55 =item * The generated executables are much smaller than PAR created ones.
56
57 Shared objects and the perl binary contain a lot of extra info, while
58 the static nature of F<staticperl> allows the linker to remove all
59 functionality and meta-info not required by the final executable. Even
60 extensions statically compiled into perl at build time will only be
61 present in the final executable when needed.
62
63 In addition, F<staticperl> can strip perl sources much more effectively
64 than PAR.
65
66 =item * The generated executables start much faster.
67
68 There is no need to unpack files, or even to parse Zip archives (which is a
69 slow and memory-consuming business).
70
71 =item * The generated executables don't need a writable filesystem.
72
73 F<staticperl> loads all required files directly from memory. There is no
74 need to unpack files into a temporary directory.
75
76 =item * More control over included files, more burden.
77
78 PAR tries to be maintenance and hassle-free - it tries to include more
79 files than necessary to make sure everything works out of the box. It
80 mostly succeeds at this, but the extra files (such as the unicode database)
81 can take substantial amounts of memory and file size.
82
83 With F<staticperl>, the burden is mostly with the developer - only direct
84 compile-time dependencies and L<AutoLoader> are handled automatically.
85 This means the modules to include often need to be tweaked manually.
86
87 All this does not preclude more permissive modes from being implemented in
88 the future, but right now, you have to resolve hidden dependencies
89 manually.
90
91 =item * PAR works out of the box, F<staticperl> does not.
92
93 Maintaining your own custom perl build can be a pain in the ass, and while
94 F<staticperl> tries to make this easy, it still requires a custom perl
95 build and possibly fiddling with some modules. PAR is likely to produce
96 results faster.
97
98 Ok, PAR never has worked for me out of the box, and for some people,
99 F<staticperl> does work out of the box, as they don't count "fiddling with
100 module use lists" against it, but nevertheless, F<staticperl> is certainly
101 a bit more difficult to use.
102
103 =back
104
105 =head1 HOW DOES IT WORK?
106
107 Simple: F<staticperl> downloads, compiles and installs a perl version of
108 your choice in F<~/.staticperl>. You can add extra modules either by
109 letting F<staticperl> install them for you automatically, or by using CPAN
110 and doing it interactively. This usually takes 5-10 minutes, depending on
111 the speed of your computer and your internet connection.
112
113 It is possible to do program development at this stage, too.
114
115 Afterwards, you create a list of files and modules you want to include,
116 and then either build a new perl binary (that acts just like a normal perl
117 except everything is compiled in), or you create bundle files (basically C
118 sources you can use to embed all files into your project).
119
120 This step is very fast (a few seconds if PPI is not used for stripping, or
121 the stripped files are in the cache), and can be tweaked and repeated as
122 often as necessary.
123
124 =head1 THE F<STATICPERL> SCRIPT
125
126 This module installs a script called F<staticperl> into your perl
127 binary directory. The script is fully self-contained, and can be
128 used without perl (for example, in an uClibc chroot environment). In
129 fact, it can be extracted from the C<App::Staticperl> distribution
130 tarball as F<bin/staticperl>, without any installation. The
131 newest (possibly alpha) version can also be downloaded from
132 L<http://staticperl.schmorp.de/staticperl>.
133
134 F<staticperl> interprets the first argument as a command to execute,
135 optionally followed by any parameters.
136
137 There are two command categories: the "phase 1" commands which deal with
138 installing perl and perl modules, and the "phase 2" commands, which deal
139 with creating binaries and bundle files.
140
141 =head2 PHASE 1 COMMANDS: INSTALLING PERL
142
143 The most important command is F<install>, which does basically
144 everything. The default is to download and install perl 5.12.3 and a few
145 modules required by F<staticperl> itself, but all this can (and should) be
146 changed - see L<CONFIGURATION>, below.
147
148 The command
149
150 staticperl install
151
152 is normally all you need: It installs the perl interpreter in
153 F<~/.staticperl/perl>. It downloads, configures, builds and installs the
154 perl interpreter if required.
155
156 Most of the following F<staticperl> subcommands simply run one or more
157 steps of this sequence.
158
159 If it fails, this is most commonly because the compiler options I selected
160 are not supported by your compiler - either edit the F<staticperl> script
161 yourself or create a F<~/.staticperl> shell script in which you set working
162 C<PERL_CCFLAGS> etc. variables.
163
164 To force recompilation or reinstallation, you need to run F<staticperl
165 distclean> first.
166
167 =over 4
168
169 =item F<staticperl version>
170
171 Prints some info about the version of the F<staticperl> script you are using.
172
173 =item F<staticperl fetch>
174
175 Runs only the download and unpack phase, unless this has already happened.
176
177 =item F<staticperl configure>
178
179 Configures the unpacked perl sources, potentially after downloading them first.
180
181 =item F<staticperl build>
182
183 Builds the configured perl sources, potentially after automatically
184 configuring them.
185
186 =item F<staticperl install>
187
188 Wipes the perl installation directory (usually F<~/.staticperl/perl>) and
189 installs the perl distribution, potentially after building it first.
190
191 =item F<staticperl cpan> [args...]
192
193 Starts an interactive CPAN shell that you can use to install further
194 modules. Installs the perl first if necessary, but apart from that,
195 no magic is involved: you could just as well run it manually via
196 F<~/.staticperl/perl/bin/cpan>.
197
198 Any additional arguments are simply passed to the F<cpan> command.
199
200 =item F<staticperl instcpan> module...
201
202 Tries to install all the modules given and their dependencies, using CPAN.
203
204 Example:
205
206 staticperl instcpan EV AnyEvent::HTTPD Coro
207
208 =item F<staticperl instsrc> directory...
209
210 In the unlikely case that you have unpacked perl modules around and want
211 to install from these instead of from CPAN, you can do this using this
212 command by specifying all the directories with modules in them that you
213 want to have built.
214
215 =item F<staticperl clean>
216
217 Deletes the perl source directory (and potentially cleans up other
218 intermediate files). This can be used to clean up files only needed for
219 building perl, without removing the installed perl interpreter.
220
221 At the moment, it doesn't delete downloaded tarballs.
222
223 The exact semantics of this command will probably change.
224
225 =item F<staticperl distclean>
226
227 This wipes your complete F<~/.staticperl> directory. Be careful with this,
228 it nukes your perl download, perl sources, perl distribution and any
229 installed modules. It is useful if you wish to start over "from scratch"
230 or when you want to uninstall F<staticperl>.
231
232 =back
233
234 =head2 PHASE 2 COMMANDS: BUILDING PERL BUNDLES
235
236 Building (linking) a new F<perl> binary is handled by a separate
237 script. To make it easy to use F<staticperl> from a F<chroot>, the script
238 is embedded into F<staticperl>, which will write it out and call it for you
239 with any arguments you pass:
240
241 staticperl mkbundle mkbundle-args...
242
243 In the oh so unlikely case of something not working here, you
244 can run the script manually as well (by default it is written to
245 F<~/.staticperl/mkbundle>).
246
247 F<mkbundle> is a more conventional command and expects the argument
248 syntax commonly used on UNIX clones. For example, this command builds
249 a new F<perl> binary and includes F<Config.pm> (for F<perl -V>),
250 F<AnyEvent::HTTPD>, F<URI> and a custom F<httpd> script (from F<eg/httpd>
251 in this distribution):
252
253 # first make sure we have perl and the required modules
254 staticperl instcpan AnyEvent::HTTPD
255
256 # now build the perl
257 staticperl mkperl -MConfig_heavy.pl -MAnyEvent::Impl::Perl \
258 -MAnyEvent::HTTPD -MURI::http \
259 --add 'eg/httpd httpd.pm'
260
261 # finally, invoke it
262 ./perl -Mhttpd
263
264 As you can see, things are not quite as trivial: the L<Config> module has
265 a hidden dependency which is not even a perl module (F<Config_heavy.pl>),
266 L<AnyEvent> needs at least one event loop backend that we have to
267 specify manually (here L<AnyEvent::Impl::Perl>), and the F<URI> module
268 (required by L<AnyEvent::HTTPD>) implements various URI schemes as extra
269 modules - since L<AnyEvent::HTTPD> only needs C<http> URIs, we only need
270 to include that module. I found out about these dependencies by carefully
271 watching any error messages about missing modules...
272
273 Instead of building a new perl binary, you can also build a standalone
274 application:
275
276 # build the app
277 staticperl mkapp app --boot eg/httpd \
278 -MAnyEvent::Impl::Perl -MAnyEvent::HTTPD -MURI::http
279
280 # run it
281 ./app
282
283 Here are the three phase 2 commands:
284
285 =over 4
286
287 =item F<staticperl mkbundle> args...
288
289 The "default" bundle command - it interprets the given bundle options and
290 writes out F<bundle.h>, F<bundle.c>, F<bundle.ccopts> and F<bundle.ldopts>
291 files, useful for embedding.
292
293 =item F<staticperl mkperl> args...
294
295 Creates a bundle just like F<staticperl mkbundle> (in fact, it's the same
296 as invoking F<staticperl mkbundle --perl> args...), but then compiles and
297 links a new perl interpreter that embeds the created bundle, then deletes
298 all intermediate files.
299
300 =item F<staticperl mkapp> filename args...
301
302 Does the same as F<staticperl mkbundle> (in fact, it's the same as
303 invoking F<staticperl mkbundle --app> filename args...), but then compiles
304 and links a new standalone application that simply initialises the perl
305 interpreter.
306
307 The difference to F<staticperl mkperl> is that the standalone application
308 does not act like a perl interpreter would - in fact, by default it would
309 just do nothing and exit immediately, so you should specify some code to
310 be executed via the F<--boot> option.
311
312 =back
313
314 =head3 OPTION PROCESSING
315
316 All options can be given as arguments on the command line (typically
317 using long (e.g. C<--verbose>) or short option (e.g. C<-v>) style). Since
318 specifying a lot of options can make the command line very long and
319 unwieldy, you can put all long options into a "bundle specification file"
320 (one option per line, with or without C<--> prefix) and specify this
321 bundle file instead.
322
323 For example, the command given earlier to link a new F<perl> could also
324 look like this:
325
326 staticperl mkperl httpd.bundle
327
328 With all options stored in the F<httpd.bundle> file (one option per line,
329 everything after the option is an argument):
330
331 use "Config_heavy.pl"
332 use AnyEvent::Impl::Perl
333 use AnyEvent::HTTPD
334 use URI::http
335 add eg/httpd httpd.pm
336
337 All options that specify modules or files to be added are processed in the
338 order given on the command line.
339
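The mapping between a bundle specification file and the command line can be sketched in shell. This is only an illustration of the rules stated above (the C<--> prefix is optional, and everything after the option name is a single argument). C<spec_to_args> is a hypothetical helper that is not part of F<staticperl>, and it assumes that arguments contain no single quotes.

```shell
# Hypothetical helper (not part of staticperl): read bundle specification
# lines on stdin and print the equivalent command-line form of each one.
spec_to_args() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue      # skip empty lines
        line=${line#--}                 # the "--" prefix is optional
        opt=${line%% *}                 # option name: text up to the first space
        case $line in
            # everything after the option name is one single argument
            *' '*) printf '%s\n' "--$opt '${line#* }'" ;;
            *)     printf '%s\n' "--$opt" ;;
        esac
    done
}

printf '%s\n' 'use AnyEvent::HTTPD' 'add eg/httpd httpd.pm' '--verbose' | spec_to_args
```

Note how C<add eg/httpd httpd.pm> turns into one option with one quoted argument, which is why the command-line form needs the extra quoting while the file does not.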
340 =head3 BUNDLE CREATION WORKFLOW / STATICPERL MKBUNDLE OPTIONS
341
342 F<staticperl mkbundle> works by first assembling a list of candidate
343 files and modules to include, then filtering them by include/exclude
344 patterns. The remaining modules (together with their direct dependencies,
345 such as link libraries and L<AutoLoader> files) are then converted into
346 bundle files suitable for embedding. F<staticperl mkbundle> can then
347 optionally build a new perl interpreter or a standalone application.
348
349 =over 4
350
351 =item Step 0: Generic argument processing.
352
353 The following options influence F<staticperl mkbundle> itself.
354
355 =over 4
356
357 =item C<--verbose> | C<-v>
358
359 Increases the verbosity level by one (the default is C<1>).
360
361 =item C<--quiet> | C<-q>
362
363 Decreases the verbosity level by one.
364
365 =item any other argument
366
367 Any other argument is interpreted as a bundle specification file, which
368 supports all options (without extra quoting), one option per line, in the
369 format C<option> or C<option argument>. They will effectively be expanded
370 and processed as if they were directly written on the command line, in
371 place of the file name.
372
373 =back
374
375 =item Step 1: gather candidate files and modules
376
377 In this step, modules, perl libraries (F<.pl> files) and other files are
378 selected for inclusion in the bundle. The relevant options are executed
379 in order (this makes a difference mostly for C<--eval>, which can rely on
380 earlier C<--use> options to have been executed).
381
382 =over 4
383
384 =item C<--use> F<module> | C<-M>F<module>
385
386 Include the named module or perl library and trace direct
387 dependencies. This is done by loading the module in a subprocess and
388 tracing which other modules and files it actually loads.
389
390 Example: include AnyEvent and AnyEvent::Impl::Perl.
391
392 staticperl mkbundle --use AnyEvent --use AnyEvent::Impl::Perl
393
394 Sometimes you want to load old-style "perl libraries" (F<.pl> files), or
395 maybe other weirdly named files. To support this, the C<--use> option
396 actually tries to do what you mean, depending on the string you specify:
397
398 =over 4
399
400 =item a possibly valid module name, e.g. F<common::sense>, F<Carp>,
401 F<Coro::Mysql>.
402
403 If the string contains no quotes, no F</> and no F<.>, then C<--use>
404 assumes that it is a normal module name. It will create a new package and
405 evaluate a C<use module> in it, i.e. it will load the package and do a
406 default import.
407
408 The import step is done because many modules trigger more dependencies
409 when something is imported than without.
410
411 =item anything that contains F</> or F<.> characters,
412 e.g. F<utf8_heavy.pl>, F<Module/private/data.pl>.
413
414 The string will be quoted and passed to require, as if you used C<require
415 $module>. Nothing will be imported.
416
417 =item "path" or 'path', e.g. C<"utf8_heavy.pl">.
418
419 If you enclose the name into single or double quotes, then the quotes will
420 be removed and the resulting string will be passed to require. This syntax
421 is for compatibility with older versions of staticperl and should not be
422 used anymore.
423
424 =back
425
426 Example: C<use> AnyEvent::Socket, once using C<use> (importing the
427 symbols), and once via C<require>, not importing any symbols. The first
428 form is preferred as many modules load some extra dependencies when asked
429 to export symbols.
430
431 staticperl mkbundle -MAnyEvent::Socket # use + import
432 staticperl mkbundle -MAnyEvent/Socket.pm # require only
433
434 Example: include the required files for F<perl -V> to work in all its
435 glory (F<Config.pm> is included automatically by the dependency tracker).
436
437 # shell command
438 staticperl mkbundle -MConfig_heavy.pl
439
440 # bundle specification file
441 use Config_heavy.pl
442
443 The C<-M>module syntax is included as a convenience that might be easier
444 to remember than C<--use> - it's the same switch as perl itself uses
445 to load modules. Or maybe it confuses people. Time will tell. Or maybe
446 not. Sigh.
447
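The way C<--use> tells its three argument forms apart can be sketched as a small shell function. This is a simplified illustration of the rules described above, not the code F<mkbundle> actually runs: C<classify_use> is a hypothetical name, and the sketch only recognises quotes that enclose the whole string.

```shell
# Hypothetical helper: report how a --use argument would be treated,
# following the three cases described in the text.
classify_use() {
    case $1 in
        \"*\"|\'*\') echo quoted  ;;   # "path" or 'path': quotes removed, then require
        */*|*.*)     echo require ;;   # contains / or . : passed to require, no import
        *)           echo use     ;;   # plain module name: use + default import
    esac
}

classify_use AnyEvent::Socket       # use
classify_use AnyEvent/Socket.pm     # require
classify_use '"utf8_heavy.pl"'      # quoted
```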
448 =item C<--eval> "perl code" | C<-e> "perl code"
449
450 Sometimes it is easier (or necessary) to specify dependencies using perl
451 code, or maybe one of the modules you use needs a special use statement. In
452 that case, you can use C<--eval> to execute some perl snippet or set some
453 variables or whatever you need. All files C<require>'d or C<use>'d while
454 executing the snippet are included in the final bundle.
455
456 Keep in mind that F<mkbundle> will not import any symbols from the modules
457 named by the C<--use> option, so do not expect the symbols from modules
458 you C<--use>'d earlier on the command line to be available.
459
460 Example: force L<AnyEvent> to detect a backend and therefore include it
461 in the final bundle.
462
463 staticperl mkbundle --eval 'use AnyEvent; AnyEvent::detect'
464
465 # or like this
466 staticperl mkbundle -MAnyEvent --eval 'AnyEvent::detect'
467
468 Example: use a separate "bootstrap" script that C<use>'s lots of modules
469 and also include this in the final bundle, to be executed automatically
470 when the interpreter is initialised.
471
472 staticperl mkbundle --eval 'do "bootstrap"' --boot bootstrap
473
474 =item C<--boot> F<filename>
475
476 Include the given file in the bundle and arrange for it to be
477 executed (using C<require>) before the main program when the new perl
478 is initialised. This can be used to modify C<@INC> or do similar
479 modifications before the perl interpreter executes scripts given on the
480 command line (or via C<-e>). This works even in an embedded interpreter -
481 the file will be executed during interpreter initialisation in that case.
482
483 =item C<--incglob> pattern
484
485 This goes through all standard library directories and tries to match any
486 F<.pm> and F<.pl> files against the extended glob pattern (see below). If
487 a file matches, it is added. The pattern is matched against the full path
488 of the file (sans the library directory prefix), e.g. F<Sys/Syslog.pm>.
489
490 This is very useful to include "everything":
491
492 --incglob '*'
493
494 It is also useful for including perl libraries, or trees of those, such as
495 the unicode database files needed by some perl built-ins, the regex engine
496 and other modules.
497
498 --incglob '/unicore/**.pl'
499
500 =item C<--add> F<file> | C<--add> "F<file> alias"
501
502 Adds the given (perl) file into the bundle (and optionally call it
503 "alias"). The F<file> is either an absolute path or a path relative to the
504 current directory. If an alias is specified, then this is the name it will
505 use for C<@INC> searches, otherwise the path F<file> will be used as the
506 internal name.
507
508 This switch is used to include extra files into the bundle.
509
510 Example: embed the file F<httpd> in the current directory as F<httpd.pm>
511 when creating the bundle.
512
513 staticperl mkperl --add "httpd httpd.pm"
514
515 # can be accessed via "use httpd"
516
517 Example: add a file F<initcode> from the current directory.
518
519 staticperl mkperl --add 'initcode &initcode'
520
521 # can be accessed via "do '&initcode'"
522
523 Example: add local files as extra modules in the bundle.
524
525 # specification file
526 add file1 myfiles/file1.pm
527 add file2 myfiles/file2.pm
528 add file3 myfiles/file3.pl
529
530 # then later, in perl, use
531 use myfiles::file1;
532 require myfiles::file2;
533 my $res = do "myfiles/file3.pl";
534
535 =item C<--binadd> F<file> | C<--binadd> "F<file> alias"
536
537 Just like C<--add>, except that it treats the file as binary and adds it
538 without any postprocessing (perl files might get stripped to reduce their
539 size).
540
541 If you specify an alias you should probably add a C<&> prefix to avoid
542 clashing with embedded perl files (whose paths never start with C<&>),
543 and/or use a special directory prefix, such as C<&res/name>.
544
545 You can later get a copy of these files by calling C<staticperl::find
546 "alias">.
547
548 An alternative way to embed binary files is to convert them to perl and
549 use C<do> to get the contents - this method is a bit cumbersome, but works
550 both inside and outside of a staticperl bundle:
551
552 # a "binary" file, call it "bindata.pl"
553 <<'SOME_MARKER'
554 binary data NOT containing SOME_MARKER
555 SOME_MARKER
556
557 # load the binary
558 chomp (my $data = do "bindata.pl");
559
560 =back
561
562 =item Step 2: filter all files using C<--include> and C<--exclude> options.
563
564 After all candidate files and modules are added, they are I<filtered>
565 by a combination of C<--include> and C<--exclude> patterns (there is an
566 implicit C<--include *> at the end, so if no filters are specified, all
567 files are included).
568
569 All that this step does is potentially reduce the number of files that are
570 to be included - no new files are added during this step.
571
572 =over 4
573
574 =item C<--include> pattern | C<-i> pattern | C<--exclude> pattern | C<-x> pattern
575
576 These specify an include or exclude pattern to be applied to the candidate
577 file list. An include makes sure that the given files will be part of the
578 resulting file set, an exclude will exclude remaining files. The patterns
579 are "extended glob patterns" (see below).
580
581 The patterns are applied "in order" - files included via earlier
582 C<--include> specifications cannot be removed by any following
583 C<--exclude>, and likewise, any file excluded by an earlier C<--exclude>
584 cannot be added by any following C<--include>.
585
586 For example, to include everything except C<Devel> modules, but still
587 include F<Devel::PPPort>, you could use this:
588
589 --incglob '*' -i '/Devel/PPPort.pm' -x '/Devel/**'
590
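The "in order" rule amounts to "the first matching pattern decides". The following shell sketch illustrates this under loud assumptions: C<filter_file> is a hypothetical helper, and it uses plain shell C<case> patterns as a stand-in for extended glob patterns (so C<*> also matches slashes here, unlike in F<staticperl>).

```shell
# Hypothetical helper: decide whether PATH survives an ordered rule list.
# usage: filter_file PATH (i|x) PATTERN [(i|x) PATTERN]...
filter_file() {
    file=$1
    shift
    while [ $# -ge 2 ]; do
        case $file in
            $2)     # first matching rule wins; later rules are never consulted
                if [ "$1" = i ]; then return 0; else return 1; fi ;;
        esac
        shift 2
    done
    return 0        # implicit "--include *" at the end
}

for f in /Devel/PPPort.pm /Devel/Peek.pm /Carp.pm; do
    if filter_file "$f" i '/Devel/PPPort.pm' x '/Devel/*'; then
        echo "keep $f"
    else
        echo "drop $f"
    fi
done
```

With these rules, F</Devel/PPPort.pm> is kept by the include, F</Devel/Peek.pm> is dropped by the exclude, and F</Carp.pm> falls through to the implicit include.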
591 =back
592
593 =item Step 3: add any extra or "hidden" dependencies.
594
595 F<staticperl> currently knows about four extra types of dependencies
596 that are added automatically. Only one (F<.packlist> files) is currently
597 optional and can be influenced, the others are always included:
598
599 =over 4
600
601 =item C<--usepacklists>
602
603 Read F<.packlist> files for each distribution that happens to match a
604 module name you specified. Sounds weird, and it is, so expect semantics to
605 change somehow in the future.
606
607 The idea is that most CPAN distributions have a F<.pm> file that matches
608 the name of the distribution (which is rather reasonable after all).
609
610 If this switch is enabled, then if any of the F<.pm> files that have been
611 selected match an installed distribution, then all F<.pm>, F<.pl>, F<.al>
612 and F<.ix> files installed by this distribution are also included.
613
614 For example, using this switch, when the L<URI> module is specified, then
615 all L<URI> submodules that have been installed via the CPAN distribution
616 are included as well, so you don't have to manually specify them.
617
618 =item L<AutoLoader> splitfiles
619
620 Some modules use L<AutoLoader> - less commonly (hopefully) used functions
621 are split into separate F<.al> files, and an index (F<.ix>) file contains
622 the prototypes.
623
624 Both F<.ix> and F<.al> files will be detected automatically and added to
625 the bundle.
626
627 =item link libraries (F<.a> files)
628
629 Modules using XS (or any other non-perl language extension compiled at
630 installation time) will have a static archive (typically F<.a>). These
631 will automatically be added to the linker options in F<bundle.ldopts>.
632
633 Should F<staticperl> find a dynamic link library (typically F<.so>) it
634 will warn about it - obviously this shouldn't happen unless you use
635 F<staticperl> on the wrong perl, or one (probably wrongly) configured to
636 use dynamic loading.
637
638 =item extra libraries (F<extralibs.ld>)
639
640 Some modules need linking against external libraries - these are found in
641 F<extralibs.ld> and added to F<bundle.ldopts>.
642
643 =back
644
645 =item Step 4: write bundle files and optionally link a program
646
647 At this point, the selected files will be read, processed (stripped) and
648 finally the bundle files get written to disk, and F<staticperl mkbundle>
649 is normally finished. Optionally, it can go a step further and either link
650 a new F<perl> binary with all selected modules and files inside, or build
651 a standalone application.
652
653 Both the contents of the bundle files and any extra linking is controlled
654 by these options:
655
656 =over 4
657
658 =item C<--strip> C<none>|C<pod>|C<ppi>
659
660 Specify the stripping method applied to reduce the file size of the perl
661 sources included.
662
663 The default is C<pod>, which uses the L<Pod::Strip> module to remove all
664 pod documentation, which is very fast and reduces file size a lot.
665
666 The C<ppi> method uses L<PPI> to parse and condense the perl sources. This
667 saves a lot more than just L<Pod::Strip>, and is generally safer,
668 but is also a lot slower (some files take almost a minute to strip -
669 F<staticperl> maintains a cache of stripped files to speed up subsequent
670 runs for this reason). Note that this method doesn't optimise for raw file
671 size, but for best compression (that means that the uncompressed file size
672 is a bit larger, but the files compress better, e.g. with F<upx>).
673
674 Last but not least, if you need accurate line numbers in error messages,
675 or in the unlikely case where C<pod> is too slow, or some module gets
676 mistreated, you can specify C<none> to not mangle included perl sources in
677 any way.
678
679 =item C<--perl>
680
681 After writing out the bundle files, try to link a new perl interpreter. It
682 will be called F<perl> and will be left in the current working
683 directory. The bundle files will be removed.
684
685 This switch is automatically used when F<staticperl> is invoked with the
686 C<mkperl> command instead of C<mkbundle>.
687
688 Example: build a new F<./perl> binary with only L<common::sense> inside -
689 it will be even smaller than the standard perl interpreter as none of the
690 modules of the base distribution (such as L<Fcntl>) will be included.
691
692 staticperl mkperl -Mcommon::sense
693
694 =item C<--app> F<name>
695
696 After writing out the bundle files, try to link a new standalone
697 program. It will be called C<name>, and the bundle files get removed after
698 linking it.
699
700 This switch is automatically used when F<staticperl> is invoked with the
701 C<mkapp> command instead of C<mkbundle>.
702
703 The difference to the (mutually exclusive) C<--perl> option is that the
704 binary created by this option will not try to act as a perl interpreter -
705 instead it will simply initialise the perl interpreter, clean it up and
706 exit.
707
708 This means that, by default, it will do nothing but burn a few CPU cycles
709 - for it to do something useful you I<must> add some boot code, e.g. with
710 the C<--boot> option.
711
712 Example: create a standalone perl binary called F<./myexe> that will
713 execute F<appfile> when it is started.
714
715 staticperl mkbundle --app myexe --boot appfile
716
717 =item C<--ignore-env>
718
719 Generates extra code to unset some environment variables before
720 initialising/running perl. Perl supports a lot of environment variables
721 that might alter execution in ways that might be undesirable for
722 standalone applications, and this option removes those known to cause
723 trouble.
724
725 Specifically, these are removed:
726
727 C<PERL_HASH_SEED_DEBUG> and C<PERL_DEBUG_MSTATS> can cause undesirable
728 output, C<PERL5OPT>, C<PERL_DESTRUCT_LEVEL>, C<PERL_HASH_SEED> and
729 C<PERL_SIGNALS> can alter execution significantly, and C<PERL_UNICODE>,
730 C<PERLIO_DEBUG> and C<PERLIO> can affect input and output.
731
732 The variables C<PERLLIB> and C<PERL5LIB> are always ignored because the
733 startup code used by F<staticperl> overrides C<@INC> in all cases.
734
735 This option will not make your program more secure (unless you are
736 running with elevated privileges), but it will reduce the surprise effect
737 when a user has these environment variables set and doesn't expect your
738 standalone program to act like a perl interpreter.
739
740 =item C<--static>
741
742 Add C<-static> to F<bundle.ldopts>, which means a fully static (if
743 supported by the OS) executable will be created. This is not immensely
744 useful when just creating the bundle files, but is most useful when
745 linking a binary with the C<--perl> or C<--app> options.
746
747 The default is to link the new binary dynamically (that means all perl
748 modules are linked statically, but all external libraries are still
749 referenced dynamically).
750
751 Keep in mind that Solaris doesn't support static linking at all, and
752 systems based on GNU libc don't really support it in a very usable
753 fashion either. Try uClibc if you want to create fully statically linked
754 executables, or try the C<--staticlib> option to link only some libraries
755 statically.
756
757 =item C<--staticlib> libname
758
759 When not linking fully statically, this option allows you to link specific
760 libraries statically. What it does is simply replace all occurrences of
761 C<-llibname> with the GCC-specific C<-Wl,-Bstatic -llibname -Wl,-Bdynamic>
762 option.
763
764 This will have no effect unless the library is actually linked against,
765 specifically, C<--staticlib> will not link against the named library
766 unless it would be linked against anyway.
767
768 Example: link libcrypt statically into the final binary.
769
770 staticperl mkperl -MIO::AIO --staticlib crypt
771
772 # ldopts might now contain:
773 # -lm -Wl,-Bstatic -lcrypt -Wl,-Bdynamic -lpthread
774
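The substitution described above can be pictured with a small shell sketch. This is not the code F<staticperl> itself uses: the function name C<staticlib_rewrite> is made up for illustration, and it assumes the linker options arrive as separate words.

```shell
# Hypothetical helper: wrap every "-l<lib>" word in -Bstatic/-Bdynamic.
# $1 is the library name, the remaining arguments are linker options.
staticlib_rewrite() {
    lib=$1; shift
    out=
    for opt in "$@"; do
        if [ "$opt" = "-l$lib" ]; then
            out="$out -Wl,-Bstatic $opt -Wl,-Bdynamic"
        else
            out="$out $opt"          # all other options pass through unchanged
        fi
    done
    printf '%s\n' "${out# }"         # drop the leading separator
}

staticlib_rewrite crypt -lm -lcrypt -lpthread
```

When the named library does not occur in the option list, the list comes back unchanged, which mirrors the "no effect unless linked against anyway" rule.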
775 =back
776
777 =back
778
779 =head3 EXTENDED GLOB PATTERNS
780
781 Some options of F<staticperl mkbundle> expect an I<extended glob
782 pattern>. This is neither a normal shell glob nor a regex, but something
783 in between. The idea has been copied from rsync, and these are the current
784 matching rules:
785
786 =over 4
787
788 =item Patterns starting with F</> will be anchored at the root of the library tree.
789
790 That is, F</unicore> will match the F<unicore> directory in C<@INC>, but
791 nothing inside, nor any other file or directory called F<unicore>
792 anywhere else in the hierarchy.
793
794 =item Patterns not starting with F</> will be anchored at the end of the path.
795
796 That is, F<idna.pl> will match any file called F<idna.pl> anywhere in the
797 hierarchy, but not any directories of the same name.
798
799 =item A F<*> matches anything within a single path component.
800
801 That is, F</unicore/*.pl> would match all F<.pl> files directly inside
802 C</unicore>, not any deeper level F<.pl> files. Or in other words, F<*>
803 will not match slashes.
804
805 =item A F<**> matches anything.
806
807 That is, F</unicore/**.pl> would match all F<.pl> files under F</unicore>,
808 no matter how deeply nested they are inside subdirectories.
809
810 =item A F<?> matches a single character within a component.
811
812 That is, F</Encode/??.pm> matches F</Encode/JP.pm>, but not the
813 hypothetical F</Encode/J/.pm>, as F<?> does not match F</>.
814
815 =back
816
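One way to get a feel for these rules is to translate a pattern into an extended regular expression and test it against some paths. The following is only a sketch of the rules as listed above, not the matcher F<staticperl> implements. The helper names C<glob2re> and C<glob_match> are invented, paths are assumed to be written relative to the library root with a leading F</>, C<.> is assumed to be the only other regex metacharacter in the pattern, and the file-versus-directory distinction is ignored.

```shell
# Hypothetical helper: turn an extended glob pattern into an ERE.
glob2re() {
    # 1. escape ".", 2. "?" -> one non-slash char, 3. "*" -> non-slash run,
    # 4. a doubled non-slash run (from "**") -> "match anything"
    body=$(printf '%s\n' "$1" | sed \
        -e 's|\.|\\.|g' \
        -e 's|?|[^/]|g' \
        -e 's|\*|[^/]*|g' \
        -e 's|\[^/]\*\[^/]\*|.*|g')
    case $1 in
        /*) printf '%s\n' "^$body\$" ;;        # anchored at the root
        *)  printf '%s\n' "(^|/)$body\$" ;;    # anchored at the end of the path
    esac
}

# Succeeds if path $2 matches extended glob pattern $1.
glob_match() {
    printf '%s\n' "$2" | grep -Eq -- "$(glob2re "$1")"
}

glob_match '/unicore/**.pl' /unicore/To/Digit.pl && echo "matches"
```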
817 =head2 F<STATICPERL> CONFIGURATION AND HOOKS
818
819 During (each) startup, F<staticperl> tries to source some shell files to
820 allow you to fine-tune/override configuration settings.
821
822 In them you can override shell variables, or define shell functions
823 ("hooks") to be called at specific phases during installation. For
824 example, you could define a C<postinstall> hook to install additional
825 modules from CPAN each time you start from scratch.
826
827 If the env variable C<$STATICPERLRC> is set, then F<staticperl> will try
828 to source the file named with it only. Otherwise, it tries the following
829 shell files in order:
830
831 /etc/staticperlrc
832 ~/.staticperlrc
833 $STATICPERL/rc
834
835 Note that the last file is erased during F<staticperl distclean>, so
836 generally should not be used.
837
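The lookup order can be summarised in a few lines of shell. This is a sketch of the behaviour described above rather than the script's actual startup code: the function name C<load_rc> is invented, and it assumes that every readable file in the list is sourced (not just the first one found).

```shell
# Documented default location of the staticperl directory.
: "${STATICPERL:=$HOME/.staticperl}"

# Hypothetical helper mirroring the documented rc lookup.
load_rc() {
    if [ -n "${STATICPERLRC:-}" ]; then
        # an explicitly named file replaces the default search entirely
        if [ -r "$STATICPERLRC" ]; then . "$STATICPERLRC"; fi
    else
        for rc in /etc/staticperlrc "$HOME/.staticperlrc" "$STATICPERL/rc"; do
            if [ -r "$rc" ]; then . "$rc"; fi
        done
    fi
    return 0
}
```

Under this assumption, later files can override settings made by earlier ones, so a per-user file wins over the system-wide one.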
838 =head3 CONFIGURATION VARIABLES
839
840 =head4 Variables you I<should> override
841
842 =over 4
843
844 =item C<EMAIL>
845
846 The e-mail address of the person who built this binary. Has no good
847 default, so should be specified by you.
848
849 =item C<CPAN>
850
851 The URL of the CPAN mirror to use (e.g. L<http://mirror.netcologne.de/cpan/>).
852
853 =item C<EXTRA_MODULES>
854
855 Additional modules installed during F<staticperl install>. Here you can
856 set which modules you want to have installed from CPAN.
857
858 Example: I really really need EV, AnyEvent, Coro and AnyEvent::AIO.
859
860 EXTRA_MODULES="EV AnyEvent Coro AnyEvent::AIO"
861
862 Note that you can also use a C<postinstall> hook to achieve this, and
863 more.
864
865 =back
866
867 =head4 Variables you might I<want> to override
868
869 =over 4
870
871 =item C<STATICPERL>
872
873 The directory where staticperl stores all its files
874 (default: F<~/.staticperl>).
875
876 =item C<PERL_MM_USE_DEFAULT>, C<EV_EXTRA_DEFS>, ...
877
878 Usually set to C<1> to make modules "less inquisitive" during their
879 installation. You can set any environment variable you want - some modules
880 (such as L<Coro> or L<EV>) use environment variables for further tweaking.
881
882 =item C<PERL_VERSION>
883
884 The perl version to install - default is currently C<5.12.3>, but C<5.8.9>
885 is also a good choice (5.8.9 is much smaller than 5.12.3, while 5.10.1 is
886 about as big as 5.12.3).
887
888 =item C<PERL_PREFIX>
889
890 The prefix where perl gets installed (default: F<$STATICPERL/perl>),
891 i.e. where the F<bin> and F<lib> subdirectories will end up.
892
893 =item C<PERL_CONFIGURE>
894
895 Additional Configure options - these are simply passed to the perl
896 Configure script. For example, if you wanted to enable dynamic loading,
897 you could pass C<-Dusedl>. To enable ithreads (Why would you want that
898 insanity? Don't! Use L<forks> instead!) you would pass C<-Duseithreads>
899 and so on.
900
901 More commonly, you would either activate 64 bit integer support
902 (C<-Duse64bitint>), or disable large files support (C<-Uuselargefiles>), to
903 reduce filesize further.
904
905 =item C<PERL_CC>, C<PERL_CCFLAGS>, C<PERL_OPTIMIZE>, C<PERL_LDFLAGS>, C<PERL_LIBS>
906
907 These flags are passed to perl's F<Configure> script, and are generally
908 optimised for small size (at the cost of performance). Since they also
909 contain subtle workarounds around various build issues, changing these
910 usually requires understanding their default values - best look at
911 the top of the F<staticperl> script for more info on these, and use a
912 F<~/.staticperlrc> to override them.
913
914 Most of the variables override (or modify) the corresponding F<Configure>
915 variable, except C<PERL_CCFLAGS>, which gets appended.
916
917 =back
918
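Put together, a F<~/.staticperlrc> using the variables above might look like the following sketch. All values are placeholders to adapt: the e-mail address is hypothetical, and using C<export> for C<PERL_MM_USE_DEFAULT> rests on the assumption that the variable has to reach the module build processes.

```shell
# Example ~/.staticperlrc - values are illustrative only
EMAIL="you@example.org"                        # hypothetical address
CPAN="http://mirror.netcologne.de/cpan/"       # mirror used as example above
EXTRA_MODULES="EV AnyEvent Coro AnyEvent::AIO"
PERL_VERSION=5.8.9                             # the smaller alternative to 5.12.3

# append to, rather than replace, any Configure options already set
PERL_CONFIGURE="${PERL_CONFIGURE:-} -Duse64bitint"

# assumption: exported so that module installers started later can see it
export PERL_MM_USE_DEFAULT=1
```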
919 =head4 Variables you probably I<do not want> to override
920
921 =over 4
922
923 =item C<MAKE>
924
925 The make command to use - default is C<make>.
926
927 =item C<MKBUNDLE>
928
929 Where F<staticperl> writes the C<mkbundle> command to
930 (default: F<$STATICPERL/mkbundle>).
931
932 =item C<STATICPERL_MODULES>
933
934 Additional modules needed by C<mkbundle> - should therefore not be changed
935 unless you know what you are doing.
936
937 =back
938
939 =head3 OVERRIDABLE HOOKS
940
941 In addition to environment variables, it is possible to provide some
942 shell functions that are called at specific times. To provide your own
943 commands, just define the corresponding function.
944
945 The actual order in which hooks are invoked during a full install
946 from scratch is C<preconfigure>, C<patchconfig>, C<postconfigure>,
947 C<postbuild>, C<postinstall>.
948
949 Example: install extra modules from CPAN and from some directories
950 at F<staticperl install> time.
951
952 postinstall() {
953 rm -rf lib/threads* # weg mit Schaden
954 instcpan IO::AIO EV
955 instsrc ~/src/AnyEvent
956 instsrc ~/src/XML-Sablotron-1.0100001
957 instcpan AnyEvent::AIO AnyEvent::HTTPD
958 }
959
960 =over 4
961
962 =item preconfigure
963
964 Called just before running F<./Configure> in the perl source
965 directory. Current working directory is the perl source directory.
966
967 This can be used to set any C<PERL_xxx> variables, which might be costly
968 to compute.
969
970 =item patchconfig
971
972 Called after running F<./Configure> in the perl source directory to create
973 F<./config.sh>, but before running F<./Configure -S> to actually apply the
974 config. Current working directory is the perl source directory.
975
976 Can be used to tailor/patch F<config.sh> or do any other modifications.
977
978 =item postconfigure
979
980 Called after configuring, but before building perl. Current working
981 directory is the perl source directory.
982
983 =item postbuild
984
985 Called after building, but before installing perl. Current working
986 directory is the perl source directory.
987
988 I have no clue what this could be used for - tell me.
989
990 =item postinstall
991
992 Called after perl and any extra modules have been installed in C<$PREFIX>,
993 but before setting the "installation O.K." flag.
994
995 The current working directory is C<$PREFIX>, but maybe you should not rely
996 on that.
997
998 This hook is most useful to customise the installation, by deleting files,
999 or installing extra modules using the C<instcpan> or C<instsrc> functions.
1000
1001 The script must return with a zero exit status, or the installation will
1002 fail.
1003
1004 =back
1005
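The hook order and the exit status requirement can be tried out with a small simulation. It is a sketch under assumptions, not an excerpt from F<staticperl>: the no-op defaults and the C<run_hooks> function are invented, the real fetch/configure/build steps that run between the hooks are left out, and treating a non-zero status from I<any> hook as fatal is generalised from what is documented for C<postinstall>.

```shell
# Assumed defaults: every hook does nothing unless an rc file redefines it.
preconfigure()  { :; }
patchconfig()   { :; }
postconfigure() { :; }
postbuild()     { :; }
postinstall()   { :; }

# An rc file could then override any of them, for example:
postbuild() { echo "perl was built in $PWD"; }

# Call the hooks in the documented order and stop at the first failure.
run_hooks() {
    for hook in preconfigure patchconfig postconfigure postbuild postinstall; do
        echo "hook: $hook"
        "$hook" || { echo "hook $hook failed" >&2; return 1; }
    done
}

run_hooks
```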
1006 =head1 ANATOMY OF A BUNDLE
1007
1008 When not building a new perl binary, C<mkbundle> will leave a number of
1009 files in the current working directory, which can be used to embed a perl
1010 interpreter in your program.
1011
1012 Intimate knowledge of L<perlembed> and preferably some experience with
1013 embedding perl is highly recommended.
1014
1015 C<mkperl> (or the C<--perl> option) basically does this to link the new
1016 interpreter (it also adds a main program to F<bundle.>):
1017
1018 $Config{cc} $(cat bundle.ccopts) -o perl bundle.c $(cat bundle.ldopts)
1019
1020 =over 4
1021
1022 =item bundle.h
1023
1024 A header file that contains the prototypes of the few symbols "exported"
1025 by bundle.c, and also exposes the perl headers to the application.
1026
1027 =over 4
1028
1029 =item staticperl_init (xs_init = 0)
1030
1031 Initialises the perl interpreter. You can use the normal perl functions
1032 after calling this function, for example, to define extra functions or
1033 to load a .pm file that contains some initialisation code, or the main
1034 program function:
1035
1036 XS (xsfunction)
1037 {
1038 dXSARGS;
1039
1040 // now we have items, ST(i) etc.
1041 }
1042
1043 static void
1044 run_myapp(void)
1045 {
1046 staticperl_init (0);
1047 newXSproto ("myapp::xsfunction", xsfunction, __FILE__, "$$;$");
1048 eval_pv ("require myapp::main", 1); // executes "myapp/main.pm"
1049 }
1050
1051 When your bootcode already wants to access some XS functions at
1052 compiletime, then you need to supply an C<xs_init> function pointer that
1053 is called as soon as perl is initialised enough to define XS functions,
1054 but before the preamble code is executed:
1055
1056 static void
1057 xs_init (pTHX)
1058 {
1059 newXSproto ("myapp::xsfunction", xsfunction, __FILE__, "$$;$");
1060 }
1061
1062 static void
1063 run_myapp(void)
1064 {
1065 staticperl_init (xs_init);
1066 }
1067
1068 =item staticperl_cleanup ()
1069
1070 In the unlikely case that you want to destroy the perl interpreter, here
1071 is the corresponding function.
1072
1073 =item staticperl_xs_init (pTHX)
1074
1075 Sometimes you need direct control over C<perl_parse> and C<perl_run>, in
1076 which case you do not want to use C<staticperl_init> but call them on your
1077 own.
1078
1079 Then you need this function - either pass it directly as the C<xs_init>
1080 function to C<perl_parse>, or call it as one of the first things from your
1081 own C<xs_init> function.
1082
1083 =item PerlInterpreter *staticperl
1084
1085 The perl interpreter pointer used by staticperl. Not normally so useful,
1086 but there it is.
1087
1088 =back
1089
1090 =item bundle.ccopts
1091
1092 Contains the compiler options required to compile at least F<bundle.c> and
1093 any file that includes F<bundle.h> - you should probably use it in your
1094 C<CFLAGS>.
1095
1096 =item bundle.ldopts
1097
1098 The linker options needed to link the final program.
1099
1100 =back
1101
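For your own embedding program, the three files are typically combined in one compiler invocation modelled on the C<mkperl> command shown earlier. The sketch below only prints such a command instead of running it. The function name C<embed_cmd> and the file names in the usage comment are hypothetical, and C<${CC:-cc}> stands in for the compiler perl was configured with (C<$Config{cc}>).

```shell
# Print the compile-and-link command for a program embedding the bundle.
# $1: directory containing bundle.c, bundle.ccopts and bundle.ldopts
# $2: name of the output binary, $3: your own C source file
embed_cmd() {
    dir=$1 out=$2 src=$3
    printf '%s %s -o %s %s %s %s\n' \
        "${CC:-cc}" "$(cat "$dir/bundle.ccopts")" \
        "$out" "$src" "$dir/bundle.c" "$(cat "$dir/bundle.ldopts")"
}

# Usage (hypothetical file names):
#   eval "$(embed_cmd . myapp myapp.c)"
```

Keeping the linker options last follows the order of the C<mkperl> command, where libraries come after the files that reference them.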
1102 =head1 RUNTIME FUNCTIONALITY
1103
1104 Binaries created with C<mkbundle>/C<mkperl> contain extra functions, which
1105 are required to access the bundled perl sources, but might be useful for
1106 other purposes.
1107
1108 In addition, for the embedded loading of perl files to work, F<staticperl>
1109 overrides the C<@INC> array.
1110
1111 =over 4
1112
1113 =item $file = staticperl::find $path
1114
1115 Returns the data associated with the given C<$path>
1116 (e.g. C<Digest/MD5.pm>, C<auto/POSIX/autosplit.ix>), which is basically
1117 the UNIX path relative to the perl library directory.
1118
1119 Returns C<undef> if the file isn't embedded.
1120
1121 =item @paths = staticperl::list
1122
1123 Returns the list of all paths embedded in this binary.
1124
1125 =back
1126
1127 =head1 FULLY STATIC BINARIES - UCLIBC AND BUILDROOT
1128
1129 To make truly static (Linux-) binaries, you might want to have a look at
1130 buildroot (L<http://buildroot.uclibc.org/>).
1131
1132 Buildroot is primarily meant to set up a cross-compile environment (which
1133 is not so useful as perl doesn't quite like cross compiles), but it can also compile
1134 a chroot environment where you can use F<staticperl>.
1135
1136 To do so, download buildroot, and enable "Build options => development
1137 files in target filesystem" and optionally "Build options => gcc
1138 optimization level (optimize for size)". At the time of writing, I had
1139 good experiences with GCC 4.4.x but not GCC 4.5.
1140
1141 To minimise code size, I used C<-pipe -ffunction-sections -fdata-sections
1142 -finline-limit=8 -fno-builtin-strlen -mtune=i386>. The C<-mtune=i386>
1143 doesn't decrease codesize much, but it makes the file much more
1144 compressible.
1145
1146 If you don't need Coro or threads, you can go with "linuxthreads.old" (or
1147 no thread support). For Coro, it is highly recommended to switch to a
1148 uClibc newer than 0.9.31 (at the time of this writing, I used the 20101201
1149 snapshot) and enable NPTL, otherwise Coro needs to be configured with the
1150 ultra-slow pthreads backend to work around linuxthreads bugs (it also uses
1151 twice the address space needed for stacks).
1152
1153 If you use C<linuxthreads.old>, then you should also be aware that
1154 uClibc shares C<errno> between all threads when statically linking. See
1155 L<http://lists.uclibc.org/pipermail/uclibc/2010-June/044157.html> for a
1156 workaround (and L<https://bugs.uclibc.org/2089> for discussion).
1157
1158 C<ccache> support is also recommended, especially if you want
1159 to play around with buildroot options. Enabling the C<miniperl>
1160 package will probably enable all options required for a successful
1161 perl build. F<staticperl> itself additionally needs either C<wget>
1162 (recommended, for CPAN) or C<curl>.
1163
1164 As for shells, busybox should provide all that is needed, but the default
1165 busybox configuration doesn't include F<comm> which is needed by perl -
1166 either make a custom busybox config, or compile coreutils.
1167
1168 For the latter route, you might find that bash has some bugs that keep
1169 it from working properly in a chroot - either use dash (and link it to
1170 F</bin/sh> inside the chroot) or link busybox to F</bin/sh>, using its
1171 built-in ash shell.
1172
1173 Finally, you need F</dev/null> inside the chroot for many scripts to work
1174 - F<cp /dev/null output/target/dev> or bind-mounting your F</dev> will
1175 both provide this.
1176
1177 After you have compiled and set up your buildroot target, you can copy
1178 F<staticperl> from the C<App::Staticperl> distribution or from your
1179 perl F<bin> directory (if you installed it) into the F<output/target>
1180 filesystem, chroot inside and run it.
1181
1182 =head1 RECIPES / SPECIFIC MODULES
1183
1184 This section contains some common(?) recipes and information about
1185 problems with some common modules or perl constructs that require extra
1186 files to be included.
1187
1188 =head2 MODULES
1189
1190 =over 4
1191
1192 =item utf8
1193
1194 Some functionality in the utf8 module, such as swash handling (used
1195 for unicode character ranges in regexes) is implemented in the
1196 C<"utf8_heavy.pl"> library:
1197
1198 -Mutf8_heavy.pl
1199
1200 Many Unicode properties in turn are defined in separate modules,
1201 such as C<"unicore/Heavy.pl"> and more specific data tables such as
1202 C<"unicore/To/Digit.pl"> or C<"unicore/lib/Perl/Word.pl">. These tables
1203 are big (7MB uncompressed, although F<staticperl> contains special
1204 handling for those files), so including only the tables your application
1205 actually needs might pay off.
1206
1207 To simply include the whole unicode database, use:
1208
1209 --incglob '/unicore/**.pl'
1210
1211 =item AnyEvent
1212
1213 AnyEvent needs a backend implementation that it will load in a delayed
1214 fashion. The L<AnyEvent::Impl::Perl> backend is the default choice
1215 for AnyEvent if it can't find anything else, and is usually a safe
1216 fallback. If you plan to use e.g. L<EV> (L<POE>...), then you need to
1217 include the L<AnyEvent::Impl::EV> (L<AnyEvent::Impl::POE>...) backend as
1218 well.
1219
1220 If you want to handle IRIs or IDNs (L<AnyEvent::Util> punycode and idn
1221 functions), you also need to include C<"AnyEvent/Util/idna.pl"> and
1222 C<"AnyEvent/Util/uts46data.pl">.
1223
1224 Or you can use C<--usepacklists> and specify C<-MAnyEvent> to include
1225 everything.
1226
1227 =item Carp
1228
1229 Carp had (in older versions of perl) a dependency on L<Carp::Heavy>. As of
1230 perl 5.12.2 (maybe earlier), this dependency no longer exists.
1231
1232 =item Config
1233
1234 The F<perl -V> switch (as well as many modules) needs L<Config>, which in
1235 turn might need C<"Config_heavy.pl">. Including the latter gives you
1236 both.
1237
1238 =item Term::ReadLine::Perl
1239
1240 Also needs L<Term::ReadLine::readline>, or C<--usepacklists>.
1241
1242 =item URI
1243
1244 URI implements schemes as separate modules - the generic URL scheme is
1245 implemented in L<URI::_generic>, HTTP is implemented in L<URI::http>. If
1246 you need to use any of these schemes, you should include these manually,
1247 or use C<--usepacklists>.
1248
1249 =back
1250
1251 =head2 RECIPES
1252
1253 =over 4
1254
1255 =item Just link everything in
1256
1257 To link just about everything installed in the perl library into a new
1258 perl, try this (the first time this runs it will take a long time, as a
1259 lot of files need to be parsed):
1260
1261 staticperl mkperl -v --strip ppi --incglob '*'
1262
1263 If you don't mind the extra megabytes, this can be a very effective way of
1264 creating bundles without having to worry about forgetting any modules.
1265
1266 You get even more useful variants of this method by first selecting
1267 everything, and then excluding stuff you are reasonably sure not to need -
1268 L<bigperl|http://staticperl.schmorp.de/bigperl.html> uses this approach.
1269
1270 =item Getting rid of netdb functions
1271
1272 The perl core has lots of netdb functions (C<getnetbyname>, C<getgrent>
1273 and so on) that few applications use. You can avoid compiling them in by
1274 putting the following fragment into a C<preconfigure> hook:
1275
1276 preconfigure() {
1277 for sym in \
1278 d_getgrnam_r d_endgrent d_endgrent_r d_endhent \
1279 d_endhostent_r d_endnent d_endnetent_r d_endpent \
1280 d_endprotoent_r d_endpwent d_endpwent_r d_endsent \
1281 d_endservent_r d_getgrent d_getgrent_r d_getgrgid_r \
1282 d_getgrnam_r d_gethbyaddr d_gethent d_getsbyport \
1283 d_gethostbyaddr_r d_gethostbyname_r d_gethostent_r \
1284 d_getlogin_r d_getnbyaddr d_getnbyname d_getnent \
1285 d_getnetbyaddr_r d_getnetbyname_r d_getnetent_r \
1286 d_getpent d_getpbyname d_getpbynumber d_getprotobyname_r \
1287 d_getprotobynumber_r d_getprotoent_r d_getpwent \
1288 d_getpwent_r d_getpwnam_r d_getpwuid_r d_getsent \
1289 d_getservbyname_r d_getservbyport_r d_getservent_r \
1290 d_getspnam_r d_getsbyname
1291 # d_gethbyname
1292 do
1293 PERL_CONFIGURE="$PERL_CONFIGURE -U$sym"
1294 done
1295 }
1296
1297 This mostly gains space when linking statically, as the functions will
1298 likely not be linked in. The gain for dynamically-linked binaries is
1299 smaller.
1300
1301 Also, this leaves C<gethostbyname> in - not only is it actually used
1302 often, the L<Socket> module also exposes it, so leaving it out usually
1303 gains little. Why Socket exposes a C function that is in the core already
1304 is anybody's guess.
1305
1306 =back
1307
1308 =head1 AUTHOR
1309
1310 Marc Lehmann <schmorp@schmorp.de>
1311 http://software.schmorp.de/pkg/staticperl.html