--- App-Staticperl/staticperl.pod 2010/12/08 23:03:21 1.16
+++ App-Staticperl/staticperl.pod 2010/12/10 02:35:54 1.18
@@ -69,17 +69,21 @@
 F loads all required files directly from memory. There is no
 need to unpack files into a temporary directory.
 
-=item * More control over included files.
+=item * More control over included files, more burden.
 
 PAR tries to be maintenance and hassle-free - it tries to include more
-files than necessary to make sure everything works out of the box. The
-extra files (such as the unicode database) can take substantial amounts of
-memory and file size.
+files than necessary to make sure everything works out of the box. It
+mostly succeeds at this, but the extra files (such as the unicode database)
+can take substantial amounts of memory and file size.
 
 With F, the burden is mostly with the developer - only direct
 compile-time dependencies and L are handled automatically.
 This means the modules to include often need to be tweaked manually.
 
+All this does not preclude more permissive modes from being implemented in
+the future, but right now, you have to resolve hidden dependencies
+manually.
+
 =item * PAR works out of the box, F does not.
 
 Maintaining your own custom perl build can be a pain in the ass, and while
@@ -109,9 +113,9 @@
 except everything is compiled in), or you create bundle files (basically C
 sources you can use to embed all files into your project).
 
-This step is very fast (a few seconds if PPI is not used for stripping,
-more seconds otherwise, as PPI is very slow), and can be tweaked and
-repeated as often as necessary.
+This step is very fast (a few seconds if PPI is not used for stripping, or
+the stripped files are in the cache), and can be tweaked and repeated as
+often as necessary.
 
 =head1 THE F SCRIPT
 
@@ -305,11 +309,12 @@
 pod documentation, which is very fast and reduces file size a lot.
 
 The C method uses L to parse and condense the perl sources.
 This
-saves a lot more than just L, and is generally safer, but
-is also a lot slower, so is best used for production builds. Note that
-this method doesn't optimise for raw file size, but for best compression
-(that means that the uncompressed file size is a bit larger, but the files
-compress better, e.g. with F).
+saves a lot more than just L, and is generally safer,
+but is also a lot slower (some files take almost a minute to strip -
+F<staticperl> maintains a cache of stripped files to speed up subsequent
+runs for this reason). Note that this method doesn't optimise for raw file
+size, but for best compression (that means that the uncompressed file size
+is a bit larger, but the files compress better, e.g. with F).
 
 Last not least, if you need accurate line numbers in error messages, or in
 the unlikely case where C is too slow, or some module gets
@@ -413,7 +418,24 @@
 the perl interpreter executes scripts given on the command line (or via
 C<-e>). This works even in an embedded interpreter.
 
-=item --add "file" | --add "file alias"
+=item --incglob pattern
+
+This goes through all library directories and tries to match any F<.pm>
+and F<.pl> files against the extended glob pattern (see below). If a file
+matches, it is added. This switch will automatically detect L
+files and the required link libraries for XS modules, but it will I<not>
+scan the file for dependencies (at the moment).
+
+This is mainly useful to include "everything":
+
+   --incglob '*'
+
+Or to include perl libraries, or trees of those, such as the unicode
+database files needed by many other modules:
+
+   --incglob '/unicore/**.pl'
+
+=item --add file | --add "file alias"
 
 Adds the given (perl) file into the bundle (and optionally call it
 "alias"). This is useful to include any custom files into the bundle.
@@ -429,7 +451,7 @@
    add file2 myfiles/file2
    add file3 myfiles/file3
 
-=item --binadd "file" | --add "file alias"
+=item --binadd file | --binadd "file alias"
 
 Just like C<--add>, except that it treats the file as binary and adds it
 without any processing.
@@ -441,6 +463,19 @@
 
 You can later get a copy of these files by calling C.
 
+=item --include pattern | -i pattern | --exclude pattern | -x pattern
+
+These two options define an include/exclude filter that is used after all
+files selected by the other options have been found. Each include/exclude
+is applied to all files found so far - an include makes sure that the
+given files will be part of the resulting file set, an exclude will
+exclude files. The patterns are "extended glob patterns" (see below).
+
+For example, to include everything, except C modules, but still
+include F, you could use this:
+
+   --incglob '*' -i '/Devel/PPPort.pm' -x '/Devel/**'
+
 =item --static
 
 When C<--perl> is also given, link statically instead of dynamically. The
@@ -454,6 +489,24 @@
 executables, or try the C<--staticlibs> option to link only some libraries
 statically.
 
+=item --staticlib libname
+
+When not linking fully statically, this option allows you to link specific
+libraries statically. What it does is simply replace all occurrences of
+C<-llibname> with the GCC-specific C<-Wl,-Bstatic -llibname -Wl,-Bdynamic>
+option.
+
+This will have no effect unless the library is actually linked against,
+specifically, C<--staticlib> will not link against the named library
+unless it would be linked against anyway.
+
+Example: link libcrypt statically into the binary.
+
+   staticperl mkperl -MIO::AIO --staticlib crypt
+
+   # ldopts might now contain:
+   # -lm -Wl,-Bstatic -lcrypt -Wl,-Bdynamic -lpthread
+
 =item any other argument
 
 Any other argument is interpreted as a bundle specification file, which
@@ -461,6 +514,44 @@
 
 =back
 
+=head3 EXTENDED GLOB PATTERNS
+
+Some options of F<staticperl> expect an I<extended glob pattern>.
+This is neither a normal shell glob nor a regex, but something in
+between. The idea has been copied from rsync, and these are the current
+matching rules:
+
+=over 4
+
+=item Patterns starting with F</> will be anchored at the root of the library tree.
+
+That is, F will match the F directory in C<@INC>, but
+nothing inside, nor any other file or directory called F
+anywhere else in the hierarchy.
+
+=item Patterns not starting with F</> will be anchored at the end of the path.
+
+That is, F will match any file called F anywhere in the
+hierarchy, but not any directories of the same name.
+
+=item A F<*> matches any single component.
+
+That is, F would match all F<.pl> files directly inside
+C, not any deeper level F<.pl> files. Or in other words, F<*>
+will not match slashes.
+
+=item A F<**> matches anything.
+
+That is, F would match all F<.pl> files under F,
+no matter how deeply nested they are inside subdirectories.
+
+=item A F<?> matches a single character within a component.
+
+That is, F matches F, but not the
+hypothetical F, as F<?> does not match F</>.
+
+=back
+
 =head2 F CONFIGURATION AND HOOKS
 
 During (each) startup, F tries to source the following shell
@@ -785,6 +876,120 @@
 perl f directory (if you installed it) into the F filesystem, chroot inside
 and run it.
 
+=head1 RECIPES / SPECIFIC MODULES
+
+This section contains some common(?) recipes and information about
+problems with some common modules or perl constructs that require extra
+files to be included.
+
+=head2 MODULES
+
+=over 4
+
+=item utf8
+
+Some functionality in the utf8 module, such as swash handling (used
+for unicode character ranges in regexes) is implemented in the
+C<"utf8_heavy.pl"> library:
+
+   -M'"utf8_heavy.pl"'
+
+Many Unicode properties in turn are defined in separate modules,
+such as C<"unicore/Heavy.pl"> and more specific data tables such as
+C<"unicore/To/Digit.pl"> or C<"unicore/lib/Perl/Word.pl">.
+These tables are big (7MB uncompressed, although F<staticperl> contains
+special handling for those files), so including only the ones your
+application actually needs might pay off.
+
+To simply include the whole unicode database, use:
+
+   --incglob '/unicore/**.pl'
+
+=item AnyEvent
+
+AnyEvent needs a backend implementation that it will load in a delayed
+fashion. The L backend is the default choice
+for AnyEvent if it can't find anything else, and is usually a safe
+fallback. If you plan to use e.g. L (L...), then you need to
+include the L (L...) backend as
+well.
+
+If you want to handle IRIs or IDNs (L punycode and idn
+functions), you also need to include C<"AnyEvent/Util/idna.pl"> and
+C<"AnyEvent/Util/uts46data.pl">.
+
+=item Carp
+
+Carp had (in older versions of perl) a dependency on L. As of
+perl 5.12.2 (maybe earlier), this dependency no longer exists.
+
+=item Config
+
+The F switch (as well as many modules) needs L, which in
+turn might need L<"Config_heavy.pl">. Including the latter gives you
+both.
+
+=item Term::ReadLine::Perl
+
+Also needs L.
+
+=item URI
+
+URI implements schemes as separate modules - the generic URL scheme is
+implemented in L, HTTP is implemented in L. If
+you need to use any of these schemes, you should include these manually.
+
+=back
+
+=head2 RECIPES
+
+=over 4
+
+=item Linking everything in
+
+To link just about everything installed in the perl library into a new
+perl, try this:
+
+   staticperl mkperl --strip ppi --incglob '*'
+
+=item Getting rid of netdb functions
+
+The perl core has lots of netdb functions (C, C
+and so on) that few applications use.
+You can avoid compiling them in by putting the following fragment into
+a C<preconfigure> hook:
+
+   preconfigure() {
+      for sym in \
+         d_getgrnam_r d_endgrent d_endgrent_r d_endhent \
+         d_endhostent_r d_endnent d_endnetent_r d_endpent \
+         d_endprotoent_r d_endpwent d_endpwent_r d_endsent \
+         d_endservent_r d_getgrent d_getgrent_r d_getgrgid_r \
+         d_getgrnam_r d_gethbyaddr d_gethent d_getsbyport \
+         d_gethostbyaddr_r d_gethostbyname_r d_gethostent_r \
+         d_getlogin_r d_getnbyaddr d_getnbyname d_getnent \
+         d_getnetbyaddr_r d_getnetbyname_r d_getnetent_r \
+         d_getpent d_getpbyname d_getpbynumber d_getprotobyname_r \
+         d_getprotobynumber_r d_getprotoent_r d_getpwent \
+         d_getpwent_r d_getpwnam_r d_getpwuid_r d_getsent \
+         d_getservbyname_r d_getservbyport_r d_getservent_r \
+         d_getspnam_r d_getsbyname
+         # d_gethbyname
+      do
+         PERL_CONFIGURE="$PERL_CONFIGURE -U$sym"
+      done
+   }
+
+This mostly gains space when linking statically, as the functions will
+likely not be linked in. The gain for dynamically-linked binaries is
+smaller.
+
+Also, this leaves C<gethostbyname> in - not only is it actually used
+often, the L<Socket> module also exposes it, so leaving it out usually
+gains little. Why Socket exposes a C function that is in the core already
+is anybody's guess.
+
+=back
+
 =head1 AUTHOR
 
 Marc Lehmann