=head1 NAME

staticperl - perl, libc, 100 modules, all in one standalone 500kb file

=head1 SYNOPSIS

   staticperl help      # print the embedded documentation
   staticperl fetch     # fetch and unpack perl sources
   staticperl configure # fetch and then configure perl
   staticperl build     # configure and then build perl
   staticperl install   # build and then install perl
   staticperl clean     # clean most intermediate files (restart at configure)
   staticperl distclean # delete everything installed by this script
   staticperl perl ...  # invoke the perl interpreter
   staticperl cpan      # invoke CPAN shell
   staticperl instsrc path...        # install unpacked modules
   staticperl instcpan modulename... # install modules from CPAN
   staticperl mkbundle               # see documentation
   staticperl mkperl                 # see documentation
   staticperl mkapp appname          # see documentation

Typical Examples:

   staticperl install   # fetch, configure, build and install perl
   staticperl cpan      # run interactive cpan shell
   staticperl mkperl -MConfig_heavy.pl # build a perl that supports -V
   staticperl mkperl -MAnyEvent::Impl::Perl -MAnyEvent::HTTPD -MURI -MURI::http
                        # build a perl with the above modules linked in
   staticperl mkapp myapp --boot mainprog mymodules
                        # build a binary "myapp" from mainprog and mymodules

=head1 DESCRIPTION

This script helps you to create single-file perl interpreters or
applications, or to embed a perl interpreter in your applications.
Single-file means that it is fully self-contained - no separate shared
objects, no autoload fragments, no .pm or .pl files are needed. And when
linking statically, you can create (or embed) a single file that contains
the perl interpreter, libc, all the modules you need, all the libraries you
need and of course your actual program.

With F and F on x86, you can create a single 500kb binary that contains
perl and 100 modules such as POSIX, AnyEvent, EV, IO::AIO, Coro and so
on. Or any other choice of modules (and some other size :).

To see how this turns out, you can try out smallperl and bigperl, two
pre-built static and compressed perl binaries with many and even more
modules: just follow the links at L.

The created files do not need write access to the file system (unlike
PAR). In fact, since this script is in many ways similar to PAR::Packer,
here are the differences:

=over 4

=item * The generated executables are much smaller than PAR created ones.

Shared objects and the perl binary contain a lot of extra info, while
the static nature of F allows the linker to remove all functionality and
meta-info not required by the final executable. Even extensions
statically compiled into perl at build time will only be present in the
final executable when needed.

In addition, F can strip perl sources much more effectively than PAR.

=item * The generated executables start much faster.

There is no need to unpack files, or even to parse Zip archives (which
is slow and memory-consuming business).

=item * The generated executables don't need a writable filesystem.

F loads all required files directly from memory. There is no need to
unpack files into a temporary directory.

=item * More control over included files, more burden.

PAR tries to be maintenance and hassle-free - it tries to include more
files than necessary to make sure everything works out of the box. It
mostly succeeds at this, but the extra files (such as the unicode
database) can take substantial amounts of memory and file size.

With F, the burden is mostly with the developer - only direct
compile-time dependencies and L are handled automatically. This means
the modules to include often need to be tweaked manually.

All this does not preclude more permissive modes from being implemented
in the future, but right now, you have to resolve hidden dependencies
manually.

=item * PAR works out of the box, F does not.

Maintaining your own custom perl build can be a pain in the ass, and
while F tries to make this easy, it still requires a custom perl build
and possibly fiddling with some modules. PAR is likely to produce
results faster.

Ok, PAR has never worked for me out of the box, and for some people, F
does work out of the box, as they don't count "fiddling with module use
lists" against it, but nevertheless, F is certainly a bit more difficult
to use.

=back

=head1 HOW DOES IT WORK?

Simple: F downloads, compiles and installs a perl version of your choice
in F<~/.staticperl>. You can add extra modules either by letting F
install them for you automatically, or by using CPAN and doing it
interactively. This usually takes 5-10 minutes, depending on the speed
of your computer and your internet connection.

It is possible to do program development at this stage, too.

Afterwards, you create a list of files and modules you want to include,
and then either build a new perl binary (that acts just like a normal
perl except everything is compiled in), or you create bundle files
(basically C sources you can use to embed all files into your project).

This step is very fast (a few seconds if PPI is not used for stripping,
or the stripped files are in the cache), and can be tweaked and repeated
as often as necessary.

=head1 THE F SCRIPT

This module installs a script called F into your perl binary directory.
The script is fully self-contained, and can be used without perl (for
example, in an uClibc chroot environment). In fact, it can be extracted
from the C distribution tarball as F, without any installation. The
newest (possibly alpha) version can also be downloaded from L.

F interprets the first argument as a command to execute, optionally
followed by any parameters.

There are two command categories: the "phase 1" commands which deal with
installing perl and perl modules, and the "phase 2" commands, which deal
with creating binaries and bundle files.

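The split into "phase 1" and "phase 2" commands can be illustrated with
a small shell helper. This is only a sketch: the function name
staticperl_phase is hypothetical and not part of the script, the
subcommand names are the ones listed in the SYNOPSIS, and anything else
(such as help) is simply reported as "other".

```shell
#!/bin/sh
# Hypothetical helper, NOT part of staticperl itself: maps a subcommand
# name (as listed in the SYNOPSIS) to the phase it belongs to.
staticperl_phase() {
   case "$1" in
      # phase 1: installing perl and perl modules
      fetch|configure|build|install|clean|distclean|perl|cpan|instsrc|instcpan)
         echo "phase 1" ;;
      # phase 2: creating binaries and bundle files
      mkbundle|mkperl|mkapp)
         echo "phase 2" ;;
      # anything else (e.g. help) belongs to neither category
      *)
         echo "other" ;;
   esac
}

# Print the phase of every subcommand name given on the command line.
for cmd in "$@"; do
   printf '%s: %s\n' "$cmd" "$(staticperl_phase "$cmd")"
done
```

Saved as a script and called with a few subcommand names as arguments,
it prints one line per name. A typical session runs the phase 1 commands
once and then repeats the phase 2 commands as often as needed.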
=head2 PHASE 1 COMMANDS: INSTALLING PERL

The most important command is F, which does basically everything. The
default is to download and install perl 5.12.3 and a few modules
required by F itself, but all this can (and should) be changed - see L,
below.

The command

   staticperl install

is normally all you need: It installs the perl interpreter in
F<~/.staticperl/perl>. It downloads, configures, builds and installs the
perl interpreter if required.

Most of the following F subcommands simply run one or more steps of this
sequence.

If it fails, this is most commonly because the compiler options I
selected are not supported by your compiler - either edit the F script
yourself or create a F<~/.staticperl> shell script where you set working
C etc. variables.

To force recompilation or reinstallation, you need to run F first.

=over 4

=item F

Prints some info about the version of the F script you are using.

=item F

Runs only the download and unpack phase, unless this has already
happened.

=item F

Configures the unpacked perl sources, potentially after downloading them
first.

=item F

Builds the configured perl sources, potentially after automatically
configuring them.

=item F

Wipes the perl installation directory (usually F<~/.staticperl/perl>)
and installs the perl distribution, potentially after building it first.

=item F [args...]

Invokes the compiled perl interpreter with the given args. Basically the
same as starting perl directly (usually via F<~/.staticperl/bin/perl>),
but beats typing the path sometimes.

Example: check that the Gtk2 module is installed and loadable.

   staticperl perl -MGtk2 -e0

=item F [args...]

Starts an interactive CPAN shell that you can use to install further
modules. Installs the perl first if necessary, but apart from that, no
magic is involved: you could just as well run it manually via
F<~/.staticperl/perl/bin/cpan>, except that F additionally sets the
environment variable C<$PERL> to the path of the perl interpreter, which
is handy in subshells.

Any additional arguments are simply passed to the F command.

=item F module...

Tries to install all the modules given and their dependencies, using
CPAN.

Example:

   staticperl instcpan EV AnyEvent::HTTPD Coro

=item F directory...

In the unlikely case that you have unpacked perl modules around and want
to install from these instead of from CPAN, you can do this using this
command by specifying all the directories with modules in them that you
want to have built.

=item F

Deletes the perl source directory (and potentially cleans up other
intermediate files). This can be used to clean up files only needed for
building perl, without removing the installed perl interpreter.

At the moment, it doesn't delete downloaded tarballs.

The exact semantics of this command will probably change.

=item F

This wipes your complete F<~/.staticperl> directory. Be careful with
this, it nukes your perl download, perl sources, perl distribution and
any installed modules. It is useful if you wish to start over "from
scratch" or when you want to uninstall F.

=back

=head2 PHASE 2 COMMANDS: BUILDING PERL BUNDLES

Building (linking) a new F binary is handled by a separate script. To
make it easy to use F from a F, the script is embedded into F, which
will write it out and call it for you with any arguments you pass:

   staticperl mkbundle mkbundle-args...

In the oh so unlikely case of something not working here, you can run
the script manually as well (by default it is written to
F<~/.staticperl/mkbundle>).

F is a more conventional command and expects the argument syntax
commonly used on UNIX clones.

For example, this command builds a new F binary and includes F (for F),
F, F and a custom F script (from F in this distribution):

   # first make sure we have perl and the required modules
   staticperl instcpan AnyEvent::HTTPD

   # now build the perl
   staticperl mkperl -MConfig_heavy.pl -MAnyEvent::Impl::Perl \
                     -MAnyEvent::HTTPD -MURI::http \
                     --add 'eg/httpd httpd.pm'

   # finally, invoke it
   ./perl -Mhttpd

As you can see, things are not quite as trivial: the L module has a
hidden dependency which is not even a perl module (F), L needs at least
one event loop backend that we have to specify manually (here L), and
the F module (required by L) implements various URI schemes as extra
modules - since L only needs C URIs, we only need to include that
module. I found out about these dependencies by carefully watching any
error messages about missing modules...

Instead of building a new perl binary, you can also build a standalone
application:

   # build the app
   staticperl mkapp app --boot eg/httpd \
                    -MAnyEvent::Impl::Perl -MAnyEvent::HTTPD -MURI::http

   # run it
   ./app

Here are the three phase 2 commands:

=over 4

=item F args...

The "default" bundle command - it interprets the given bundle options
and writes out F, F, F and F files, useful for embedding.

=item F args...

Creates a bundle just like F (in fact, it's the same as invoking F
args...), but then compiles and links a new perl interpreter that embeds
the created bundle, then deletes all intermediate files.

=item F filename args...

Does the same as F (in fact, it's the same as invoking F filename
args...), but then compiles and links a new standalone application that
simply initialises the perl interpreter.

The difference to F is that the standalone application does not act like
a perl interpreter would - in fact, by default it would just do nothing
and exit immediately, so you should specify some code to be executed via
the F<--boot> option.

=back

=head3 OPTION PROCESSING

All options can be given as arguments on the command line (typically
using long (e.g. C<--verbose>) or short option (e.g. C<-v>) style).
Since specifying a lot of options can make the command line very long
and unwieldy, you can put all long options into a "bundle specification
file" (one option per line, with or without C<--> prefix) and specify
this bundle file instead.

For example, the command given earlier to link a new F could also look
like this:

   staticperl mkperl httpd.bundle

With all options stored in the F file (one option per line, everything
after the option is an argument):

   use "Config_heavy.pl"
   use AnyEvent::Impl::Perl
   use AnyEvent::HTTPD
   use URI::http
   add eg/httpd httpd.pm

All options that specify modules or files to be added are processed in
the order given on the command line.

=head3 BUNDLE CREATION WORKFLOW / STATICPERL MKBUNDLE OPTIONS

F works by first assembling a list of candidate files and modules to
include, then filtering them by include/exclude patterns. The remaining
modules (together with their direct dependencies, such as link libraries
and L files) are then converted into bundle files suitable for
embedding. F can then optionally build a new perl interpreter or a
standalone application.

=over 4

=item Step 0: Generic argument processing.

The following options influence F itself.

=over 4

=item C<--verbose> | C<-v>

Increases the verbosity level by one (the default is C<1>).

=item C<--quiet> | C<-q>

Decreases the verbosity level by one.

=item any other argument

Any other argument is interpreted as a bundle specification file, which
supports all options (without extra quoting), one option per line, in
the format C, using its built-in ash shell.

Finally, you need F inside the chroot for many scripts to work - either
F or bind-mounting your F will provide this.

After you have compiled and set up your buildroot target, you can copy F
from the C distribution or from your perl F directory (if you installed
it) into the F filesystem, chroot inside and run it.

=head1 RECIPES / SPECIFIC MODULES

This section contains some common(?) recipes and information about
problems with some common modules or perl constructs that require extra
files to be included.

=head2 MODULES

=over 4

=item utf8

Some functionality in the utf8 module, such as swash handling (used for
unicode character ranges in regexes) is implemented in the
C<"utf8_heavy.pl"> library:

   -Mutf8_heavy.pl

Many Unicode properties in turn are defined in separate modules, such as
C<"unicore/Heavy.pl"> and more specific data tables such as
C<"unicore/To/Digit.pl"> or C<"unicore/lib/Perl/Word.pl">. These tables
are big (7MB uncompressed, although F contains special handling for
those files), so including only the ones your application actually
needs might pay off.

To simply include the whole unicode database, use:

   --incglob '/unicore/**.pl'

=item AnyEvent

AnyEvent needs a backend implementation that it will load in a delayed
fashion. The L backend is the default choice for AnyEvent if it can't
find anything else, and is usually a safe fallback. If you plan to use
e.g. L (L...), then you need to include the L (L...) backend as well.

If you want to handle IRIs or IDNs (L punycode and idn functions), you
also need to include C<"AnyEvent/Util/idna.pl"> and
C<"AnyEvent/Util/uts46data.pl">.

Or you can use C<--usepacklists> and specify C<-MAnyEvent> to include
everything.

=item Cairo

See Glib, same problem, same solution.

=item Carp

Carp had (in older versions of perl) a dependency on L. As of perl
5.12.2 (maybe earlier), this dependency no longer exists.

=item Config

The F switch (as well as many modules) needs L, which in turn might need
L<"Config_heavy.pl">. Including the latter gives you both.

=item Glib

Glib literally requires Glib to be installed already to build - it tries
to fake this by running Glib out of the build directory before being
built. F tries to work around this by forcing C and C to be empty via
the C environment variable.

=item Gtk2

See Pango, same problems, same solution.

=item Pango

In addition to the C problem in Glib, Pango also routes around L by
compiling its files on its own. F tries to patch L to route around
Pango.

=item Term::ReadLine::Perl

Also needs L, or C<--usepacklists>.

=item URI

URI implements schemes as separate modules - the generic URL scheme is
implemented in L, HTTP is implemented in L. If you need to use any of
these schemes, you should include these manually, or use
C<--usepacklists>.

=back

=head2 RECIPES

=over 4

=item Just link everything in

To link just about everything installed in the perl library into a new
perl, try this (the first time this runs it will take a long time, as a
lot of files need to be parsed):

   staticperl mkperl -v --strip ppi --incglob '*'

If you don't mind the extra megabytes, this can be a very effective way
of creating bundles without having to worry about forgetting any
modules.

You get even more useful variants of this method by first selecting
everything, and then excluding stuff you are reasonably sure not to
need - L uses this approach.

=item Getting rid of netdb functions

The perl core has lots of netdb functions (C, C and so on) that few
applications use.
You can avoid compiling them in by putting the following fragment into a
C hook:

   preconfigure() {
      for sym in \
         d_getgrnam_r d_endgrent d_endgrent_r d_endhent \
         d_endhostent_r d_endnent d_endnetent_r d_endpent \
         d_endprotoent_r d_endpwent d_endpwent_r d_endsent \
         d_endservent_r d_getgrent d_getgrent_r d_getgrgid_r \
         d_getgrnam_r d_gethbyaddr d_gethent d_getsbyport \
         d_gethostbyaddr_r d_gethostbyname_r d_gethostent_r \
         d_getlogin_r d_getnbyaddr d_getnbyname d_getnent \
         d_getnetbyaddr_r d_getnetbyname_r d_getnetent_r \
         d_getpent d_getpbyname d_getpbynumber d_getprotobyname_r \
         d_getprotobynumber_r d_getprotoent_r d_getpwent \
         d_getpwent_r d_getpwnam_r d_getpwuid_r d_getsent \
         d_getservbyname_r d_getservbyport_r d_getservent_r \
         d_getspnam_r d_getsbyname
         # d_gethbyname
      do
         PERL_CONFIGURE="$PERL_CONFIGURE -U$sym"
      done
   }

This mostly gains space when linking statically, as the functions will
likely not be linked in. The gain for dynamically-linked binaries is
smaller.

Also, this leaves C in - not only is it actually used often, the L
module also exposes it, so leaving it out usually gains little. Why
Socket exposes a C function that is in the core already is anybody's
guess.

=back

=head1 AUTHOR

 Marc Lehmann
 http://software.schmorp.de/pkg/staticperl.html