ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/Faster/Faster.pm
(Generate patch)

Comparing Faster/Faster.pm (file contents):
Revision 1.25 by root, Sat Mar 11 04:58:53 2006 UTC vs.
Revision 1.33 by root, Mon Mar 13 17:10:32 2006 UTC

8 8
9 perl -MFaster ... 9 perl -MFaster ...
10 10
11=head1 DESCRIPTION 11=head1 DESCRIPTION
12 12
13This module implements a very simple-minded JIT. It works by more or less 13This module implements a very simple-minded "JIT" (or actually AIT, ahead
14translating every function it sees into a C program, compiling it and then 14of time compiler). It works by more or less translating every function it
15replacing the function by the compiled code. 15sees into a C program, compiling it and then replacing the function by the
16compiled code.
16 17
17As a result, startup times are immense, as every function might lead to a 18As a result, startup times are immense, as every function might lead to a
18full-blown compilation. 19full-blown compilation.
19 20
20The speed improvements are also not great, you can expect 20% or so on 21The speed improvements are also not great, you can expect 20% or so on
21average, for code that runs very often. 22average, for code that runs very often. The reason for this is that data
23handling is mostly being done by the same old code, it just gets called
24a bit faster. Regexes and string operations won't get faster. Airhtmetic
25doresn't become any faster. Just the operands and other stuff is put on
26the stack faster, and the opcodes themselves have a bit less overhead.
22 27
23Faster is in the early stages of development. Due to its design its 28Faster is in the early stages of development. Due to its design its
24relatively safe to use (it will either work or simply slowdown the program 29relatively safe to use (it will either work or simply slowdown the program
25immensely, but rarely cause bugs). 30immensely, but rarely cause bugs).
26 31
32More intelligent algorithms (loop optimisation, type inference) could
33improve that easily, but requires a much more elaborate presentation and
34optimiser than what is in place. There are no plans to improve Faster in
35this way, yet, but it would provide a reasonably good place to start.
36
27Usage is very easy, just C<use Faster> and every function called from then 37Usage is very easy, just C<use Faster> and every function called from then
28on will be compiled. 38on will be compiled.
29 39
30Right now, Faster will leave lots of F<*.c>, F<*.o> and F<*.so> files in 40Right now, Faster can leave lots of F<*.c> and F<*.so> files in your
31F</tmp>, and it will even create those temporary files in an insecure 41F<$FASTER_CACHEDIR> (by default F<$HOME/.perl-faster-cache>), and it will
32manner, so watch out. 42even create those temporary files in an insecure manner, so watch out.
33 43
34=over 4 44=over 4
35 45
36=cut 46=cut
37 47
40no warnings; 50no warnings;
41 51
42use strict; 52use strict;
43use Config; 53use Config;
44use B (); 54use B ();
45#use Digest::MD5 ();
46use DynaLoader (); 55use DynaLoader ();
47use File::Temp (); 56use Digest::MD5 ();
57use Storable ();
58use Fcntl ();
48 59
49BEGIN { 60BEGIN {
50 our $VERSION = '0.01'; 61 our $VERSION = '0.01';
51 62
52 require XSLoader; 63 require XSLoader;
53 XSLoader::load __PACKAGE__, $VERSION; 64 XSLoader::load __PACKAGE__, $VERSION;
54} 65}
66
67my $CACHEDIR =
68 $ENV{FASTER_CACHE}
69 || (exists $ENV{HOME} && "$ENV{HOME}/.perl-faster-cache")
70 || do {
71 require File::Temp;
72 File::Temp::tempdir (CLEANUP => 1)
73 };
55 74
56my $COMPILE = "$Config{cc} -c -I$Config{archlibexp}/CORE $Config{optimize} $Config{ccflags} $Config{cccdlflags}"; 75my $COMPILE = "$Config{cc} -c -I$Config{archlibexp}/CORE $Config{optimize} $Config{ccflags} $Config{cccdlflags}";
57my $LINK = "$Config{ld} $Config{ldflags} $Config{lddlflags} $Config{ccdlflags}"; 76my $LINK = "$Config{ld} $Config{ldflags} $Config{lddlflags} $Config{ccdlflags}";
58my $LIBS = "$Config{libs}"; 77my $LIBS = "$Config{libs}";
59my $_o = $Config{_o}; 78my $_o = $Config{_o};
61 80
62# we don't need no steenking PIC on x86 81# we don't need no steenking PIC on x86
63$COMPILE =~ s/-f(?:PIC|pic)//g 82$COMPILE =~ s/-f(?:PIC|pic)//g
64 if $Config{archname} =~ /^(i[3456]86)-/; 83 if $Config{archname} =~ /^(i[3456]86)-/;
65 84
66my $opt_assert = $ENV{FASTER_DEBUG}; 85my $opt_assert = $ENV{FASTER_DEBUG} > 1;
67my $verbose = $ENV{FASTER_VERBOSE}+0; 86my $verbose = $ENV{FASTER_VERBOSE}+0;
87
88warn "Faster: CACHEDIR is $CACHEDIR\n" if $verbose > 2;
68 89
69our $source; 90our $source;
70 91
71our @ops; 92our @ops;
72our $insn; 93our $insn;
133# ops that do not need an ASYNC_CHECK 154# ops that do not need an ASYNC_CHECK
134my %f_noasync = map +($_ => undef), qw( 155my %f_noasync = map +($_ => undef), qw(
135 mapstart grepstart match entereval 156 mapstart grepstart match entereval
136 enteriter entersub leaveloop 157 enteriter entersub leaveloop
137 158
138 pushmark nextstate 159 pushmark nextstate caller
139 160
140 const stub unstack 161 const stub unstack
141 last next redo seq 162 last next redo goto seq
142 padsv padav padhv padany 163 padsv padav padhv padany
143 aassign sassign orassign 164 aassign sassign orassign
144 rv2av rv2cv rv2gv rv2hv refgen 165 rv2av rv2cv rv2gv rv2hv refgen
145 gv gvsv 166 gv gvsv
146 add subtract multiply divide 167 add subtract multiply divide
147 complement cond_expr and or not 168 complement cond_expr and or not
169 bit_and bit_or bit_xor
148 defined 170 defined
149 method method_named bless 171 method method_named bless
150 preinc postinc predec postdec 172 preinc postinc predec postdec
151 aelem aelemfast helem delete exists 173 aelem aelemfast helem delete exists
152 pushre subst list join split concat 174 pushre subst list lslice join split concat
153 length substr stringify ord 175 length substr stringify ord
154 push pop shift unshift 176 push pop shift unshift
155 eq ne gt lt ge le 177 eq ne gt lt ge le
156 regcomp regcreset regcmaybe 178 regcomp regcreset regcmaybe
157); 179);
158 180
159my %callop = ( 181my %callop = (
160 entersub => "(PL_ppaddr [OP_ENTERSUB]) (aTHX)", 182 entersub => "(PL_op->op_ppaddr) (aTHX)",
161 mapstart => "Perl_pp_grepstart (aTHX)", 183 mapstart => "Perl_pp_grepstart (aTHX)",
162); 184);
163 185
164sub callop { 186sub callop {
165 $callop{$op_name} || "Perl_pp_$op_name (aTHX)" 187 $callop{$op_name} || "Perl_pp_$op_name (aTHX)"
213 235
214 out_next; 236 out_next;
215} 237}
216 238
217sub op_pushmark { 239sub op_pushmark {
218 $source .= " PUSHMARK (PL_stack_sp);\n"; 240 $source .= " faster_PUSHMARK (PL_stack_sp);\n";
219 241
220 out_next; 242 out_next;
221} 243}
222 244
223if ($Config{useithreads} ne "define") { 245if ($Config{useithreads} ne "define") {
224 # disable optimisations on ithreads 246 # disable optimisations on ithreads
225 247
226 *op_const = sub { 248 *op_const = sub {
227 $source .= " { dSP; PUSHs ((SV *)${$op->sv}L); PUTBACK; }\n"; 249 $source .= " { dSP; PUSHs ((SV *)${$op->sv}L); PUTBACK; }\n";
250
251 $ops[0]{follows_const}++ if @ops;#d#
228 252
229 out_next; 253 out_next;
230 }; 254 };
231 255
232 *op_gv = \&op_const; 256 *op_gv = \&op_const;
333 357
334sub op_padsv { 358sub op_padsv {
335 my $flags = $op->flags; 359 my $flags = $op->flags;
336 my $padofs = "(PADOFFSET)" . $op->targ; 360 my $padofs = "(PADOFFSET)" . $op->targ;
337 361
338 #d#TODO: why does our version break
339 # breaks gce with can't coerce array....
340 if (($flags & B::OPf_MOD) && ($op->private & B::OPpDEREF)) {#d#
341 return out_linear;#d#
342 }#d#
343
344 $source .= <<EOF; 362 $source .= <<EOF;
345 { 363 {
346 dSP; 364 dSP;
347 SV *sv = PAD_SVl ($padofs); 365 SV *sv = PAD_SVl ($padofs);
348EOF 366EOF
349 367
350 if (($flags & B::OPf_MOD) && ($op->private & B::OPpLVAL_INTRO)) { 368 if (($flags & B::OPf_MOD) && ($op->private & B::OPpLVAL_INTRO)) {
351 $source .= " SAVECLEARSV (PAD_SVl ($padofs));\n"; 369 $source .= " SAVECLEARSV (PAD_SVl ($padofs));\n";
352 $ops[0]{pre_padsv_lval_intro}++ if @ops;#d# 370 $ops[0]{follows_padsv_lval_intro}++ if @ops;#d#
353 } 371 }
354 372
355 $source .= <<EOF; 373 $source .= <<EOF;
356 PUSHs (sv); 374 PUSHs (sv);
357 PUTBACK; 375 PUTBACK;
358EOF 376EOF
359 377
360 if (($flags & B::OPf_MOD) && ($op->private & B::OPpDEREF)) { 378 if (($flags & B::OPf_MOD) && ($op->private & B::OPpDEREF)) {
361 $source .= " vivify_ref (sv, $flags & OPpDEREF);\n"; 379 $source .= " if (!SvROK (sv)) vivify_ref (sv, " . $op->private . " & OPpDEREF);\n";
362 } 380 }
363 $source .= " }\n"; 381 $source .= " }\n";
364 382
365 out_next; 383 out_next;
366} 384}
372 dPOPTOPssrl; 390 dPOPTOPssrl;
373EOF 391EOF
374 $source .= " SV *temp = left; left = right; right = temp;\n" 392 $source .= " SV *temp = left; left = right; right = temp;\n"
375 if $op->private & B::OPpASSIGN_BACKWARDS; 393 if $op->private & B::OPpASSIGN_BACKWARDS;
376 394
377 if ($insn->{pre_padsv_lval_intro} && !($op->private & B::OPpASSIGN_BACKWARDS)) { 395 if ($insn->{follows_padsv_lval_intro} && !($op->private & B::OPpASSIGN_BACKWARDS)) {
378 # simple assignment - the target exists, but is basically undef 396 # simple assignment - the target exists, but is basically undef
379 $source .= " SvSetSV (right, left);\n"; 397 $source .= " SvSetSV (right, left);\n";
380 } else { 398 } else {
381 $source .= " SvSetMagicSV (right, left);\n"; 399 $source .= " SvSetMagicSV (right, left);\n";
382 } 400 }
389 407
390 out_next; 408 out_next;
391} 409}
392 410
393# pattern const+ (or general push1) 411# pattern const+ (or general push1)
394# pattern pushmark return(?)
395# pattern pushmark gv rv2av pushmark padsv+o.ä. aassign 412# pattern pushmark gv rv2av pushmark padsv+o.ä. aassign
396 413
397# pattern const method_named
398sub op_method_named { 414sub op_method_named {
415 if ($insn->{follows_const}) {
399 $source .= <<EOF; 416 $source .= <<EOF;
417 {
418 dSP;
419 static SV *last_cv;
420 static U32 last_sub_generation;
421
422 /* simple "polymorphic" inline cache */
423 if (PL_sub_generation == last_sub_generation)
424 {
425 PUSHs (last_cv);
426 PUTBACK;
427 }
428 else
429 {
430 PL_op = nextop; nextop = Perl_pp_method_named (aTHX);
431
432 SPAGAIN;
433 last_sub_generation = PL_sub_generation;
434 last_cv = TOPs;
435 }
436 }
437EOF
438 } else {
439 $source .= <<EOF;
400 { 440 {
401 static HV *last_stash; 441 static HV *last_stash;
402 static SV *last_cv; 442 static SV *last_cv;
403 static U32 last_sub_generation; 443 static U32 last_sub_generation;
404 444
431 /* error case usually */ 471 /* error case usually */
432 PL_op = nextop; nextop = Perl_pp_method_named (aTHX); 472 PL_op = nextop; nextop = Perl_pp_method_named (aTHX);
433 } 473 }
434 } 474 }
435EOF 475EOF
476 }
436 477
437 out_next; 478 out_next;
438} 479}
439 480
440sub op_grepstart { 481sub op_grepstart {
484 local %op_regcomp; 525 local %op_regcomp;
485 526
486 my %opsseen; 527 my %opsseen;
487 my @todo = $cv->START; 528 my @todo = $cv->START;
488 my %op_target; 529 my %op_target;
530 my $numpushmark;
489 531
490 while (my $op = shift @todo) { 532 while (my $op = shift @todo) {
491 for (; $$op; $op = $op->next) { 533 for (; $$op; $op = $op->next) {
492 last if $opsseen{$$op}++; 534 last if $opsseen{$$op}++;
493 535
527 569
528 push @op_loop, \@targ; 570 push @op_loop, \@targ;
529 push @todo, @targ; 571 push @todo, @targ;
530 572
531 $op_target{$$_}++ for @targ; 573 $op_target{$$_}++ for @targ;
574
532 } elsif ($class eq "COP") { 575 } elsif ($class eq "COP") {
533 $insn->{bblock}++ if defined $op->label; 576 $insn->{bblock}++ if defined $op->label;
577
578 } else {
579 if ($name eq "pushmark") {
580 $numpushmark++;
581 }
534 } 582 }
535 } 583 }
536 } 584 }
537 585
538 $_->{bblock}++ for grep $op_target{${$_->{op}}}, @ops; 586 $_->{bblock}++ for grep $op_target{${$_->{op}}}, @ops;
540 local $source = <<EOF; 588 local $source = <<EOF;
541OP *%%%FUNC%%% (pTHX) 589OP *%%%FUNC%%% (pTHX)
542{ 590{
543 register OP *nextop = (OP *)${$ops[0]->{op}}L; 591 register OP *nextop = (OP *)${$ops[0]->{op}}L;
544EOF 592EOF
593
594 $source .= " faster_PUSHMARK_PREALLOC ($numpushmark);\n"
595 if $numpushmark;
545 596
546 while (@ops) { 597 while (@ops) {
547 $insn = shift @ops; 598 $insn = shift @ops;
548 599
549 $op = $insn->{op}; 600 $op = $insn->{op};
612 663
613 $source 664 $source
614} 665}
615 666
616my $uid = "aaaaaaa0"; 667my $uid = "aaaaaaa0";
668my %so;
617 669
618sub source2ptr { 670sub func2ptr {
619 my (@source) = @_; 671 my (@func) = @_;
620 672
621 my $stem = "/tmp/Faster-$$-" . $uid++; 673 #LOCK
674 mkdir $CACHEDIR, 0777;
675 sysopen my $meta_fh, "$CACHEDIR/meta", &Fcntl::O_RDWR | &Fcntl::O_CREAT, 0666
676 or die "$$CACHEDIR/meta: $!";
677 binmode $meta_fh, ":raw:perlio";
678 fcntl_lock fileno $meta_fh
679 or die "$CACHEDIR/meta: $!";
622 680
681 my $meta = eval { Storable::fd_retrieve $meta_fh } || { version => 1 };
682
683 for my $f (@func) {
684 $f->{func} = "F" . Digest::MD5::md5_hex ($f->{source});
685 $f->{so} = $meta->{$f->{func}};
686 }
687
688 if (grep !$_->{so}, @func) {
689 my $stem;
690
691 do {
692 $stem = "$CACHEDIR/$$-" . $uid++;
693 } while -e "$stem$_so";
694
623 open FILE, ">:raw", "$stem.c"; 695 open my $fh, ">:raw", "$stem.c";
624 print FILE <<EOF; 696 print $fh <<EOF;
625#define PERL_NO_GET_CONTEXT 697#define PERL_NO_GET_CONTEXT
626#define PERL_CORE 698#define PERL_CORE
627 699
628#include <assert.h> 700#include <assert.h>
629 701
630#include "EXTERN.h" 702#include "EXTERN.h"
631#include "perl.h" 703#include "perl.h"
632#include "XSUB.h" 704#include "XSUB.h"
633 705
706#if 1
707# define faster_PUSHMARK_PREALLOC(count) while (PL_markstack_ptr + (count) >= PL_markstack_max) markstack_grow ()
708# define faster_PUSHMARK(p) *++PL_markstack_ptr = (p) - PL_stack_base
709#else
710# define faster_PUSHMARK_PREALLOC(count) 1
711# define faster_PUSHMARK(p) PUSHMARK(p)
712#endif
713
634#define RUNOPS_TILL(op) \\ 714#define RUNOPS_TILL(op) \\
635while (nextop != (op)) \\ 715 while (nextop != (op)) \\
636 { \\ 716 { \\
637 PERL_ASYNC_CHECK (); \\ 717 PERL_ASYNC_CHECK (); \\
638 PL_op = nextop; nextop = (PL_op->op_ppaddr)(aTHX); \\ 718 PL_op = nextop; nextop = (PL_op->op_ppaddr)(aTHX); \\
639 }
640
641EOF
642 for (@source) {
643 my $func = $uid++;
644 $_ =~ s/%%%FUNC%%%/$func/g;
645 print FILE $_;
646 $_ = $func;
647 } 719 }
648 720
649 close FILE; 721EOF
722 for my $f (grep !$_->{so}, @func) {
723 next if $f->{so} = $meta->{$f->{func}}; # some cv's alias others
724
725 warn "compiling $f->{name} to $stem$_so:$f->{func}\n" if $verbose > 1;
726 my $source = $f->{source};
727 $source =~ s/%%%FUNC%%%/$f->{func}/g;
728 print $fh $source;
729 $meta->{$f->{func}} = $f->{so} = $stem;
730 }
731
732 close $fh;
650 system "$COMPILE -o $stem$_o $stem.c"; 733 system "$COMPILE -o $stem$_o $stem.c";
651 #d#unlink "$stem.c"; 734 unlink "$stem.c" unless $ENV{FASTER_DEBUG} > 0;
652 system "$LINK -o $stem$_so $stem$_o $LIBS"; 735 system "$LINK -o $stem$_so $stem$_o $LIBS";
653 unlink "$stem$_o"; 736 unlink "$stem$_o";
737 }
654 738
739 for my $f (@func) {
740 my $stem = $f->{so};
741
655 my $so = DynaLoader::dl_load_file "$stem$_so" 742 my $so = ($so{$stem} ||= DynaLoader::dl_load_file "$stem$_so")
656 or die "$stem$_so: $!"; 743 or die "$stem$_so: $!";
657 744
658 #unlink "$stem$_so"; 745 #unlink "$stem$_so";
659 746
660 map +(DynaLoader::dl_find_symbol $so, $_), @source 747 $f->{ptr} = DynaLoader::dl_find_symbol $so, $f->{func}
748 or die "$f->{func} not found in $stem$_so: $!";
749 }
750
751 seek $meta_fh, 0, 0 or die "$CACHEDIR/meta: $!";
752 Storable::nstore_fd $meta, $meta_fh;
753 truncate $meta_fh, tell $meta_fh;
754
755 # UNLOCK (by closing $meta_fh)
661} 756}
662 757
663my %ignore; 758my %ignore;
664 759
665sub entersub { 760sub entersub {
667 762
668 my $pkg = $cv->STASH->NAME; 763 my $pkg = $cv->STASH->NAME;
669 764
670 return if $ignore{$pkg}; 765 return if $ignore{$pkg};
671 766
672 warn "compiling ", $cv->STASH->NAME, "\n" 767 warn "optimising ", $cv->STASH->NAME, "\n"
673 if $verbose; 768 if $verbose;
674 769
675 eval { 770 eval {
676 my @cv; 771 my @func;
677 my @cv_source; 772
773 push @func, {
774 cv => $cv,
775 name => "<>",
776 source => cv2c $cv,
777 };
678 778
679 # always compile the whole stash 779 # always compile the whole stash
680 my %stash = $cv->STASH->ARRAY; 780 my %stash = $cv->STASH->ARRAY;
681 while (my ($k, $v) = each %stash) { 781 while (my ($k, $v) = each %stash) {
682 $v->isa (B::GV::) 782 $v->isa (B::GV::)
685 my $cv = $v->CV; 785 my $cv = $v->CV;
686 786
687 if ($cv->isa (B::CV::) 787 if ($cv->isa (B::CV::)
688 && ${$cv->START} 788 && ${$cv->START}
689 && $cv->START->name ne "null") { 789 && $cv->START->name ne "null") {
790
690 push @cv, $cv; 791 push @func, {
792 cv => $cv,
793 name => $k,
691 push @cv_source, cv2c $cv; 794 source => cv2c $cv,
795 };
692 } 796 }
693 } 797 }
694 798
695 my @ptr = source2ptr @cv_source; 799 func2ptr @func;
696 800
697 for (0 .. $#cv) { 801 for my $f (@func) {
698 patch_cv $cv[$_], $ptr[$_]; 802 patch_cv $f->{cv}, $f->{ptr};
699 } 803 }
700 }; 804 };
701 805
702 if ($@) { 806 if ($@) {
703 $ignore{$pkg}++; 807 $ignore{$pkg}++;
718=over 4 822=over 4
719 823
720=item FASTER_VERBOSE 824=item FASTER_VERBOSE
721 825
722Faster will output more informational messages when set to values higher 826Faster will output more informational messages when set to values higher
723than C<0>. Currently, C<1> outputs which packages are being compiled. 827than C<0>. Currently, C<1> outputs which packages are being compiled, C<3>
828outputs the cache directory and C<10> outputs information on which perl
829function is compiled into which shared object.
724 830
725=item FASTER_DEBUG 831=item FASTER_DEBUG
726 832
727Add debugging code when set to values higher than C<0>. Currently, this 833Add debugging code when set to values higher than C<0>. Currently, this
728adds 1-3 C<assert>'s per perl op, to ensure that opcode order and C 834adds 1-3 C<assert>'s per perl op (FASTER_DEBUG > 1), to ensure that opcode
729execution order are compatible. 835order and C execution order are compatible.
730 836
731=item FASTER_CACHE 837=item FASTER_CACHE
732 838
733NOT YET IMPLEMENTED
734
735Set a persistent cache directory that caches compiled code 839Set a persistent cache directory that caches compiled code fragments. The
736fragments. Normally, code compiled by Faster will be deleted immediately, 840default is C<$HOME/.perl-faster-cache> if C<HOME> is set and a temporary
737and every restart will recompile everything. Setting this variable to a 841directory otherwise.
738directory makes Faster cache the generated files for re-use.
739 842
740This directory will always grow in contents, so you might need to erase it 843This directory will always grow in size, so you might need to erase it
741from time to time. 844from time to time.
742 845
743=back 846=back
744 847
745=head1 BUGS/LIMITATIONS 848=head1 BUGS/LIMITATIONS

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines