# Format:
# NAME \t TYPE, arg-description [num-args] [longjump-len] \t DESCRIPTION

# Empty rows and #-comment rows are ignored.

# Exit points
END	END, no	End of program.
SUCCEED	END, no	Return from a subroutine, basically.

# Anchors:
BOL	BOL, no	Match "" at beginning of line.
MBOL	BOL, no	Same, assuming multiline.
SBOL	BOL, no	Same, assuming singleline.
EOS	EOL, no	Match "" at end of string.
EOL	EOL, no	Match "" at end of line.
MEOL	EOL, no	Same, assuming multiline.
SEOL	EOL, no	Same, assuming singleline.
BOUND	BOUND, no	Match "" at any word boundary
BOUNDL	BOUND, no	Match "" at any word boundary in locale
NBOUND	NBOUND, no	Match "" at any word non-boundary
NBOUNDL	NBOUND, no	Match "" at any word non-boundary in locale
GPOS	GPOS, no	Matches where last m//g left off.

# [Special] alternatives
REG_ANY	REG_ANY, no	Match any one character (except newline).
SANY	REG_ANY, no	Match any one character.
CANY	REG_ANY, no	Match any one byte.
ANYOF	ANYOF, sv	Match character in (or not in) this class.
ALNUM	ALNUM, no	Match any alphanumeric character
ALNUML	ALNUM, no	Match any alphanumeric char in locale
NALNUM	NALNUM, no	Match any non-alphanumeric character
NALNUML	NALNUM, no	Match any non-alphanumeric char in locale
SPACE	SPACE, no	Match any whitespace character
SPACEL	SPACE, no	Match any whitespace char in locale
NSPACE	NSPACE, no	Match any non-whitespace character
NSPACEL	NSPACE, no	Match any non-whitespace char in locale
DIGIT	DIGIT, no	Match any numeric character
DIGITL	DIGIT, no	Match any numeric character in locale
NDIGIT	NDIGIT, no	Match any non-numeric character
NDIGITL	NDIGIT, no	Match any non-numeric character in locale
CLUMP	CLUMP, no	Match any combining character sequence

# BRANCH	The set of branches constituting a single choice are hooked
#		together with their "next" pointers, since precedence prevents
#		anything being concatenated to any individual branch.  The
#		"next" pointer of the last BRANCH in a choice points to the
#		thing following the whole choice.  This is also where the
#		final "next" pointer of each individual branch points; each
#		branch starts with the operand node of a BRANCH node.
#
BRANCH	BRANCH, node	Match this alternative, or the next...

# BACK	Normal "next" pointers all implicitly point forward; BACK
#		exists to make loop structures possible.
# not used
BACK	BACK, no	Match "", "next" ptr points backward.

# Literals
EXACT	EXACT, sv	Match this string (preceded by length).
EXACTF	EXACT, sv	Match this string, folded (prec. by length).
EXACTFL	EXACT, sv	Match this string, folded in locale (w/len).

# Do nothing
NOTHING	NOTHING,no	Match empty string.
# A variant of above which delimits a group, thus stops optimizations
TAIL	NOTHING,no	Match empty string. Can jump here from outside.

# STAR,PLUS	'?', and complex '*' and '+', are implemented as circular
#		BRANCH structures using BACK.  Simple cases (one character
#		per match) are implemented with STAR and PLUS for speed
#		and to minimize recursive plunges.
#
STAR	STAR, node	Match this (simple) thing 0 or more times.
PLUS	PLUS, node	Match this (simple) thing 1 or more times.

CURLY	CURLY, sv 2	Match this simple thing {n,m} times.
CURLYN	CURLY, no 2	Match next-after-this simple thing
#			{n,m} times, set parens.
CURLYM	CURLY, no 2	Match this medium-complex thing {n,m} times.
CURLYX	CURLY, sv 2	Match this complex thing {n,m} times.

# This terminator creates a loop structure for CURLYX
WHILEM	WHILEM, no	Do curly processing and see if rest matches.

# OPEN,CLOSE,GROUPP	...are numbered at compile time.
OPEN	OPEN, num 1	Mark this point in input as start of #n.
CLOSE	CLOSE, num 1	Analogous to OPEN.

REF	REF, num 1	Match some already matched string
REFF	REF, num 1	Match already matched string, folded
REFFL	REF, num 1	Match already matched string, folded in loc.

# grouping assertions
IFMATCH	BRANCHJ,off 1 2	Succeeds if the following matches.
UNLESSM	BRANCHJ,off 1 2	Fails if the following matches.
SUSPEND	BRANCHJ,off 1 1	"Independent" sub-RE.
IFTHEN	BRANCHJ,off 1 1	Switch, should be preceded by switcher.
GROUPP	GROUPP, num 1	Whether the group matched.

# Support for long RE
LONGJMP	LONGJMP,off 1 1	Jump far away.
BRANCHJ	BRANCHJ,off 1 1	BRANCH with long offset.

# The heavy worker
EVAL	EVAL, evl 1	Execute some Perl code.

# Modifiers
MINMOD	MINMOD, no	Next operator is not greedy.
LOGICAL	LOGICAL,no	Next opcode should set the flag only.

# This is not used yet
RENUM	BRANCHJ,off 1 1	Group with independently numbered parens.

# This is not really a node, but an optimized-away piece of a "long" node.
# To simplify debugging output, we mark it as if it were a node.
OPTIMIZED	NOTHING,off	Placeholder for dump.