File-INC/INC.pm

=head1 NAME

File::INC - find files via @INC

=head1 SYNOPSIS

   use File::INC;

=head1 DESCRIPTION

Perl has two almost identical facilities to load "library"
files: C<require> and C<do> (C<use> is basically a compile-time
C<require>).

These search C<@INC> for the file and compile it.

Unfortunately, sometimes you want to locate data (or even source) files
using the same mechanism. The problem is that Perl I<laways> evaluates the
files as Perl code, which is not something you want for data files.

This module implements the same (hopefully :) C<@INC> walk code as perl,
and gives you access to the resulting files, without evaluating them.

=head2 ALTERNATIVES

There are alternatives to this module, both of which I regularly employ,
but they come with their own drawbacks:

=over 4

=item Walking C<@INC> manually and treating it as an array of directories.

Basically, this:

   sub find_rcfile($) {
      my $path;

      for (@RC_PATH, "") {
         $path = "$RC_BASE/$_/$_[0]";
         return $path if -e $path;
      }

      die "FATAL: can't find required file \"$_[0]\" in \"$RC_BASE\"\n";
   }

   my $path = find_rcfile "My/Module/datafile.dat";

This is simple, but fails for any entries in C<@INC> that are not
directories. Not handling these makes it fail in many environments, such
as in L<App::Staticperl> or L<PAR::Packer>.

This module basically implements a full-featured version of this that
hopefully gets everything right.

=item Embed datafiles using C<do>.

You can embed datafiles by, say, uuencoding them and putting them into
a F<.pm> file (I think L<PAR::Packer>) does this. This is extremely
wasteful, but at least works around the C<@INC> problematic. A slightly
better way is to use C<do>. Take this file, put as F<My/Module/data.pl>
into the perl library:

   <<EOF
   text data...
   EOF

Then you can load the "text data..." part using a simple C<do> call:

   my $data = do "My/Module/data.pl";

You can handle arbirary binary data either by finding a suitable C<EOF>
marker that isn't part of it, or you could use utf8 encoding to the
rescue, by using a >255 character code as text delimiter:

   use utf8;
   # use U+2026 as delimiter
   q…arbitary binary data....…

While this takes care of C<@INC>, perl still has to compile and execute
it every time (which might not be a significant problem), and it requires
the data to be formatted specifically for this occasion, so is hard for
accessing the sources of third-party modules.

=back

=head1 FUNCTIONS

This module only provides one function, not exported, that creates
C<File::INC> objects.

=over 4

=cut

package File::INC;

use common::sense;

use Carp ();

our $VERSION = 0.2;

=item $inc = File::INC::find $name

The name is a file name of the same syntax as expected by C<do EXPR> or
C<require EXPR>, e.g. F<File/INC.pm> or F<My/Module/data.bin>. Unlike
C<require> et al., absolute paths are not supported.

If the file is found, it returns a C<File::INC> object. Otherwise, it
returns C<undef>.

To work around potential bugs in C<@INC> hooks, it is recommended to only
keep one C<File::INC> around at any single time.

=cut

sub find($) {
   my $file = shift;

   for my $inc (@INC) {
      if (ref $inc) {
         my $func;

         if (CODE:: eq ref $inc) {
            $func = $inc;
         } elsif (ARRAY:: eq ref $inc) {
            $func = $inc->[0];
         } else {
            $func = $inc->can ("INC");
         }

         if ($func) {
            my @r = $func->($inc, $file);;
            my ($cache, $fh, $sub, $state);

            # undocumented and unsupported
            if (!eval { fileno $r[0] } and CODE:: ne ref $r[0]) {
               $cache = shift @r;
            }

            if (defined eval { fileno $r[0] }) {
               $fh = shift @r;
            }
            
            if (CODE:: eq ref $r[0]) {
               $sub = shift @r;
            }

            $state = shift @r;

            if (defined $fh or defined $cache or defined $sub) {
               defined $cache
                  and Carp::croak "File::INC: undocumented filter caches are not supported\n";

               return bless [undef, $fh, $sub, $state, $cache];
            }
         }

      } elsif (-e "$inc/$file") {
         # what a relief, the simple case
         return bless ["$inc/$file"];
      }
   }

   undef
}

=back

=head1 THE C<File::INC> CLASS

Library files are represented by C<File::INC> objects. You can query it to
find more information about the file (such as whether it even is a file
:).

=over 4

=item $path = $inc->path

If the file is actually a file on disk (the most common, but not
guaranteed, case), then this method returns the path to it. In other
cases, it returns C<undef>.

=cut

sub path {
   $_[0][0]
}

=item $path = $inc->force_path

Like C<$path>, but always returns a path. Or tries to, by creating a
temporary file and writing the data to it, if neccessary.

#todo

=cut

sub force_path {
   die;
}

=item $fh = $inc->fh

=cut

sub fh {
   if (defined $_[0][0]) { # path?
      open my $fh, "<", $_[0][0];
      return $fh;
   } elsif (defined $_[0][1] && !defined $_[0][2]) { # fh. and no filter?
      return $_[0][1];
   } else {
      return undef;
   }
}

=item $fh = $inc->force_fh

#todo

=cut

sub force_fh {
   die;
}

=item $data = $inc->data

=cut

sub data {
   if ($_[0][0]) { # path
      open my $fh, "<", $_[0][0]
        or die "$_[0][0]: $!";

     local $/;
     return scalar <$fh>;

   } elsif (defined $_[0][1]) { # fh
      if (defined $_[0][2]) { # filter sub
         my @data;
         my $status;

         local $_;
         while () {
            $_ = <$_[0][1]>;
            $status = $_[0][2]->($_[0][3]);
            push @data, $_;
            last if $status <= 0;
         }

         return join "", @data;
      }

      # plain file, slurp
      local $/;
      return scalar <$_[0][1]>;

   } elsif (defined $_[0][2]) {
      # "filter" sub

      my @data;
      my $status;

      local $_;
      while () {
         $_ = "";
         $status = $_[0][2]->($_[0][3]);
         push @data, $_;
         last if $status <= 0;
      }

      return join "", @data;

   } else {
      die "internal error";
   }
}

=back

=head1 LIMITATIONS

As with many of the newer additions to perl, C<@INC> hooks have been
poorly designed (the interface is pretty chaotic and does not follow Perl
language rules, and the subroutine magically has to know whether the line
passed to it is the last line of the file) and documented (for example,
the filter cache is completely undocumented), and as a result, it's not
(to the best of my knowledge) possible implement this module in pure
Perl. I hope this implementation is good enough for real world C<@INC>
hooks.

Known deficiencies are that file handles are detected differently,
only a copy of the state argument is passed to the subroutine, and the
undocumented filter cache return value is ignored.

=head1 AUTHOR AND CONTACT INFORMATION

 Marc Lehmann <schmorp@schmorp.de>
 http://software.schmorp.de/pkg/File-INC

=cut

1

Revision:	1.4
Committed:	Wed May 1 03:19:22 2013 UTC (11 years, 1 month ago) by root
Branch:	MAIN
CVS Tags:	HEAD
Changes since 1.3:	+3 -0 lines
Log Message:	* empty log message *
#	Content
1	=head1 NAME
2
3	File::INC - find files via @INC
4
5	=head1 SYNOPSIS
6
7	use File::INC;
8
9	=head1 DESCRIPTION
10
11	Perl has two almost identical facilities to load "library"
12	files: C<require> and C<do> (C<use> is basically a compile-time
13	C<require>).
14
15	These search C<@INC> for the file and compile it.
16
17	Unfortunately, sometimes you want to locate data (or even source) files
18	using the same mechanism. The problem is that Perl I<laways> evaluates the
19	files as Perl code, which is not something you want for data files.
20
21	This module implements the same (hopefully :) C<@INC> walk code as perl,
22	and gives you access to the resulting files, without evaluating them.
23
24	=head2 ALTERNATIVES
25
26	There are alternatives to this module, both of which I regularly employ,
27	but they come with their own drawbacks:
28
29	=over 4
30
31	=item Walking C<@INC> manually and treating it as an array of directories.
32
33	Basically, this:
34
35	sub find_rcfile($) {
36	my $path;
37
38	for (@RC_PATH, "") {
39	$path = "$RC_BASE/$_/$_[0]";
40	return $path if -e $path;
41	}
42
43	die "FATAL: can't find required file \"$_[0]\" in \"$RC_BASE\"\n";
44	}
45
46	my $path = find_rcfile "My/Module/datafile.dat";
47
48	This is simple, but fails for any entries in C<@INC> that are not
49	directories. Not handling these makes it fail in many environments, such
50	as in L<App::Staticperl> or L<PAR::Packer>.
51
52	This module basically implements a full-featured version of this that
53	hopefully gets everything right.
54
55	=item Embed datafiles using C<do>.
56
57	You can embed datafiles by, say, uuencoding them and putting them into
58	a F<.pm> file (I think L<PAR::Packer>) does this. This is extremely
59	wasteful, but at least works around the C<@INC> problematic. A slightly
60	better way is to use C<do>. Take this file, put as F<My/Module/data.pl>
61	into the perl library:
62
63	<<EOF
64	text data...
65	EOF
66
67	Then you can load the "text data..." part using a simple C<do> call:
68
69	my $data = do "My/Module/data.pl";
70
71	You can handle arbirary binary data either by finding a suitable C<EOF>
72	marker that isn't part of it, or you could use utf8 encoding to the
73	rescue, by using a >255 character code as text delimiter:
74
75	use utf8;
76	# use U+2026 as delimiter
77	q…arbitary binary data....…
78
79	While this takes care of C<@INC>, perl still has to compile and execute
80	it every time (which might not be a significant problem), and it requires
81	the data to be formatted specifically for this occasion, so is hard for
82	accessing the sources of third-party modules.
83
84	=back
85
86	=head1 FUNCTIONS
87
88	This module only provides one function, not exported, that creates
89	C<File::INC> objects.
90
91	=over 4
92
93	=cut
94
95	package File::INC;
96
97	use common::sense;
98
99	use Carp ();
100
101	our $VERSION = 0.2;
102
103	=item $inc = File::INC::find $name
104
105	The name is a file name of the same syntax as expected by C<do EXPR> or
106	C<require EXPR>, e.g. F<File/INC.pm> or F<My/Module/data.bin>. Unlike
107	C<require> et al., absolute paths are not supported.
108
109	If the file is found, it returns a C<File::INC> object. Otherwise, it
110	returns C<undef>.
111
112	To work around potential bugs in C<@INC> hooks, it is recommended to only
113	keep one C<File::INC> around at any single time.
114
115	=cut
116
117	sub find($) {
118	my $file = shift;
119
120	for my $inc (@INC) {
121	if (ref $inc) {
122	my $func;
123
124	if (CODE:: eq ref $inc) {
125	$func = $inc;
126	} elsif (ARRAY:: eq ref $inc) {
127	$func = $inc->[0];
128	} else {
129	$func = $inc->can ("INC");
130	}
131
132	if ($func) {
133	my @r = $func->($inc, $file);;
134	my ($cache, $fh, $sub, $state);
135
136	# undocumented and unsupported
137	if (!eval { fileno $r[0] } and CODE:: ne ref $r[0]) {
138	$cache = shift @r;
139	}
140
141	if (defined eval { fileno $r[0] }) {
142	$fh = shift @r;
143	}
144
145	if (CODE:: eq ref $r[0]) {
146	$sub = shift @r;
147	}
148
149	$state = shift @r;
150
151	if (defined $fh or defined $cache or defined $sub) {
152	defined $cache
153	and Carp::croak "File::INC: undocumented filter caches are not supported\n";
154
155	return bless [undef, $fh, $sub, $state, $cache];
156	}
157	}
158
159	} elsif (-e "$inc/$file") {
160	# what a relief, the simple case
161	return bless ["$inc/$file"];
162	}
163	}
164
165	undef
166	}
167
168	=back
169
170	=head1 THE C<File::INC> CLASS
171
172	Library files are represented by C<File::INC> objects. You can query it to
173	find more information about the file (such as whether it even is a file
174	:).
175
176	=over 4
177
178	=item $path = $inc->path
179
180	If the file is actually a file on disk (the most common, but not
181	guaranteed, case), then this method returns the path to it. In other
182	cases, it returns C<undef>.
183
184	=cut
185
186	sub path {
187	$_[0][0]
188	}
189
190	=item $path = $inc->force_path
191
192	Like C<$path>, but always returns a path. Or tries to, by creating a
193	temporary file and writing the data to it, if neccessary.
194
195	#todo
196
197	=cut
198
199	sub force_path {
200	die;
201	}
202
203	=item $fh = $inc->fh
204
205	=cut
206
207	sub fh {
208	if (defined $_[0][0]) { # path?
209	open my $fh, "<", $_[0][0];
210	return $fh;
211	} elsif (defined $_[0][1] && !defined $_[0][2]) { # fh. and no filter?
212	return $_[0][1];
213	} else {
214	return undef;
215	}
216	}
217
218	=item $fh = $inc->force_fh
219
220	#todo
221
222	=cut
223
224	sub force_fh {
225	die;
226	}
227
228	=item $data = $inc->data
229
230	=cut
231
232	sub data {
233	if ($_[0][0]) { # path
234	open my $fh, "<", $_[0][0]
235	or die "$_[0][0]: $!";
236
237	local $/;
238	return scalar <$fh>;
239
240	} elsif (defined $_[0][1]) { # fh
241	if (defined $_[0][2]) { # filter sub
242	my @data;
243	my $status;
244
245	local $_;
246	while () {
247	$_ = <$_[0][1]>;
248	$status = $_[0][2]->($_[0][3]);
249	push @data, $_;
250	last if $status <= 0;
251	}
252
253	return join "", @data;
254	}
255
256	# plain file, slurp
257	local $/;
258	return scalar <$_[0][1]>;
259
260	} elsif (defined $_[0][2]) {
261	# "filter" sub
262
263	my @data;
264	my $status;
265
266	local $_;
267	while () {
268	$_ = "";
269	$status = $_[0][2]->($_[0][3]);
270	push @data, $_;
271	last if $status <= 0;
272	}
273
274	return join "", @data;
275
276	} else {
277	die "internal error";
278	}
279	}
280
281	=back
282
283	=head1 LIMITATIONS
284
285	As with many of the newer additions to perl, C<@INC> hooks have been
286	poorly designed (the interface is pretty chaotic and does not follow Perl
287	language rules, and the subroutine magically has to know whether the line
288	passed to it is the last line of the file) and documented (for example,
289	the filter cache is completely undocumented), and as a result, it's not
290	(to the best of my knowledge) possible implement this module in pure
291	Perl. I hope this implementation is good enough for real world C<@INC>
292	hooks.
293
294	Known deficiencies are that file handles are detected differently,
295	only a copy of the state argument is passed to the subroutine, and the
296	undocumented filter cache return value is ignored.
297
298	=head1 AUTHOR AND CONTACT INFORMATION
299
300	Marc Lehmann <schmorp@schmorp.de>
301	http://software.schmorp.de/pkg/File-INC
302
303	=cut
304
305	1
306