/cvs/AnyEvent/lib/AnyEvent/IO.pm
Revision: 1.26
Committed: Wed Dec 19 18:00:27 2018 UTC (5 years, 11 months ago) by root
Branch: MAIN
CVS Tags: rel-7_16, rel-7_15, HEAD
Changes since 1.25: +1 -1 lines
Log Message:
*** empty log message ***

File Contents

# Content
1 =head1 NAME
2
3 AnyEvent::IO - the DBI of asynchronous I/O implementations
4
5 =head1 SYNOPSIS
6
7 use AnyEvent::IO;
8
9 # load /etc/passwd, call callback with the file data when done.
10 aio_load "/etc/passwd", sub {
11 my ($data) = @_
12 or return AE::log error => "/etc/passwd: $!";
13
14 warn "/etc/passwd contains ", ($data =~ y/://) , " colons.\n";
15 };
16
17 # the rest of the SYNOPSIS does the same, but with individual I/O calls
18
19 # also import O_XXX flags
20 use AnyEvent::IO qw(:DEFAULT :flags);
21
22 my $filedata = AE::cv;
23
24 # first open the file
25 aio_open "/etc/passwd", O_RDONLY, 0, sub {
26 my ($fh) = @_
27 or return AE::log error => "/etc/passwd: $!";
28
29 # now stat the file to get the size
30 aio_stat $fh, sub {
31 @_
32 or return AE::log error => "/etc/passwd: $!";
33
34 my $size = -s _;
35
36 # now read all the file data
37 aio_read $fh, $size, sub {
38 my ($data) = @_
39 or return AE::log error => "/etc/passwd: $!";
40
41 $size == length $data
42 or return AE::log error => "/etc/passwd: short read, file changed?";
43
44 # mostly the same as aio_load, above - $data contains
45 # the file contents now.
46 $filedata->($data);
47 };
48 };
49 };
50
51 my $passwd = $filedata->recv;
52 warn length $passwd, " octets.\n";
53
54 =head1 DESCRIPTION
55
56 This module provides functions that do I/O in an asynchronous fashion. It
57 is to I/O the same as L<AnyEvent> is to event libraries - it only
58 I<interfaces> to other implementations or to a portable pure-perl
59 implementation (which does not, however, do asynchronous I/O).
60
61 The only other implementation that is supported (or even known to the
62 author) is L<IO::AIO>, which is used automatically when it can be loaded
63 (via L<AnyEvent::AIO>, which also needs to be installed). If it is not
64 available, then L<AnyEvent::IO> falls back to its synchronous pure-perl
65 implementation.
66
67 Unlike L<AnyEvent>, which model to use is currently decided at module load
68 time, not at first use. Future releases might change this.
69
70 =head2 RATIONALE
71
72 While disk I/O often seems "instant" compared to, say, socket I/O, there
73 are many situations where your program can block for extended time periods
74 when doing disk I/O. For example, you access a disk on an NFS server that
75 has gone away - it can take ages to respond again, if ever. Or your system is
76 extremely busy because it creates or restores a backup - reading data from
77 disk can then take seconds. Or you use Linux, which for many years has had
78 a close-to-broken VM/IO subsystem that can often induce minutes or more of
79 delay for disk I/O, even under what I would consider light I/O loads.
80
81 Whatever the situation, some programs just can't afford to block for long
82 times (say, half a second or more), because they need to respond as fast
83 as possible.
84
85 For those cases, you need asynchronous I/O.
86
87 The problem is, AnyEvent itself sometimes reads disk files (for example,
88 when looking at F</etc/hosts>), and under the above situations, this can
89 bring your program to a complete halt even if your program otherwise
90 takes care to only use asynchronous I/O for everything (e.g. by using
91 L<IO::AIO>).
92
93 On the other hand, requiring L<IO::AIO> for AnyEvent is clearly
94 impossible, as AnyEvent promises to stay pure-perl, and the overhead of
95 IO::AIO for small programs would be immense, especially when asynchronous
96 I/O isn't even needed.
97
98 Clearly, this calls for an abstraction layer, and that is what you are
99 looking at right now :-)
100
101 =head2 ASYNCHRONOUS VS. NON-BLOCKING
102
103 Many people are continually confused about what the difference is between
104 asynchronous I/O and non-blocking I/O. In fact, those two terms are
105 not well defined, which often makes it hard to even talk about the
106 difference. Here is a short guideline that should leave you less
107 confused. It only talks about read operations, but the reasoning works
108 with other I/O operations as well.
109
110 Non-blocking I/O means that data is delivered by some external means,
111 automatically - that is, something I<pushes> data towards your file
112 handle, without you having to do anything. Non-blocking means that if
113 your operating system currently has no data (or EOF, or some error)
114 available for you, it will not wait ("block") as it would normally do,
115 but immediately return with an error (e.g. C<EWOULDBLOCK> - "I would have
116 blocked, but you forbid it").
117
118 Your program can then wait for data to arrive by other means, for example,
119 an I/O watcher which tells you when to re-attempt the read, after which it
120 can try to read again, and so on.
121
122 Often, you would expect this to work for disk files as well - if the data
123 isn't already in memory, one might want to wait for it and then re-attempt
124 the read for example. While this is sound reasoning, the POSIX API does
125 not support this, because disk drives and file systems do not send data
126 "on their own" and, moreover, the OS already knows that the data is there - it
127 doesn't need to "wait" until it arrives from some external entity, it only
128 needs to transfer the data from disk to your memory buffer.
129
130 So basically, while the concept is sound, the existing OS APIs do not
131 support this. Therefore, it makes no sense to switch a disk file handle
132 into non-blocking mode - it will behave exactly the same as in blocking
133 mode, namely it will block until the data has been read from the disk.
134
135 The alternative to non-blocking I/O that actually works with disk files
136 is usually called I<asynchronous I/O>. Asynchronous, because the actual
137 I/O is done while your program does something else: there is no need to
138 call the read function to see if data is there, you only order the read
139 once, and it will notify you when the read has finished and the data is
140 in your buffer - all the work is done in the background.
141
142 This works with disk files, and even with sockets and other sources. It
143 is, however, not very efficient when used with sources that could be
144 driven in a non-blocking way: it usually has higher overhead in the OS
145 than non-blocking I/O, it ties up memory buffers for a potentially
146 unlimited time, and often only a limited number of operations can be
147 done in parallel.
148
149 That's why asynchronous I/O makes most sense when confronted with disk
150 files, and non-blocking I/O only makes sense with sockets, pipes and
151 similar streaming sources.
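The distinction above can be sketched in code. This is only an illustration under some assumptions: the pipe stands in for any streaming source, F</etc/passwd> is simply a convenient file to read, and the variable names (C<$pipe_data>, C<$file_data>, C<$done>) are invented for the example.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Util ();
use AnyEvent::IO qw(:DEFAULT :flags);

my $done = AE::cv;
my ($pipe_data, $file_data);

# Non-blocking I/O: a pipe is a streaming source, so we ask an I/O
# watcher to tell us when a read is worth attempting.
pipe my $rd, my $wr or die "pipe: $!";
AnyEvent::Util::fh_nonblocking ($rd, 1);

$done->begin;
my $watcher; $watcher = AE::io $rd, 0, sub {
   my $len = sysread $rd, my $buf, 4096;

   # EAGAIN/EWOULDBLOCK: nothing there after all, wait for the next event
   return if !defined $len && ($!{EAGAIN} || $!{EWOULDBLOCK});

   $pipe_data = $buf if $len;
   undef $watcher;
   $done->end;
};
syswrite $wr, "hello\n";

# Asynchronous I/O: for a disk file we order the read once and are
# called back when the data is in our buffer.
$done->begin;
aio_open "/etc/passwd", O_RDONLY, 0, sub {
   my ($fh) = @_
      or return $done->end;

   aio_read $fh, 64, sub {
      ($file_data) = @_;
      $done->end;
   };
};

$done->recv;
```

Note that the pipe needs a watcher and a possible retry, while the file read is a single request with a completion callback.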
152
153 =head1 IMPORT TAGS
154
155 By default, this module exports all C<aio_>xxx functions. In addition,
156 the following import tags can be used:
157
158 :aio all aio_* functions, same as :DEFAULT
159 :flags the fcntl open flags (O_CREAT, O_RDONLY, ...)
160
161 =head1 API NOTES
162
163 The functions in this module are not meant to be the most versatile or
164 the highest-performers (they are not very slow either, of course). They
165 are primarily meant to give users of your code the option to do the I/O
166 asynchronously (by installing L<IO::AIO> and L<AnyEvent::AIO>),
167 without adding a dependency on those modules.
168
169 =head2 NAMING
170
171 All the functions in this module implement an I/O operation, usually with
172 the same or similar name as the Perl built-in that they mimic, but with
173 an C<aio_> prefix. If you like you can think of the C<aio_>xxx functions as
174 "AnyEvent I/O" or "Asynchronous I/O" variants of Perl built-ins.
175
176 =head2 CALLING CONVENTIONS AND ERROR REPORTING
177
178 Each function expects a callback as its last argument. The callback is
179 usually called with the result data or result code. An error is usually
180 signalled by passing no arguments to the callback, which is then free to
181 look at C<$!> for the error code.
182
183 This makes all of the following forms of error checking valid:
184
185 aio_open ...., sub {
186 my $fh = shift # scalar assignment - will assign undef on error
187 or return AE::log error => "...";
188
189 my ($fh) = @_ # list assignment - will be 0 elements on error
190 or return AE::log error => "...";
191
192 @_ # check the number of elements directly
193 or return AE::log error => "...";
194
195 =head2 CAVEAT: RELATIVE PATHS
196
197 When a path is specified, this path I<must be an absolute> path, unless
198 you make certain that nothing in your process calls C<chdir> or an
199 equivalent function while the request executes.
200
201 =head2 CAVEAT: OTHER SHARED STATE
202
203 Changing the C<umask> while any requests execute that create files (or
204 otherwise rely on the current umask) results in undefined behaviour -
205 likewise changing anything else that would change the outcome, such as
206 your effective user or group ID.
207
208 =head2 CALLBACKS MIGHT BE CALLED BEFORE FUNCTION RETURNS TO CALLER
209
210 Unlike other functions in the AnyEvent module family, these functions
211 I<may> call your callback instantly, before returning. This should not be
212 a real problem, as these functions never return anything useful.
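A small sketch of what this means in practice follows. The flag C<$returned> and the condvar C<$done> are names made up for the illustration; which branch is taken depends on the backend that happens to be loaded, so code should not rely on either ordering.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::IO;

my $returned = 0;
my $done     = AE::cv;

aio_stat "/etc", sub {
   # with a synchronous backend we can get here before aio_stat
   # has returned to its caller, so $returned may still be 0
   $done->send ($returned ? "after" : "before");
};
$returned = 1;

my $when = $done->recv;
print "callback ran $when aio_stat returned\n";
```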
213
214 =head2 BEHAVIOUR AT PROGRAM EXIT
215
216 Both L<AnyEvent::IO::Perl> and L<AnyEvent::IO::IOAIO> implementations
217 make sure that operations that have started will be finished on a clean
218 program exit. That makes programs work that start some I/O operations and
219 then exit. For example this complete program:
220
221 use AnyEvent::IO;
222
223 aio_stat "path1", sub {
224 aio_stat "path2", sub {
225 warn "both stats done\n";
226 };
227 };
228
229 Starts a C<stat> operation and then exits by "falling off the end" of
230 the program. Nevertheless, I<both> C<stat> operations will be executed,
231 as AnyEvent::IO waits for all outstanding requests to finish and you can
232 start new requests from request callbacks.
233
234 In fact, since L<AnyEvent::IO::Perl> is currently synchronous, the
235 program will do both stats before falling off the end, but with
236 L<AnyEvent::IO::IOAIO>, the program first falls off the end, then the stats
237 are executed.
238
239 While not guaranteed, this behaviour will be present in future versions,
240 if reasonably possible (which is extremely likely :).
241
242 =cut
243
244 package AnyEvent::IO;
245
246 use AnyEvent (); BEGIN { AnyEvent::common_sense }
247
248 use base "Exporter";
249
250 our @AIO_REQ = qw(
251 aio_load aio_open aio_close aio_seek aio_read aio_write aio_truncate
252 aio_utime aio_chown aio_chmod aio_stat aio_lstat
253 aio_link aio_symlink aio_readlink aio_rename aio_unlink
254 aio_mkdir aio_rmdir aio_readdir
255 );
256 *EXPORT = \@AIO_REQ;
257 our @FLAGS = qw(O_RDONLY O_WRONLY O_RDWR O_CREAT O_EXCL O_TRUNC O_APPEND);
258 *EXPORT_OK = \@FLAGS;
259 our %EXPORT_TAGS = (flags => \@FLAGS, aio => \@AIO_REQ);
260
261 our $MODEL;
262
263 if ($MODEL) {
264 AE::log 7 => "Found preloaded IO model '$MODEL', using it.";
265 } else {
266 if ($ENV{PERL_ANYEVENT_IO_MODEL} =~ /^([a-zA-Z0-9:]+)$/) {
267 if (eval { require "AnyEvent/IO/$ENV{PERL_ANYEVENT_IO_MODEL}.pm" }) {
268 AE::log 7 => "Loaded IO model '$MODEL' (forced by \$ENV{PERL_ANYEVENT_IO_MODEL}), using it.";
269 } else {
270 undef $MODEL;
271 AE::log 4 => "Unable to load IO model '$ENV{PERL_ANYEVENT_IO_MODEL}' (from \$ENV{PERL_ANYEVENT_IO_MODEL}):\n$@";
272 }
273 }
274
275 unless ($MODEL) {
276 if (eval { require IO::AIO; require AnyEvent::AIO; require AnyEvent::IO::IOAIO }) {
277 AE::log 7 => "Autoloaded IO model 'IOAIO', using it.";
278 } else {
279 require AnyEvent::IO::Perl;
280 AE::log 7 => "Autoloaded IO model 'Perl', using it.";
281 }
282 }
283 }
284
285 =head1 GLOBAL VARIABLES AND FUNCTIONS
286
287 =over 4
288
289 =item $AnyEvent::IO::MODEL
290
291 Contains the package name of the backend I/O model in use - at the moment,
292 this is usually C<AnyEvent::IO::Perl> or C<AnyEvent::IO::IOAIO>.
293
294 =item aio_load $path, $cb->($data)
295
296 Tries to open C<$path> and read its contents into memory (obviously,
297 should only be used on files that are "small enough"), then passes them to
298 the callback as a string.
299
300 Example: load F</etc/hosts>.
301
302 aio_load "/etc/hosts", sub {
303 my ($hosts) = @_
304 or return AE::log error => "/etc/hosts: $!";
305
306 AE::log info => "/etc/hosts contains %d lines.", ($hosts =~ y/\n//);
307 };
308
309 =item aio_open $path, $flags, $mode, $cb->($fh)
310
311 Tries to open the file specified by C<$path> with the O_XXX-flags
312 C<$flags> (from the Fcntl module, or see below) and the mode C<$mode> (a
313 good value is 0666 for C<O_CREAT>, and C<0> otherwise).
314
315 The (normal, standard, perl) file handle associated with the opened file
316 is then passed to the callback.
317
318 This works very much like Perl's C<sysopen> function.
319
320 Changing the C<umask> while this request executes results in undefined
321 behaviour - likewise changing anything else that would change the outcome,
322 such as your effective user or group ID.
323
324 To avoid having to load L<Fcntl>, this module provides constants
325 for C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR>, C<O_CREAT>, C<O_EXCL>,
326 C<O_TRUNC> and C<O_APPEND> - you can either access them directly
327 (C<AnyEvent::IO::O_RDONLY>) or import them by specifying the C<:flags>
328 import tag (see SYNOPSIS).
329
330 Example: securely open a file in F</var/tmp>, fail if it exists or is a symlink.
331
332 use AnyEvent::IO qw(:DEFAULT :flags);
333
334 aio_open "/var/tmp/mytmp$$", O_CREAT | O_EXCL | O_RDWR, 0600, sub {
335 my ($fh) = @_
336 or return AE::log error => "$! - denial of service attack?";
337
338 # now we have $fh
339 };
340
341 =item aio_close $fh, $cb->($success)
342
343 Closes the file handle (yes, close can block your process indefinitely)
344 and passes a true value to the callback on success.
345
346 Due to idiosyncrasies in perl, instead of calling C<close>, the file
347 handle might get closed by C<dup2>'ing another file descriptor over
348 it, that is, the C<$fh> might still be open, but can be closed safely
349 afterwards and must not be used for anything.
350
351 Example: close a file handle, and dirty as we are, do not even bother
352 to check for errors.
353
354 aio_close $fh, sub { };
355
356 =item aio_read $fh, $length, $cb->($data)
357
358 Tries to read C<$length> octets from the current position from C<$fh> and
359 passes these bytes to C<$cb>. Otherwise the semantics are very much like
360 those of Perl's C<sysread>.
361
362 If less than C<$length> octets have been read, C<$data> will contain
363 only those bytes actually read. At EOF, C<$data> will be a zero-length
364 string. If an error occurs, then nothing is passed to the callback.
365
366 Obviously, multiple C<aio_read>'s or C<aio_write>'s at the same time on file
367 handles sharing the underlying open file description results in undefined
368 behaviour, due to sharing of the current file offset (and less obviously
369 so, because OS X is not thread safe and corrupts data when you try).
370
371 Example: read 128 octets from a file.
372
373 aio_read $fh, 128, sub {
374 my ($data) = @_
375 or return AE::log error => "read from fh: $!";
376
377 if (length $data) {
378 print "read ", length $data, " octets.\n";
379 } else {
380 print "EOF\n";
381 }
382 };
383
384 =item aio_seek $fh, $offset, $whence, $callback->($offs)
385
386 Seeks the filehandle to the new C<$offset>, similarly to Perl's
387 C<sysseek>. The C<$whence> are the traditional values (C<0> to count from
388 start, C<1> to count from the current position and C<2> to count from the
389 end).
390
391 The resulting absolute offset will be passed to the callback on success.
392
393 Example: measure the size of the file in the old-fashioned way using seek.
394
395 aio_seek $fh, 0, 2, sub {
396 my ($size) = @_
397 or return AE::log error => "seek to end failed: $!";
398
399 # maybe we need to seek to the beginning again?
400 aio_seek $fh, 0, 0, sub {
401 # now we are hopefully at the beginning
402 };
403 };
404
405 =item aio_write $fh, $data, $cb->($length)
406
407 Tries to write the octets in C<$data> to the current position of C<$fh>
408 and passes the actual number of bytes written to the C<$cb>. Otherwise the
409 semantics are very much like those of Perl's C<syswrite>.
410
411 If less than C<length $data> octets have been written, C<$length> will
412 reflect that. If an error occurs, then nothing is passed to the callback.
413
414 Obviously, multiple C<aio_read>'s or C<aio_write>'s at the same time on file
415 handles sharing the underlying open file description results in undefined
416 behaviour, due to sharing of the current file offset (and less obviously
417 so, because OS X is not thread safe and corrupts data when you try).
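Example: a minimal sketch that appends one line to a file and checks for a short write. The path under F</var/tmp> and the names C<$path>, C<$line> and C<$done> are assumptions chosen for the illustration, not part of the module.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::IO qw(:DEFAULT :flags);

my $path = "/var/tmp/aio-write-demo.$$"; # hypothetical scratch file
my $line = "hello, world\n";
my $done = AE::cv;

aio_open $path, O_CREAT | O_WRONLY | O_APPEND, 0600, sub {
   my ($fh) = @_
      or return $done->croak ("$path: $!");

   aio_write $fh, $line, sub {
      my ($length) = @_
         or return $done->croak ("$path: $!");

      # fewer octets than requested is not an error, only a short write
      $length == length $line
         or AE::log warn => "$path: short write";

      aio_close $fh, sub { $done->send ($length) };
   };
};

# recv rethrows the error if one of the callbacks called croak
my $written = $done->recv;
```

As with C<syswrite>, a short write has to be handled by the caller, for example by writing the remaining octets in a further request.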
418
419 =item aio_truncate $fh_or_path, $new_length, $cb->($success)
420
421 Calls C<truncate> on the path or perl file handle and passes a true value
422 to the callback on success.
423
424 Example: truncate F</etc/passwd> to zero length - this only works on
425 systems that support C<truncate>, should not be tried out for obvious
426 reasons and debian will probably open yet another security bug about this
427 example.
428
429 aio_truncate "/etc/passwd", 0, sub {
430 @_
431 or return AE::log error => "/etc/passwd: $! - are you root enough?";
432 };
433
434 =item aio_utime $fh_or_path, $atime, $mtime, $cb->($success)
435
436 Calls C<utime> on the path or perl file handle and passes a true value to
437 the callback on success.
438
439 The special case of both C<$atime> and C<$mtime> being C<undef> sets the
440 times to the current time, on systems that support this.
441
442 Example: try to touch F<file>.
443
444 aio_utime "file", undef, undef, sub { };
445
446 =item aio_chown $fh_or_path, $uid, $gid, $cb->($success)
447
448 Calls C<chown> on the path or perl file handle and passes a true value to
449 the callback on success.
450
451 C<$uid> or C<$gid> can be specified as C<undef>, in which case the
452 uid or gid of the file is not changed. This differs from Perl's C<chown>
453 built-in, which wants C<-1> for this.
454
455 Example: update the group of F<file> to 0 (root), but leave the owner alone.
456
457 aio_chown "file", undef, 0, sub {
458 @_
459 or return AE::log error => "chown 'file': $!";
460 };
461
462 =item aio_chmod $fh_or_path, $perms, $cb->($success)
463
464 Calls C<chmod> on the path or perl file handle and passes a true value to
465 the callback on success.
466
467 Example: change F<file> to be user/group/world-readable, but leave the other flags
468 alone.
469
470 aio_stat "file", sub {
471 @_
472 or return AE::log error => "file: $!";
473
474 aio_chmod "file", (stat _)[2] & 07777 | 00444, sub { };
475 };
476
477 =item aio_stat $fh_or_path, $cb->($success)
478
479 =item aio_lstat $path, $cb->($success)
480
481 Calls C<stat> or C<lstat> on the path or perl file handle and passes a
482 true value to the callback on success.
483
484 The stat data will be available by C<stat>'ing the C<_> file handle
485 (e.g. C<-x _>, C<stat _> and so on).
486
487 Example: see if we can find the number of subdirectories of F</etc>.
488
489 aio_stat "/etc", sub {
490 @_
491 or return AE::log error => "/etc: $!";
492
493 (stat _)[3] >= 2
494 or return AE::log warn => "/etc has low link count - non-POSIX filesystem?";
495
496 print "/etc has ", (stat _)[3] - 2, " subdirectories.\n";
497 };
498
499 =item aio_link $oldpath, $newpath, $cb->($success)
500
501 Calls C<link> on the paths and passes a true value to the callback on
502 success.
503
504 Example: link F<file> to F<file.bak>, then rename F<file.new> over F<file>,
505 to atomically replace it.
506
507 aio_link "file", "file.bak", sub {
508 @_
509 or return AE::log error => "file: $!";
510
511 aio_rename "file.new", "file", sub {
512 @_
513 or return AE::log error => "file.new: $!";
514
515 print "file atomically replaced by file.new, backup file.bak\n";
516 };
517 };
518
519 =item aio_symlink $oldpath, $newpath, $cb->($success)
520
521 Calls C<symlink> on the paths and passes a true value to the callback on
522 success.
523
524 Example: create a symlink F<slink> containing "random data".
525
526 aio_symlink "random data", "slink", sub {
527 @_
528 or return AE::log error => "slink: $!";
529 };
530
531 =item aio_readlink $path, $cb->($target)
532
533 Calls C<readlink> on the path and passes the link target string to the
534 callback.
535
536 Example: read the symlink called F<slink> and verify that it contains "random data".
537
538 aio_readlink "slink", sub {
539 my ($target) = @_
540 or return AE::log error => "slink: $!";
541
542 $target eq "random data"
543 or AE::log critical => "omg, the world will end!";
544 };
545
546 =item aio_rename $oldpath, $newpath, $cb->($success)
547
548 Calls C<rename> on the paths and passes a true value to the callback on
549 success.
550
551 See C<aio_link> for an example.
552
553 =item aio_unlink $path, $cb->($success)
554
555 Tries to unlink the object at C<$path> and passes a true value to the
556 callback on success.
557
558 Example: try to delete the file F<tmpfile.dat~>.
559
560 aio_unlink "tmpfile.dat~", sub { };
561
562 =item aio_mkdir $path, $perms, $cb->($success)
563
564 Calls C<mkdir> on the path with the given permissions C<$perms> (when in
565 doubt, C<0777> is a good value) and passes a true value to the callback on
566 success.
567
568 Example: try to create the directory F<subdir> and leave it to whoever
569 comes after us to check whether it worked.
570
571 aio_mkdir "subdir", 0777, sub { };
572
573 =item aio_rmdir $path, $cb->($success)
574
575 Tries to remove the directory at C<$path> and passes a true value to the
576 callback on success.
577
578 Example: try to remove the directory F<subdir> and don't give a damn if
579 that fails.
580
581 aio_rmdir "subdir", sub { };
582
583 =item aio_readdir $path, $cb->(\@names)
584
585 Reads all filenames from the directory specified by C<$path> and passes
586 them to the callback, as an array reference with the names (without a path
587 prefix). The F<.> and F<..> names will be filtered out first.
588
589 The ordering of the file names is undefined - backends that are capable
590 of it (e.g. L<IO::AIO>) will return the ordering that most likely is
591 fastest to C<stat> through, and furthermore put entries that likely are
592 directories first in the array.
593
594 If you need best performance in recursive directory traversal or when
595 looking at really big directories, you are advised to use L<IO::AIO>
596 directly, specifically the C<aio_readdirx> and C<aio_scandir> functions,
597 which have more options to tune performance.
598
599 Example: recursively scan a directory hierarchy, silently skip directories
600 we couldn't read and print all others.
601
602 sub scan($); # visibility-in-next statement is not so useful these days
603 sub scan($) {
604 my ($path) = @_;
605
606 aio_readdir $path, sub {
607 my ($names) = @_
608 or return;
609
610 print "$path\n";
611
612 for my $name (@$names) {
613 aio_lstat "$path/$name", sub {
614 scan "$path/$name"
615 if -d _;
616 };
617 }
618 };
619 }
620
621 scan "/etc";
622
623 =back
624
625 =head1 ENVIRONMENT VARIABLES
626
627 See the description of C<PERL_ANYEVENT_IO_MODEL> in the L<AnyEvent>
628 manpage.
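Judging from the load-time code in this module, the value of the variable is used to build the backend module name (F<AnyEvent/IO/$value.pm>), so C<Perl> and C<IOAIO> are the values that correspond to the two backends mentioned above. A minimal sketch, assuming you want to force the pure-perl backend from within the program rather than from the shell:

```perl
use strict;
use warnings;

# must be set before AnyEvent::IO is loaded, as the backend is
# chosen at module load time - hence the BEGIN block
BEGIN { $ENV{PERL_ANYEVENT_IO_MODEL} = "Perl" }

use AnyEvent::IO;

print "I/O backend in use: $AnyEvent::IO::MODEL\n";
```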
629
630 =head1 AUTHOR
631
632 Marc Lehmann <schmorp@schmorp.de>
633 http://anyevent.schmorp.de
634
635 =cut
636
637 1
638