/cvs/CBOR-XS/XS.pm
Revision: 1.19
Committed: Wed Nov 20 01:09:46 2013 UTC (10 years, 5 months ago) by root
Branch: MAIN
Changes since 1.18: +85 -0 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     CBOR::XS - Concise Binary Object Representation (CBOR, RFC7049)
4    
5     =encoding utf-8
6    
7     =head1 SYNOPSIS
8    
9     use CBOR::XS;
10    
11     $binary_cbor_data = encode_cbor $perl_value;
12     $perl_value = decode_cbor $binary_cbor_data;
13    
14     # OO-interface
15    
16     $coder = CBOR::XS->new;
17 root 1.6 $binary_cbor_data = $coder->encode ($perl_value);
18     $perl_value = $coder->decode ($binary_cbor_data);
19    
20     # prefix decoding
21    
22     my $many_cbor_strings = ...;
23     while (length $many_cbor_strings) {
24     my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
25     # data was decoded
26     substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
27     }
28 root 1.1
29     =head1 DESCRIPTION
30    
31 root 1.9 WARNING! This module is very new, and not very well tested (that's up to
32     you to do). Furthermore, details of the implementation might change freely
33     before version 1.0. And lastly, the object serialisation protocol depends
34     on a pending IANA assignment; until that assignment is official, this
35     implementation is not interoperable with other implementations (or even
36     with future versions of this module).
37    
38     You are still invited to try out CBOR, and this module.
39 root 1.5
40     This module converts Perl data structures to the Concise Binary Object
41     Representation (CBOR) and vice versa. CBOR is a fast binary serialisation
42     format that aims to use a superset of the JSON data model, i.e. when you
43     can represent something in JSON, you should be able to represent it in
44     CBOR.
45 root 1.1
46 root 1.9 In short, CBOR is a faster and very compact binary alternative to JSON,
47 root 1.10 with the added ability of supporting serialisation of Perl objects. (JSON
48     often compresses better than CBOR though, so if you plan to compress the
49     data later you might want to compare both formats first).
50 root 1.5
51 root 1.15 To give you a general idea about speed, with texts in the megabyte range,
52     C<CBOR::XS> usually encodes roughly twice as fast as L<Storable> or
53     L<JSON::XS> and decodes about 15%-30% faster than those. The shorter the
54     data, the worse L<Storable> performs in comparison.
55    
56     As for compactness, C<CBOR::XS> encoded data structures are usually about
57     20% smaller than the same data encoded as (compact) JSON or L<Storable>.
58 root 1.14
59 root 1.5 The primary goal of this module is to be I<correct> and the secondary goal
60     is to be I<fast>. To reach the latter goal it was written in C.
61 root 1.1
62     See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and
63     vice versa.
64    
65     =cut
66    
67     package CBOR::XS;
68    
69     use common::sense;
70    
71 root 1.17 our $VERSION = 0.08;
72 root 1.1 our @ISA = qw(Exporter);
73    
74     our @EXPORT = qw(encode_cbor decode_cbor);
75    
76     use Exporter;
77     use XSLoader;
78    
79 root 1.6 use Types::Serialiser;
80    
81 root 1.3 our $MAGIC = "\xd9\xd9\xf7";
82    
83 root 1.1 =head1 FUNCTIONAL INTERFACE
84    
85     The following convenience functions are provided by this module. They are
86     exported by default:
87    
88     =over 4
89    
90     =item $cbor_data = encode_cbor $perl_scalar
91    
92     Converts the given Perl data structure to CBOR representation. Croaks on
93     error.
94    
95     =item $perl_scalar = decode_cbor $cbor_data
96    
97     The opposite of C<encode_cbor>: expects a valid CBOR string to parse,
98     returning the resulting perl scalar. Croaks on error.
99    
100     =back
101    
102    
103     =head1 OBJECT-ORIENTED INTERFACE
104    
105     The object oriented interface lets you configure your own encoding or
106     decoding style, within the limits of supported formats.
107    
108     =over 4
109    
110     =item $cbor = new CBOR::XS
111    
112     Creates a new CBOR::XS object that can be used to de/encode CBOR
113     strings. All boolean flags described below are by default I<disabled>.
114    
115     The mutators for flags all return the CBOR object again and thus calls can
116     be chained:
117    
118     # chain a flag mutator, then encode
119     my $cbor_data = CBOR::XS->new->max_depth (64)->encode ({a => [1,2]});
120    
121     =item $cbor = $cbor->max_depth ([$maximum_nesting_depth])
122    
123     =item $max_depth = $cbor->get_max_depth
124    
125     Sets the maximum nesting level (default C<512>) accepted while encoding
126     or decoding. If a higher nesting level is detected in CBOR data or a Perl
127     data structure, then the encoder and decoder will stop and croak at that
128     point.
129    
130     Nesting level is defined by the number of hash- or arrayrefs that the encoder
131     needs to traverse to reach a given point, or by the number of CBOR arrays or
132     maps that the decoder has entered but not yet left when it reaches a given
133     point in the CBOR data.
134    
135     Setting the maximum depth to one disallows any nesting, which ensures
136     that the value is only a single map (hash) or array.
137    
138     If no argument is given, the highest possible setting will be used, which
139     is rarely useful.
140    
141     Note that nesting is implemented by recursion in C. The default value has
142     been chosen to be as large as typical operating systems allow without
143     crashing.
144    
145     See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
146    
147     =item $cbor = $cbor->max_size ([$maximum_string_size])
148    
149     =item $max_size = $cbor->get_max_size
150    
151     Set the maximum length a CBOR string may have (in bytes) where decoding
152     is being attempted. The default is C<0>, meaning no limit. When C<decode>
153     is called on a string that is longer than this many bytes, it will not
154     attempt to decode the string but throw an exception. This setting has no
155     effect on C<encode> (yet).
156    
157     If no argument is given, the limit check will be deactivated (same as when
158     C<0> is specified).
159    
160     See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
161    
162 root 1.19 =item $cbor = $cbor->allow_unknown ([$enable])
163    
164     =item $enabled = $cbor->get_allow_unknown
165    
166     If C<$enable> is true (or missing), then C<encode> will I<not> throw an
167     exception when it encounters values it cannot represent in CBOR (for
168     example, filehandles) but instead will encode a CBOR C<error> value.
169    
170     If C<$enable> is false (the default), then C<encode> will throw an
171     exception when it encounters anything it cannot encode as CBOR.
172    
173     This option does not affect C<decode> in any way, and it is recommended to
174     leave it off unless you know your communications partner.
175    
176     =item $cbor = $cbor->allow_sharable ([$enable])
177    
178     =item $enabled = $cbor->get_allow_sharable
179    
180     If C<$enable> is true (or missing), then C<encode> will not double-encode
181     values that have been seen before (e.g. when the same object, such as an
182     array, is referenced multiple times), but instead will emit a reference to
183     the earlier value.
184    
185     This means that such values will only be encoded once, and will not result
186     in a deep cloning of the value on decode, in decoders supporting the value
187     sharing extension.
188    
189     Detecting shared values incurs a runtime overhead when values are encoded
190     that have a reference counter larger than one, and might unnecessarily
191     increase the encoded size, as potentially shared values are encoded as
192     sharable whether or not they are actually shared.
193    
194     At the moment, all shared values will be detected, even weird and unusual
195     cases, such as an array with multiple "copies" of the I<same> scalar,
196     which are hard but not impossible to create in Perl (L<Storable> for
197     example doesn't handle these cases). If this turns out to be a performance
198     issue then future versions might limit the shared value detection to
199     references only.
200    
201     If C<$enable> is false (the default), then C<encode> will encode shared
202     values repeatedly, once for every place they are referenced.
203    
204     This option does not affect C<decode> in any way - shared values and
205     references will always be decoded properly if present. It is recommended
206     to leave it off unless you know your communications partner supports the
207     value sharing extensions to CBOR (http://cbor.schmorp.de/value-sharing).
208    
209 root 1.1 =item $cbor_data = $cbor->encode ($perl_scalar)
210    
211     Converts the given Perl data structure (a scalar value) to its CBOR
212     representation.
213    
214     =item $perl_scalar = $cbor->decode ($cbor_data)
215    
216     The opposite of C<encode>: expects CBOR data and tries to parse it,
217     returning the resulting simple scalar or reference. Croaks on error.
218    
219     =item ($perl_scalar, $octets) = $cbor->decode_prefix ($cbor_data)
220    
221     This works like the C<decode> method, but instead of raising an exception
222     when there is trailing garbage after the CBOR string, it will silently
223     stop parsing there and return the number of characters consumed so far.
224    
225     This is useful if your CBOR texts are not delimited by an outer protocol
226     and you need to know where the first CBOR string ends and the next one
227     starts.
228    
229     CBOR::XS->new->decode_prefix ("......")
230     => ("...", 3)
231    
232     =back
233    
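As a rough illustration of the nesting level that C<max_depth> limits on the
encoding side, the following pure-Perl sketch counts the hash and array
references that have to be traversed. The helper name C<nesting_depth> is
hypothetical and not part of CBOR::XS (the real check happens inside the C
encoder), and the sketch assumes the data structure is not cyclic.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CBOR::XS): returns how many hash or
# array references must be traversed to reach the most deeply nested value.
sub nesting_depth {
    my ($value) = @_;
    my $type = ref $value;

    my @children;
    if    ($type eq 'ARRAY') { @children = @$value }
    elsif ($type eq 'HASH')  { @children = values %$value }
    else                     { return 0 }   # not an unblessed container

    my $deepest = 0;
    for my $child (@children) {
        my $depth = nesting_depth ($child);
        $deepest = $depth if $depth > $deepest;
    }

    return 1 + $deepest;
}
```

With this definition a flat array has depth one, which matches the statement
that a maximum depth of one disallows any nesting, while C<< {a => [1,2]} >>
has depth two.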
234    
235     =head1 MAPPING
236    
237     This section describes how CBOR::XS maps Perl values to CBOR values and
238     vice versa. These mappings are designed to "do the right thing" in most
239     circumstances automatically, preserving round-tripping characteristics
240     (what you put in comes out as something equivalent).
241    
242     For the more enlightened: note that in the following descriptions,
243     lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
244     refers to the abstract Perl language itself.
245    
246    
247     =head2 CBOR -> PERL
248    
249     =over 4
250    
251 root 1.4 =item integers
252    
253     CBOR integers become (numeric) perl scalars. On perls without 64 bit
254     support, 64 bit integers will be truncated or otherwise corrupted.
255    
256     =item byte strings
257    
258     Byte strings will become octet strings in Perl (the byte values 0..255
259     will simply become characters of the same value in Perl).
260    
261     =item UTF-8 strings
262    
263     UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
264     decoded into proper Unicode code points. At the moment, the UTF-8 octets
265     are not checked for validity - corrupt input will result in
266     corrupted Perl strings.
267    
268     =item arrays, maps
269    
270     CBOR arrays and CBOR maps will be converted into references to a Perl
271     array or hash, respectively. The keys of the map will be stringified
272     during this process.
273    
274 root 1.6 =item null
275    
276     CBOR null becomes C<undef> in Perl.
277    
278     =item true, false, undefined
279 root 1.1
280 root 1.6 These CBOR values become C<Types::Serialiser::true>,
281     C<Types::Serialiser::false> and C<Types::Serialiser::error>,
282 root 1.1 respectively. They are overloaded to act almost exactly like the numbers
283 root 1.6 C<1> and C<0> (for true and false) or to throw an exception on access (for
284     error). See the L<Types::Serialiser> manpage for details.
285    
286     =item CBOR tag 256 (perl object)
287    
288 root 1.7 The tag value C<256> (TODO: pending iana registration) will be used
289 root 1.11 to deserialise a Perl object serialised with C<FREEZE>. See L<OBJECT
290     SERIALISATION>, below, for details.
291 root 1.1
292 root 1.6 =item CBOR tag 55799 (magic header)
293 root 1.4
294 root 1.6 The tag 55799 is ignored (this tag implements the magic header).
295 root 1.1
296 root 1.6 =item other CBOR tags
297 root 1.4
298 root 1.6 Tagged items consist of a numeric tag and another CBOR value. Tags not
299     handled internally are currently converted into a L<CBOR::XS::Tagged>
300     object, which is simply a blessed array reference consisting of the
301     numeric tag value followed by the (decoded) CBOR value.
302 root 1.4
303 root 1.6 In the future, support for user-supplied conversions might get added.
304 root 1.4
305     =item anything else
306    
307     Anything else (e.g. unsupported simple values) will raise a decoding
308     error.
309 root 1.1
310     =back
311    
312    
313     =head2 PERL -> CBOR
314    
315     The mapping from Perl to CBOR is slightly more difficult, as Perl is a
316     truly typeless language, so we can only guess which CBOR type is meant by
317     a Perl value.
318    
319     =over 4
320    
321     =item hash references
322    
323 root 1.4 Perl hash references become CBOR maps. As there is no inherent ordering in
324     hash keys (or CBOR maps), they will usually be encoded in a pseudo-random
325     order.
326    
327     Currently, tied hashes will use the indefinite-length format, while normal
328     hashes will use the fixed-length format.
329 root 1.1
330     =item array references
331    
332 root 1.4 Perl array references become fixed-length CBOR arrays.
333 root 1.1
334     =item other references
335    
336     Other unblessed references are generally not allowed and will cause an
337     exception to be thrown, except for references to the integers C<0> and
338 root 1.4 C<1>, which get turned into false and true in CBOR.
339    
340     =item CBOR::XS::Tagged objects
341    
342     Objects of this type must be arrays consisting of a single C<[tag, value]>
343 root 1.13 pair. The (numerical) tag will be encoded as a CBOR tag, the value will
344     be encoded as appropriate for the value. You can use C<CBOR::XS::tag> to
345     create such objects.
346 root 1.1
347 root 1.6 =item Types::Serialiser::true, Types::Serialiser::false, Types::Serialiser::error
348 root 1.1
349 root 1.6 These special values become CBOR true, CBOR false and CBOR undefined
350     values, respectively. You can also use C<\1>, C<\0> and C<\undef> directly
351     if you want.
352 root 1.1
353 root 1.7 =item other blessed objects
354 root 1.1
355 root 1.7 Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See
356 root 1.11 L<OBJECT SERIALISATION>, below, for details.
357 root 1.1
358     =item simple scalars
359    
360     TODO
361     Simple Perl scalars (any scalar that is not a reference) are the most
362     difficult objects to encode: CBOR::XS will encode undefined scalars as
363 root 1.4 CBOR null values, scalars that have last been used in a string context
364 root 1.1 before encoding as CBOR strings, and anything else as number value:
365    
366     # dump as number
367     encode_cbor [2] # yields [2]
368     encode_cbor [-3.0e17] # yields [-3e+17]
369     my $value = 5; encode_cbor [$value] # yields [5]
370    
371     # used as string, so dump as string
372     print $value;
373     encode_cbor [$value] # yields ["5"]
374    
375     # undef becomes null
376     encode_cbor [undef] # yields [null]
377    
378     You can force the type to be a CBOR string by stringifying it:
379    
380     my $x = 3.1; # some variable containing a number
381     "$x"; # stringified
382     $x .= ""; # another, more awkward way to stringify
383     print $x; # perl does it for you, too, quite often
384    
385     You can force the type to be a CBOR number by numifying it:
386    
387     my $x = "3"; # some variable containing a string
388     $x += 0; # numify it, ensuring it will be dumped as a number
389     $x *= 1; # same thing, the choice is yours.
390    
391     You can not currently force the type in other, less obscure, ways. Tell me
392     if you need this capability (but don't forget to explain why it's needed
393     :).
394    
395 root 1.4 Perl values that seem to be integers generally use the shortest possible
396     representation. Floating-point values will use the IEEE single
397     format if that is possible without loss of precision, otherwise the IEEE double
398     format will be used. Perls that use formats other than IEEE double to
399     represent numerical values are supported, but might suffer loss of
400     precision.
401 root 1.1
402     =back
403    
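To make "shortest possible representation" concrete, here is a pure-Perl
sketch of how RFC 7049 lays out non-negative integers (major type 0). The
helper C<encode_uint> is hypothetical and is not how CBOR::XS is implemented
(that is done in C); the sketch also stops at 32 bit values to stay portable
to perls without 64 bit support.

```perl
use strict;
use warnings;

# Hypothetical helper, not CBOR::XS code: encodes a non-negative integer
# below 2**32 as a CBOR unsigned integer (major type 0, RFC 7049).
sub encode_uint {
    my ($n) = @_;

    die "only integers 0 .. 2**32-1 are handled in this sketch\n"
        if $n < 0 || $n > 0xffff_ffff || $n != int ($n);

    return pack ("C",  $n)       if $n < 24;        # value fits into the initial byte
    return pack ("CC", 0x18, $n) if $n <= 0xff;     # one additional byte
    return pack ("Cn", 0x19, $n) if $n <= 0xffff;   # two additional bytes, big-endian
    return pack ("CN", 0x1a, $n);                   # four additional bytes, big-endian
}
```

For example, C<encode_uint (10)> is the single octet C<0a>, while
C<encode_uint (500)> needs the three octets C<19 01 f4>.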
404 root 1.7 =head2 OBJECT SERIALISATION
405    
406     This module knows two ways to serialise a Perl object: The CBOR-specific
407     way, and the generic way.
408    
409     Whenever the encoder encounters a Perl object that it cannot serialise
410     directly (most of them), it will first look up the C<TO_CBOR> method on
411     it.
412    
413     If it has a C<TO_CBOR> method, it will call it with the object as the only
414     argument, and expects exactly one return value, which it will then
415     encode in place of the object.
416    
417     Otherwise, it will look up the C<FREEZE> method. If it exists, it will
418     call it with the object as first argument, and the constant string C<CBOR>
419     as the second argument, to distinguish it from other serialisers.
420    
421     The C<FREEZE> method can return any number of values (i.e. zero or
422     more). These will be encoded as CBOR perl object, together with the
423     classname.
424    
425     If an object supports neither C<TO_CBOR> nor C<FREEZE>, encoding will fail
426     with an error.
427    
428     Objects encoded via C<TO_CBOR> cannot be automatically decoded, but
429     objects encoded via C<FREEZE> can be decoded using the following protocol:
430    
431     When an encoded CBOR perl object is encountered by the decoder, it will
432     look up the C<THAW> method, by using the stored classname, and will fail
433     if the method cannot be found.
434    
435     After the lookup it will call the C<THAW> method with the stored classname
436     as first argument, the constant string C<CBOR> as second argument, and all
437     values returned by C<FREEZE> as remaining arguments.
438    
439     =head4 EXAMPLES
440    
441     Here is an example C<TO_CBOR> method:
442    
443     sub My::Object::TO_CBOR {
444     my ($obj) = @_;
445    
446     ["this is a serialised My::Object object", $obj->{id}]
447     }
448    
449     When a C<My::Object> is encoded to CBOR, it will instead encode a simple
450     array with two members: a string, and the "object id". Decoding this CBOR
451     string will yield a normal perl array reference in place of the object.
452    
453     A more useful and practical example would be a serialisation method for
454     the URI module. CBOR has a custom tag value for URIs, namely 32:
455    
456     sub URI::TO_CBOR {
457     my ($self) = @_;
458     my $uri = "$self"; # stringify uri
459     utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string
460     CBOR::XS::tag 32, $uri
461     }
462    
463     This will encode URIs as a UTF-8 string with tag 32, which indicates a
464     URI.
465    
466     Decoding such a URI will not (currently) give you a URI object, but
467     instead a CBOR::XS::Tagged object with tag number 32 and the string -
468     exactly what was returned by C<TO_CBOR>.
469    
470     To serialise an object so it can automatically be deserialised, you need
471     to use C<FREEZE> and C<THAW>. To take the URI module as example, this
472     would be a possible implementation:
473    
474     sub URI::FREEZE {
475     my ($self, $serialiser) = @_;
476     "$self" # encode url string
477     }
478    
479     sub URI::THAW {
480     my ($class, $serialiser, $uri) = @_;
481    
482     $class->new ($uri)
483     }
484    
485     Unlike C<TO_CBOR>, multiple values can be returned by C<FREEZE>. For
486     example, a C<FREEZE> method that returns "type", "id" and "variant" values
487     would cause an invocation of C<THAW> with 5 arguments:
488    
489     sub My::Object::FREEZE {
490     my ($self, $serialiser) = @_;
491    
492     ($self->{type}, $self->{id}, $self->{variant})
493     }
494    
495     sub My::Object::THAW {
496     my ($class, $serialiser, $type, $id, $variant) = @_;
497    
498     $class->new (type => $type, id => $id, variant => $variant)
499     }
500    
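The calling convention described above can be tried out without any CBOR
involved. In the following sketch, C<freeze_object> and C<thaw_object> are
hypothetical stand-ins for what the encoder and decoder do, and C<My::Point>
is a made-up example class; no CBOR data is produced, only the method calls
are mimicked.

```perl
use strict;
use warnings;

package My::Point;   # hypothetical example class

sub new {
    my ($class, %args) = @_;
    return bless { x => $args{x}, y => $args{y} }, $class;
}

# called by the serialiser as $object->FREEZE ("CBOR")
sub FREEZE {
    my ($self, $serialiser) = @_;
    return ($self->{x}, $self->{y});
}

# called by the deserialiser as $class->THAW ("CBOR", @frozen_values)
sub THAW {
    my ($class, $serialiser, $x, $y) = @_;
    return $class->new (x => $x, y => $y);
}

package main;

# Stand-in for the encoder side: keep the class name together with the
# values returned by FREEZE.
sub freeze_object {
    my ($object) = @_;
    return [ref ($object), $object->FREEZE ("CBOR")];
}

# Stand-in for the decoder side: look up THAW via the stored class name
# and fail if the method cannot be found.
sub thaw_object {
    my ($frozen) = @_;
    my ($class, @values) = @$frozen;
    my $thaw = $class->can ("THAW")
        or die "class $class has no THAW method\n";
    return $class->$thaw ("CBOR", @values);
}

my $copy = thaw_object (freeze_object (My::Point->new (x => 3, y => 4)));
```

Here C<$copy> is a new C<My::Point> object built by C<THAW> from the two
values that C<FREEZE> returned.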
501 root 1.1
502 root 1.7 =head1 MAGIC HEADER
503 root 1.3
504     There is no way to distinguish CBOR from other formats
505     programmatically. To make it easier to distinguish CBOR from other
506     formats, the CBOR specification has a special "magic string" that can be
507 root 1.18 prepended to any CBOR string without changing its meaning.
508 root 1.3
509     This string is available as C<$CBOR::XS::MAGIC>. This module does not
510 root 1.18 prepend this string to the CBOR data it generates, but it will ignore it
511 root 1.3 if present, so users can prepend this string as a "file type" indicator as
512     required.
513    
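As a small sketch of using the magic string as a "file type" indicator, the
following pure-Perl helpers test for and prepend it. The names
C<has_cbor_magic> and C<add_cbor_magic> are hypothetical and not part of this
module; the three octets are the same ones stored in C<$CBOR::XS::MAGIC>.

```perl
use strict;
use warnings;

# Same octets as $CBOR::XS::MAGIC: the encoding of CBOR tag 55799
# (0xd9 introduces a tag with a two-byte number, 0xd9f7 is 55799).
my $MAGIC = "\xd9\xd9\xf7";

# Hypothetical helper: true if the data starts with the magic header.
sub has_cbor_magic {
    my ($data) = @_;
    return substr ($data, 0, length ($MAGIC)) eq $MAGIC;
}

# Hypothetical helper: prepend the header unless it is already there.
sub add_cbor_magic {
    my ($data) = @_;
    return has_cbor_magic ($data) ? $data : $MAGIC . $data;
}
```

A program that receives data of unknown type could call C<has_cbor_magic>
before attempting to decode it, keeping in mind that CBOR data without the
header is just as valid.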
514    
515 root 1.12 =head1 THE CBOR::XS::Tagged CLASS
516    
517     CBOR has the concept of tagged values - any CBOR value can be tagged with
518     a 64 bit number; these tag numbers are centrally administered.
519    
520     C<CBOR::XS> handles a few tags internally when en- or decoding. You can
521     also create tags yourself by encoding C<CBOR::XS::Tagged> objects, and the
522     decoder will create C<CBOR::XS::Tagged> objects itself when it hits an
523     unknown tag.
524    
525     These objects are simply blessed array references - the first member of
526     the array being the numerical tag, the second being the value.
527    
528     You can interact with C<CBOR::XS::Tagged> objects in the following ways:
529    
530     =over 4
531    
532     =item $tagged = CBOR::XS::tag $tag, $value
533    
534     This function(!) creates a new C<CBOR::XS::Tagged> object using the given
535     C<$tag> (0..2**64-1) to tag the given C<$value> (which can be any Perl
536     value that can be encoded in CBOR, including serialisable Perl objects and
537     C<CBOR::XS::Tagged> objects).
538    
539     =item $tagged->[0]
540    
541     =item $tagged->[0] = $new_tag
542    
543     =item $tag = $tagged->tag
544    
545     =item $new_tag = $tagged->tag ($new_tag)
546    
547     Access/mutate the tag.
548    
549     =item $tagged->[1]
550    
551     =item $tagged->[1] = $new_value
552    
553     =item $value = $tagged->value
554    
555     =item $new_value = $tagged->value ($new_value)
556    
557     Access/mutate the tagged value.
558    
559     =back
560    
561     =cut
562    
563     sub tag($$) {
564     bless [@_], CBOR::XS::Tagged::;
565     }
566    
567     sub CBOR::XS::Tagged::tag {
568     $_[0][0] = $_[1] if $#_;
569     $_[0][0]
570     }
571    
572     sub CBOR::XS::Tagged::value {
573     $_[0][1] = $_[1] if $#_;
574     $_[0][1]
575     }
576    
577 root 1.13 =head2 EXAMPLES
578    
579     Here are some examples of C<CBOR::XS::Tagged> uses to tag objects.
580    
581     You can look up CBOR tag values and meanings in the IANA registry at
582     L<http://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml>.
583    
584     Prepend a magic header (C<$CBOR::XS::MAGIC>):
585    
586     my $cbor = encode_cbor CBOR::XS::tag 55799, $value;
587     # same as:
588     my $cbor = $CBOR::XS::MAGIC . encode_cbor $value;
589    
590     Serialise some URIs and a regex in an array:
591    
592     my $cbor = encode_cbor [
593     (CBOR::XS::tag 32, "http://www.nethype.de/"),
594     (CBOR::XS::tag 32, "http://software.schmorp.de/"),
595     (CBOR::XS::tag 35, "^[Pp][Ee][Rr][lL]\$"),
596     ];
597    
598     Wrap CBOR data in CBOR:
599    
600     my $cbor_cbor = encode_cbor
601     CBOR::XS::tag 24,
602     encode_cbor [1, 2, 3];
603    
604 root 1.19 =head1 TAG HANDLING AND EXTENSIONS
605    
606     This section describes how this module handles specific tagged values and
607     extensions. If a tag is not mentioned here, then the default handling
608     applies (creating a CBOR::XS::Tagged object on decoding, and only encoding
609     the tag when explicitly requested).
610    
611     Future versions of this module reserve the right to special case
612     additional tags (such as bigfloat or base64url).
613    
614     =over 4
615    
616     =item <unassigned> (perl-object, L<http://cbor.schmorp.de/perl-object>)
617    
618     These tags are automatically created for serialisable objects using the
619     C<FREEZE/THAW> methods (the L<Types::Serialiser> object serialisation
620     protocol).
621    
622     =item <unassigned>, <unassigned> (sharable, sharedref, L<http://cbor.schmorp.de/value-sharing>)
623    
624     These tags are automatically decoded when encountered, resulting in
625     shared values in the decoded object. They are only encoded, however, when
626     C<allow_sharable> is enabled.
627    
628     =item 22098 (indirection, L<http://cbor.schmorp.de/indirection>)
629    
630     This tag is automatically generated when a reference is encountered (with
631     the exception of hash and array references). It is converted to a reference
632     when decoding.
633    
634     =item 55799 (self-describe CBOR, RFC 7049)
635    
636     This value is not generated on encoding (unless explicitly requested by
637     the user), and is simply ignored when decoding.
638    
639     =back
640    
641    
642 root 1.7 =head1 CBOR and JSON
643 root 1.1
644 root 1.4 CBOR is supposed to implement a superset of the JSON data model, and is,
645     with some coercion, able to represent all JSON texts (something that other
646     "binary JSON" formats such as BSON generally do not support).
647    
648     CBOR implements some extra hints and support for JSON interoperability,
649     and the spec offers further guidance for conversion between CBOR and
650     JSON. None of this is currently implemented in C<CBOR::XS>, and the guidelines
651     in the spec do not result in correct round-tripping of data. If JSON
652     interoperability is improved in the future, then the goal will be to
653     ensure that decoded JSON data will round-trip encoding and decoding to
654     CBOR intact.
655 root 1.1
656    
657     =head1 SECURITY CONSIDERATIONS
658    
659     When you are using CBOR in a protocol, talking to untrusted potentially
660     hostile creatures requires relatively few measures.
661    
662     First of all, your CBOR decoder should be secure, that is, should not have
663     any buffer overflows. Obviously, this module should ensure that and I am
664     trying hard on making that true, but you never know.
665    
666     Second, you need to avoid resource-starving attacks. That means you should
667     limit the size of CBOR data you accept, or make sure that when your
668     resources run out, that's just fine (e.g. by using a separate process that
669     can crash safely). The size of a CBOR string in octets is usually a good
670     indication of the size of the resources required to decode it into a Perl
671     structure. While CBOR::XS can check the size of the CBOR text, it might be
672     too late when you already have it in memory, so you might want to check
673     the size before you accept the string.
674    
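One way to check the size before the data is in memory is to stop reading
once a limit is exceeded. The following pure-Perl sketch does that for any
filehandle; C<read_limited> is a hypothetical helper that is not part of this
module, and the limit to use depends entirely on your application.

```perl
use strict;
use warnings;

# Hypothetical helper: reads at most $limit octets from $fh and dies if
# more data is available, so oversized input is never read in full.
sub read_limited {
    my ($fh, $limit) = @_;

    binmode $fh;
    my $data = "";

    # ask for one octet more than allowed, to detect oversized input
    while (length ($data) <= $limit) {
        my $got = read ($fh, $data, $limit + 1 - length ($data), length ($data));
        die "read error: $!\n" unless defined $got;
        last unless $got;   # end of file
    }

    die "input exceeds $limit octets, refusing to decode\n"
        if length ($data) > $limit;

    return $data;
}
```

The returned string can then be passed to C<decode_cbor> or C<decode>, with
C<max_size> remaining available as a second line of defence.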
675     Third, CBOR::XS recurses using the C stack when decoding objects and
676     arrays. The C stack is a limited resource: for instance, on my amd64
677     machine with 8MB of stack size I can decode around 180k nested arrays but
678     only 14k nested CBOR objects (due to perl itself recursing deeply on croak
679     to free the temporary). If that is exceeded, the program crashes. To be
680     conservative, the default nesting limit is set to 512. If your process
681     has a smaller stack, you should adjust this setting accordingly with the
682     C<max_depth> method.
683    
684     Something else could bomb you, too, that I forgot to think of. In that
685     case, you get to keep the pieces. I am always open for hints, though...
686    
687     Also keep in mind that CBOR::XS might leak contents of your Perl data
688     structures in its error messages, so when you serialise sensitive
689     information you might want to make sure that exceptions thrown by CBOR::XS
690     will not end up in front of untrusted eyes.
691    
692     =head1 CBOR IMPLEMENTATION NOTES
693    
694     This section contains some random implementation notes. They do not
695     describe guaranteed behaviour, but merely behaviour as implemented
696     right now.
697    
698     64 bit integers are only properly decoded when Perl was built with 64 bit
699     support.
700    
701     Strings and arrays are encoded with a definite length. Hashes as well,
702     unless they are tied (or otherwise magical).
703    
704     Only the double data type is supported for NV data types - when Perl uses
705     long double to represent floating point values, they might not be encoded
706     properly. Half precision types are accepted, but not encoded.
707    
708     Strict mode and canonical mode are not implemented.
709    
710    
711     =head1 THREADS
712    
713     This module is I<not> guaranteed to be thread safe and there are no
714     plans to change this until Perl gets thread support (as opposed to the
715     horribly slow so-called "threads" which are simply slow and bloated
716     process simulations - use fork, it's I<much> faster, cheaper, better).
717    
718     (It might actually work, but you have been warned).
719    
720    
721     =head1 BUGS
722    
723     While the goal of this module is to be correct, that unfortunately does
724     not mean it's bug-free, only that I think its design is bug-free. If you
725     keep reporting bugs they will be fixed swiftly, though.
726    
727     Please refrain from using rt.cpan.org or any other bug reporting
728     service. I put the contact address into my modules for a reason.
729    
730     =cut
731    
732     XSLoader::load "CBOR::XS", $VERSION;
733    
734     =head1 SEE ALSO
735    
736     The L<JSON> and L<JSON::XS> modules that do similar, but human-readable,
737     serialisation.
738    
739 root 1.6 The L<Types::Serialiser> module provides the data model for true, false
740     and error values.
741    
742 root 1.1 =head1 AUTHOR
743    
744     Marc Lehmann <schmorp@schmorp.de>
745     http://home.schmorp.de/
746    
747     =cut
748    
749 root 1.6 1
750