/cvs/CBOR-XS/README
Revision: 1.11
Committed: Sat Nov 30 18:42:27 2013 UTC (10 years, 5 months ago) by root
Branch: MAIN
CVS Tags: rel-1_1
Changes since 1.10: +57 -12 lines
Log Message:
1.1

File Contents

# User Rev Content
1 root 1.2 NAME
2     CBOR::XS - Concise Binary Object Representation (CBOR, RFC7049)
3    
4     SYNOPSIS
5     use CBOR::XS;
6    
7     $binary_cbor_data = encode_cbor $perl_value;
8     $perl_value = decode_cbor $binary_cbor_data;
9    
10     # OO-interface
11    
12     $coder = CBOR::XS->new;
13 root 1.5 $binary_cbor_data = $coder->encode ($perl_value);
14     $perl_value = $coder->decode ($binary_cbor_data);
15    
16     # prefix decoding
17    
18     my $many_cbor_strings = ...;
19     while (length $many_cbor_strings) {
20     my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
21     # data was decoded
22     substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
23     }
24 root 1.2
25     DESCRIPTION
26 root 1.4 This module converts Perl data structures to the Concise Binary Object
27     Representation (CBOR) and vice versa. CBOR is a fast binary
28 root 1.10 serialisation format that aims to use an (almost) superset of the JSON
29     data model, i.e. when you can represent something useful in JSON, you
30     should be able to represent it in CBOR.
31 root 1.4
32 root 1.10 In short, CBOR is a faster and quite compact binary alternative to JSON,
33 root 1.6 with the added ability of supporting serialisation of Perl objects.
34 root 1.7 (JSON often compresses better than CBOR though, so if you plan to
35 root 1.10 compress the data later and speed is less important you might want to
36     compare both formats first).
37 root 1.4
38 root 1.8 To give you a general idea about speed, with texts in the megabyte
39     range, "CBOR::XS" usually encodes roughly twice as fast as Storable or
40     JSON::XS and decodes about 15%-30% faster than those. The shorter the
41     data, the worse Storable performs in comparison.
42    
43 root 1.10 Regarding compactness, "CBOR::XS"-encoded data structures are usually
44     about 20% smaller than the same data encoded as (compact) JSON or
45     Storable.
46 root 1.8
47 root 1.9 In addition to the core CBOR data format, this module implements a
48 root 1.10 number of extensions, to support cyclic and shared data structures (see
49 root 1.11 "allow_sharing" and "allow_cycles"), string deduplication (see
50     "pack_strings") and scalar references (always enabled).
51 root 1.9
52 root 1.4 The primary goal of this module is to be *correct* and the secondary
53     goal is to be *fast*. To reach the latter goal it was written in C.
54 root 1.2
55     See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and
56     vice versa.
57    
58     FUNCTIONAL INTERFACE
59     The following convenience methods are provided by this module. They are
60     exported by default:
61    
62     $cbor_data = encode_cbor $perl_scalar
63     Converts the given Perl data structure to CBOR representation.
64     Croaks on error.
65    
66     $perl_scalar = decode_cbor $cbor_data
67     The opposite of "encode_cbor": expects a valid CBOR string to parse,
68     returning the resulting perl scalar. Croaks on error.
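
A minimal round-trip sketch using the two exported functions. The data
structure and variable names are illustrative only; the truncated input
is just one convenient way to provoke a decoding error.

```perl
use strict;
use warnings;
use CBOR::XS;    # exports encode_cbor and decode_cbor by default

# An arbitrary nested structure (illustrative values).
my $perl_value = { name => "example", list => [1, 2, 3], nested => { ok => 1 } };

my $binary_cbor_data = encode_cbor $perl_value;      # croaks on error
my $round_tripped    = decode_cbor $binary_cbor_data;

printf "encoded %d octets\n", length $binary_cbor_data;
printf "list has %d elements\n", scalar @{ $round_tripped->{list} };

# Both functions croak on error, so wrap untrusted input in eval.
# Chopping off the last octet leaves incomplete CBOR data.
my $truncated = substr $binary_cbor_data, 0, -1;
my $ok = eval { decode_cbor $truncated; 1 };
print "decode failed: $@" unless $ok;
```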
69    
70     OBJECT-ORIENTED INTERFACE
71     The object oriented interface lets you configure your own encoding or
72     decoding style, within the limits of supported formats.
73    
74     $cbor = new CBOR::XS
75     Creates a new CBOR::XS object that can be used to de/encode CBOR
76     strings. All boolean flags described below are by default
77     *disabled*.
78    
79     The mutators for flags all return the CBOR object again and thus
80     calls can be chained:
81    
82 root 1.9 my $cbor = CBOR::XS->new->encode ({a => [1,2]});
83 root 1.2
84     $cbor = $cbor->max_depth ([$maximum_nesting_depth])
85     $max_depth = $cbor->get_max_depth
86     Sets the maximum nesting level (default 512) accepted while encoding
87     or decoding. If a higher nesting level is detected in CBOR data or a
88     Perl data structure, then the encoder and decoder will stop and
89     croak at that point.
90    
91     Nesting level is defined by number of hash- or arrayrefs that the
92     encoder needs to traverse to reach a given point or the number of
93     "{" or "[" characters without their matching closing parenthesis
94     crossed to reach a given character in a string.
95    
96     Setting the maximum depth to one disallows any nesting, so that
97     ensures that the object is only a single hash/object or array.
98    
99     If no argument is given, the highest possible setting will be used,
100     which is rarely useful.
101    
102     Note that nesting is implemented by recursion in C. The default
103     value has been chosen to be as large as typical operating systems
104     allow without crashing.
105    
106     See SECURITY CONSIDERATIONS, below, for more info on why this is
107     useful.
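
The following sketch shows the effect of a low "max_depth" on the
encoder. The limit of 2 and the test structures are arbitrary choices
for illustration.

```perl
use strict;
use warnings;
use CBOR::XS;

my $shallow = [1, 2, 3];    # one level of nesting
my $deep    = [[[1]]];      # three levels of nesting

# A coder that accepts at most two levels of nesting.
my $coder = CBOR::XS->new->max_depth (2);
printf "limit is %d\n", $coder->get_max_depth;

# encode croaks when the limit is exceeded, so use eval to catch it.
my $shallow_ok = eval { $coder->encode ($shallow); 1 };
my $deep_ok    = eval { $coder->encode ($deep);    1 };

print "shallow structure encoded\n"  if $shallow_ok;
print "deep structure rejected: $@"  unless $deep_ok;
```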
108    
109     $cbor = $cbor->max_size ([$maximum_string_size])
110     $max_size = $cbor->get_max_size
111     Set the maximum length a CBOR string may have (in bytes) where
112     decoding is being attempted. The default is 0, meaning no limit.
113     When "decode" is called on a string that is longer than this many
114     bytes, it will not attempt to decode the string but throw an
115     exception. This setting has no effect on "encode" (yet).
116    
117     If no argument is given, the limit check will be deactivated (same
118     as when 0 is specified).
119    
120     See SECURITY CONSIDERATIONS, below, for more info on why this is
121     useful.
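
A short sketch of "max_size" in use. The limit of 16 octets is
unrealistically small and chosen only to make the effect visible.

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new->max_size (16);    # refuse inputs over 16 octets

my $small = encode_cbor [1, 2, 3];
my $large = encode_cbor [("x" x 10) x 10];   # ten 10-octet strings

printf "small: %d octets, large: %d octets\n", length $small, length $large;

# decode throws an exception without attempting to parse oversized input.
my $small_ok = eval { $coder->decode ($small); 1 };
my $large_ok = eval { $coder->decode ($large); 1 };

print "small input decoded\n"   if $small_ok;
print "large input refused: $@" unless $large_ok;
```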
122    
123 root 1.9 $cbor = $cbor->allow_unknown ([$enable])
124     $enabled = $cbor->get_allow_unknown
125     If $enable is true (or missing), then "encode" will *not* throw an
126     exception when it encounters values it cannot represent in CBOR (for
127     example, filehandles) but instead will encode a CBOR "error" value.
128    
129     If $enable is false (the default), then "encode" will throw an
130     exception when it encounters anything it cannot encode as CBOR.
131    
132     This option does not affect "decode" in any way, and it is
133     recommended to leave it off unless you know your communications
134     partner.
135    
136     $cbor = $cbor->allow_sharing ([$enable])
137     $enabled = $cbor->get_allow_sharing
138     If $enable is true (or missing), then "encode" will not
139     double-encode values that have been referenced before (e.g. when the
140     same object, such as an array, is referenced multiple times), but
141     instead will emit a reference to the earlier value.
142    
143     This means that such values will only be encoded once, and will not
144     result in a deep cloning of the value on decode, in decoders
145 root 1.10 supporting the value sharing extension. This also makes it possible
146 root 1.11 to encode cyclic data structures (which need "allow_cycles" to be
147     enabled to be decoded by this module).
148 root 1.9
149     It is recommended to leave it off unless you know your communication
150     partner supports the value sharing extensions to CBOR
151 root 1.10 (<http://cbor.schmorp.de/value-sharing>), as without decoder
152     support, the resulting data structure might be unusable.
153 root 1.9
154     Detecting shared values incurs a runtime overhead when values are
155     encoded that have a reference counter larger than one, and might
156     unnecessarily increase the encoded size, as potentially shared
157 root 1.11 values are encoded as shareable whether or not they are actually
158 root 1.9 shared.
159    
160     At the moment, only targets of references can be shared (e.g.
161     scalars, arrays or hashes pointed to by a reference). Weirder
162     constructs, such as an array with multiple "copies" of the *same*
163     string, which are hard but not impossible to create in Perl, are not
164 root 1.10 supported (this is the same as with Storable).
165 root 1.9
166 root 1.10 If $enable is false (the default), then "encode" will encode shared
167     data structures repeatedly, unsharing them in the process. Cyclic
168     data structures cannot be encoded in this mode.
169 root 1.9
170     This option does not affect "decode" in any way - shared values and
171     references will always be decoded properly if present.
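
To illustrate the difference, the sketch below encodes an array that
refers to the same inner array twice, once with and once without
"allow_sharing", and compares the decoded results. The variable names
are illustrative.

```perl
use strict;
use warnings;
use CBOR::XS;
use Scalar::Util qw(refaddr);

my $shared = [1, 2];
my $data   = [$shared, $shared];    # same array referenced twice

my $plain   = CBOR::XS->new->encode ($data);
my $sharing = CBOR::XS->new->allow_sharing->encode ($data);

# The decoder needs no special flag for (non-cyclic) shared values.
my $cloned = decode_cbor $plain;
my $kept   = decode_cbor $sharing;

my $cloned_same = refaddr ($cloned->[0]) == refaddr ($cloned->[1]);
my $kept_same   = refaddr ($kept->[0])   == refaddr ($kept->[1]);

print "without allow_sharing: ", ($cloned_same ? "shared" : "cloned"), "\n";
print "with allow_sharing:    ", ($kept_same   ? "shared" : "cloned"), "\n";
```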
172    
173 root 1.11 $cbor = $cbor->allow_cycles ([$enable])
174     $enabled = $cbor->get_allow_cycles
175     If $enable is true (or missing), then "decode" will happily decode
176     self-referential (cyclic) data structures. By default these will not
177     be decoded, as they need manual cleanup to avoid memory leaks, so
178     code that isn't prepared for this will not leak memory.
179    
180     If $enable is false (the default), then "decode" will throw an error
181     when it encounters a self-referential/cyclic data structure.
182    
183     This option does not affect "encode" in any way - shared values and
184     references will always be encoded properly if present.
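
A sketch of the interplay between "allow_sharing" (needed to encode a
cycle) and "allow_cycles" (needed to decode it). The hash layout is
made up for illustration, and the cycles are broken by hand afterwards
to avoid the memory leak mentioned above.

```perl
use strict;
use warnings;
use CBOR::XS;
use Scalar::Util qw(refaddr);

my $node = { name => "loop" };
$node->{self} = $node;               # the hash now refers to itself

# Encoding a cycle requires allow_sharing.
my $cbor_data = CBOR::XS->new->allow_sharing->encode ($node);
delete $node->{self};                # break the cycle in the original

# A decoder without allow_cycles refuses the data ...
my $strict_ok = eval { CBOR::XS->new->decode ($cbor_data); 1 };
print "default decoder refused: $@" unless $strict_ok;

# ... while one with allow_cycles rebuilds the cycle.
my $decoded   = CBOR::XS->new->allow_cycles->decode ($cbor_data);
my $is_cyclic = refaddr ($decoded->{self}) == refaddr ($decoded);
print "decoded structure is cyclic\n" if $is_cyclic;

delete $decoded->{self};             # manual cleanup, as required
```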
185    
186 root 1.10 $cbor = $cbor->pack_strings ([$enable])
187     $enabled = $cbor->get_pack_strings
188 root 1.9 If $enable is true (or missing), then "encode" will try not to
189     encode the same string twice, but will instead encode a reference to
190 root 1.10 the string. Depending on your data format, this can save a
191 root 1.9 lot of space, but also results in a very large runtime overhead
192     (expect encoding times to be 2-4 times as high as without).
193    
194     It is recommended to leave it off unless you know your
195     communications partner supports the stringref extension to CBOR
196 root 1.10 (<http://cbor.schmorp.de/stringref>), as without decoder support,
197     the resulting data structure might not be usable.
198 root 1.9
199 root 1.10 If $enable is false (the default), then "encode" will encode strings
200     the standard CBOR way.
201 root 1.9
202     This option does not affect "decode" in any way - string references
203     will always be decoded properly if present.
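
The space saving can be checked with a sketch like the following. The
record layout is invented for illustration; how much is saved depends
entirely on how repetitive the data is.

```perl
use strict;
use warnings;
use CBOR::XS;

# Fifty records that repeat the same keys and one of the values.
my @records = map { +{ hostname => "server-$_.example.org", status => "running" } } 1 .. 50;

my $plain  = CBOR::XS->new->encode (\@records);
my $packed = CBOR::XS->new->pack_strings->encode (\@records);

printf "plain: %d octets, packed: %d octets\n", length $plain, length $packed;

# String references are always understood by this module's decoder.
my $decoded = decode_cbor $packed;
print "last host: $decoded->[-1]{hostname}\n";
```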
204    
205     $cbor = $cbor->filter ([$cb->($tag, $value)])
206     $cb_or_undef = $cbor->get_filter
207     Sets or replaces the tagged value decoding filter (when $cb is
208     specified) or clears the filter (if no argument or "undef" is
209     provided).
210    
211     The filter callback is called only during decoding, when a
212     non-enforced tagged value has been decoded (see "TAG HANDLING AND
213     EXTENSIONS" for a list of enforced tags). For specific tags, it's
214     often better to provide a default converter using the
215     %CBOR::XS::FILTER hash (see below).
216    
217     The first argument is the numerical tag, the second is the (decoded)
218     value that has been tagged.
219    
220     The filter function should return either exactly one value, which
221     will replace the tagged value in the decoded data structure, or no
222     values, which will result in default handling, which currently means
223     the decoder creates a "CBOR::XS::Tagged" object to hold the tag and
224     the value.
225    
226     When the filter is cleared (the default state), the default filter
227     function, "CBOR::XS::default_filter", is used. This function simply
228     looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists
229     it must be a code reference that is called with tag and value, and
230     is responsible for decoding the value. If no entry exists, it
231     returns no values.
232    
233     Example: decode all tags not handled internally into
234 root 1.10 "CBOR::XS::Tagged" objects, with no other special handling (useful
235 root 1.9 when working with potentially "unsafe" CBOR data).
236    
237     CBOR::XS->new->filter (sub { })->decode ($cbor_data);
238    
239     Example: provide a global filter for tag 1347375694, converting the
240     value into some string form.
241    
242     $CBOR::XS::FILTER{1347375694} = sub {
243     my ($tag, $value) = @_;
244    
245     "tag 1347375694 value $value"
246     };
247    
248 root 1.2 $cbor_data = $cbor->encode ($perl_scalar)
249     Converts the given Perl data structure (a scalar value) to its CBOR
250     representation.
251    
252     $perl_scalar = $cbor->decode ($cbor_data)
253     The opposite of "encode": expects CBOR data and tries to parse it,
254     returning the resulting simple scalar or reference. Croaks on error.
255    
256     ($perl_scalar, $octets) = $cbor->decode_prefix ($cbor_data)
257     This works like the "decode" method, but instead of raising an
258     exception when there is trailing garbage after the CBOR string, it
259     will silently stop parsing there and return the number of characters
260     consumed so far.
261    
262     This is useful if your CBOR texts are not delimited by an outer
263     protocol and you need to know where the first CBOR string ends and
264     the next one starts.
265    
266     CBOR::XS->new->decode_prefix ("......")
267     => ("...", 3)
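
A more concrete sketch: three CBOR values are concatenated without any
framing and then split apart again with "decode_prefix". The values
themselves are arbitrary.

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new;

# Concatenate several encoded values, as an unframed stream would.
my $stream = join "", map { $coder->encode ($_) } [1, 2, 3], { key => "value" }, "tail";

my @values;
while (length $stream) {
    my ($value, $octets) = $coder->decode_prefix ($stream);
    push @values, $value;
    substr $stream, 0, $octets, "";    # remove the decoded prefix
}

printf "decoded %d values\n", scalar @values;
```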
268    
269     MAPPING
270     This section describes how CBOR::XS maps Perl values to CBOR values and
271     vice versa. These mappings are designed to "do the right thing" in most
272     circumstances automatically, preserving round-tripping characteristics
273     (what you put in comes out as something equivalent).
274    
275     For the more enlightened: note that in the following descriptions,
276     lowercase *perl* refers to the Perl interpreter, while uppercase *Perl*
277     refers to the abstract Perl language itself.
278    
279     CBOR -> PERL
280 root 1.4 integers
281     CBOR integers become (numeric) perl scalars. On perls without 64 bit
282     support, 64 bit integers will be truncated or otherwise corrupted.
283    
284     byte strings
285 root 1.10 Byte strings will become octet strings in Perl (the Byte values
286 root 1.4 0..255 will simply become characters of the same value in Perl).
287    
288     UTF-8 strings
289     UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
290     decoded into proper Unicode code points. At the moment, the validity
291     of the UTF-8 octets will not be validated - corrupt input will
292     result in corrupted Perl strings.
293    
294     arrays, maps
295     CBOR arrays and CBOR maps will be converted into references to a
296     Perl array or hash, respectively. The keys of the map will be
297     stringified during this process.
298    
299 root 1.5 null
300     CBOR null becomes "undef" in Perl.
301    
302     true, false, undefined
303     These CBOR values become "Types::Serialiser::true",
304     "Types::Serialiser::false" and "Types::Serialiser::error",
305 root 1.2 respectively. They are overloaded to act almost exactly like the
306 root 1.5 numbers 1 and 0 (for true and false) or to throw an exception on
307     access (for error). See the Types::Serialiser manpage for details.
308    
309 root 1.9 tagged values
310     Tagged items consist of a numeric tag and another CBOR value.
311 root 1.2
312 root 1.9 See "TAG HANDLING AND EXTENSIONS" and the description of "->filter"
313 root 1.10 for details on which tags are handled how.
314 root 1.4
315     anything else
316     Anything else (e.g. unsupported simple values) will raise a decoding
317     error.
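
The mappings above can be observed with a small sketch such as this
one. The input array is arbitrary; the "is_bool" helper comes from
Types::Serialiser.

```perl
use strict;
use warnings;
use CBOR::XS;
use Types::Serialiser;

my $cbor_data = encode_cbor [
    Types::Serialiser::true (),
    Types::Serialiser::false (),
    undef,
    42,
    "octets",
];

my ($true, $false, $null, $int, $bytes) = @{ decode_cbor ($cbor_data) };

print "true is a boolean and is true\n"   if Types::Serialiser::is_bool ($true)  &&  $true;
print "false is a boolean and is false\n" if Types::Serialiser::is_bool ($false) && !$false;
print "null became undef\n"               unless defined $null;
print "integer: $int, byte string: $bytes\n";
```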
318 root 1.2
319     PERL -> CBOR
320     The mapping from Perl to CBOR is slightly more difficult, as Perl is a
321 root 1.10 typeless language. That means this module can only guess which CBOR type
322     is meant by a perl value.
323 root 1.2
324     hash references
325     Perl hash references become CBOR maps. As there is no inherent
326     ordering in hash keys (or CBOR maps), they will usually be encoded
327 root 1.10 in a pseudo-random order. This order can be different each time a
328     hash is encoded.
329 root 1.2
330 root 1.4 Currently, tied hashes will use the indefinite-length format, while
331     normal hashes will use the fixed-length format.
332    
333 root 1.2 array references
334 root 1.4 Perl array references become fixed-length CBOR arrays.
335 root 1.2
336     other references
337 root 1.10 Other unblessed references will be represented using the indirection
338     tag extension (tag value 22098,
339     <http://cbor.schmorp.de/indirection>). CBOR decoders are guaranteed
340     to be able to decode these values somehow, by either "doing the
341     right thing", decoding into a generic tagged object, simply ignoring
342     the tag, or something else.
343 root 1.4
344     CBOR::XS::Tagged objects
345     Objects of this type must be arrays consisting of a single "[tag,
346     value]" pair. The (numerical) tag will be encoded as a CBOR tag, the
347 root 1.10 value will be encoded as appropriate for the value. You must use
348 root 1.7 "CBOR::XS::tag" to create such objects.
349 root 1.2
350 root 1.5 Types::Serialiser::true, Types::Serialiser::false,
351     Types::Serialiser::error
352     These special values become CBOR true, CBOR false and CBOR undefined
353     values, respectively. You can also use "\1", "\0" and "\undef"
354     directly if you want.
355    
356     other blessed objects
357     Other blessed objects are serialised via "TO_CBOR" or "FREEZE". See
358 root 1.9 "TAG HANDLING AND EXTENSIONS" for specific classes handled by this
359     module, and "OBJECT SERIALISATION" for generic object serialisation.
360 root 1.2
361     simple scalars
362 root 1.9 Simple Perl scalars (any scalar that is not a reference) are the
363     most difficult objects to encode: CBOR::XS will encode undefined
364 root 1.4 scalars as CBOR null values, scalars that have last been used in a
365 root 1.2 string context before encoding as CBOR strings, and anything else as
366     number value:
367    
368     # dump as number
369     encode_cbor [2] # yields [2]
370     encode_cbor [-3.0e17] # yields [-3e+17]
371     my $value = 5; encode_cbor [$value] # yields [5]
372    
373 root 1.10 # used as string, so dump as string (either byte or text)
374 root 1.2 print $value;
375     encode_cbor [$value] # yields ["5"]
376    
377     # undef becomes null
378     encode_cbor [undef] # yields [null]
379    
380     You can force the type to be a CBOR string by stringifying it:
381    
382     my $x = 3.1; # some variable containing a number
383     "$x"; # stringified
384     $x .= ""; # another, more awkward way to stringify
385     print $x; # perl does it for you, too, quite often
386    
387 root 1.10 You can force whether a string is encoded as byte or text string by
388     using "utf8::upgrade" and "utf8::downgrade":
389    
390     utf8::upgrade $x; # encode $x as text string
391     utf8::downgrade $x; # encode $x as byte string
392    
393     Perl doesn't define what operations up- and downgrade strings, so if
394     the difference between byte and text is important, you should up- or
395     downgrade your string as late as possible before encoding.
396    
397 root 1.2 You can force the type to be a CBOR number by numifying it:
398    
399     my $x = "3"; # some variable containing a string
400     $x += 0; # numify it, ensuring it will be dumped as a number
401     $x *= 1; # same thing, the choice is yours.
402    
403     You can not currently force the type in other, less obscure, ways.
404     Tell me if you need this capability (but don't forget to explain why
405     it's needed :).
406    
407 root 1.4 Perl values that seem to be integers generally use the shortest
408     possible representation. Floating-point values will use either the
409     IEEE single format if possible without loss of precision, otherwise
410     the IEEE double format will be used. Perls that use formats other
411     than IEEE double to represent numerical values are supported, but
412     might suffer loss of precision.
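
The type guessing described above can be made visible by dumping the
encoded octets in hex. The helper "hex_of" is hypothetical and exists
only for this sketch; per RFC 7049, a leading 0x0N octet is a small
unsigned integer, 0x4N a byte string and 0x6N a text string of N
octets.

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical helper: hex dump of the CBOR encoding of one scalar.
sub hex_of { unpack "H*", encode_cbor $_[0] }

my $number   = 5;                          # never used as a string
my $bytes    = "5";                        # plain (downgraded) string
my $text     = "5"; utf8::upgrade $text;   # upgraded string
my $numified = "5"; $numified += 0;        # string forced to a number

print "number:   ", hex_of ($number),   "\n";    # 05
print "bytes:    ", hex_of ($bytes),    "\n";    # 4135
print "text:     ", hex_of ($text),     "\n";    # 6135
print "numified: ", hex_of ($numified), "\n";    # 05
```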
413 root 1.2
414 root 1.5 OBJECT SERIALISATION
415 root 1.11 This module implements both a CBOR-specific and the generic
416     Types::Serialiser object serialisation protocol. The following
417     subsections explain both methods.
418    
419     ENCODING
420 root 1.5 This module knows two ways to serialise a Perl object: The CBOR-specific
421     way, and the generic way.
422    
423 root 1.11 Whenever the encoder encounters a Perl object that it cannot serialise
424 root 1.5 directly (most of them), it will first look up the "TO_CBOR" method on
425     it.
426    
427     If it has a "TO_CBOR" method, it will call it with the object as only
428     argument, and expects exactly one return value, which it will then
429     substitute and encode it in the place of the object.
430    
431     Otherwise, it will look up the "FREEZE" method. If it exists, it will
432     call it with the object as first argument, and the constant string
433     "CBOR" as the second argument, to distinguish it from other serialisers.
434    
435     The "FREEZE" method can return any number of values (i.e. zero or more).
436     These will be encoded as a CBOR perl object, together with the classname.
437    
438 root 1.11 These methods *MUST NOT* change the data structure that is being
439     serialised. Failure to comply with this can result in memory corruption -
440     and worse.
441    
442 root 1.5 If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail
443     with an error.
444    
445 root 1.11 DECODING
446     Objects encoded via "TO_CBOR" cannot (normally) be automatically
447     decoded, but objects encoded via "FREEZE" can be decoded using the
448     following protocol:
449 root 1.5
450     When an encoded CBOR perl object is encountered by the decoder, it will
451     look up the "THAW" method, by using the stored classname, and will fail
452     if the method cannot be found.
453    
454     After the lookup it will call the "THAW" method with the stored
455     classname as first argument, the constant string "CBOR" as second
456     argument, and all values returned by "FREEZE" as remaining arguments.
457    
458     EXAMPLES
459     Here is an example "TO_CBOR" method:
460    
461     sub My::Object::TO_CBOR {
462     my ($obj) = @_;
463    
464     ["this is a serialised My::Object object", $obj->{id}]
465     }
466    
467     When a "My::Object" is encoded to CBOR, it will instead encode a simple
468     array with two members: a string, and the "object id". Decoding this
469     CBOR string will yield a normal perl array reference in place of the
470     object.
471    
472     A more useful and practical example would be a serialisation method for
473     the URI module. CBOR has a custom tag value for URIs, namely 32:
474    
475     sub URI::TO_CBOR {
476     my ($self) = @_;
477     my $uri = "$self"; # stringify uri
478     utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string
479 root 1.10 CBOR::XS::tag 32, $uri
480 root 1.5 }
481    
482     This will encode URIs as a UTF-8 string with tag 32, which indicates an
483     URI.
484    
485     Decoding such an URI will not (currently) give you an URI object, but
486     instead a CBOR::XS::Tagged object with tag number 32 and the string -
487     exactly what was returned by "TO_CBOR".
488    
489     To serialise an object so it can automatically be deserialised, you need
490     to use "FREEZE" and "THAW". To take the URI module as example, this
491     would be a possible implementation:
492    
493     sub URI::FREEZE {
494     my ($self, $serialiser) = @_;
495     "$self" # encode url string
496     }
497    
498     sub URI::THAW {
499     my ($class, $serialiser, $uri) = @_;
500    
501     $class->new ($uri)
502     }
503    
504     Unlike "TO_CBOR", multiple values can be returned by "FREEZE". For
505     example, a "FREEZE" method that returns "type", "id" and "variant"
506     values would cause an invocation of "THAW" with 5 arguments:
507    
508     sub My::Object::FREEZE {
509     my ($self, $serialiser) = @_;
510    
511     ($self->{type}, $self->{id}, $self->{variant})
512     }
513    
514     sub My::Object::THAW {
515     my ($class, $serialiser, $type, $id, $variant) = @_;
516    
517     $class->new (type => $type, id => $id, variant => $variant)
518     }
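
Putting the pieces together, here is a self-contained round-trip
sketch of the "FREEZE"/"THAW" protocol. The class "My::Point" and its
fields are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use CBOR::XS;

{
    package My::Point;    # hypothetical example class

    sub new {
        my ($class, %args) = @_;
        bless { x => $args{x}, y => $args{y} }, $class
    }

    # Called by the encoder; $serialiser is the string "CBOR".
    sub FREEZE {
        my ($self, $serialiser) = @_;
        ($self->{x}, $self->{y})
    }

    # Called by the decoder with the values returned by FREEZE.
    sub THAW {
        my ($class, $serialiser, $x, $y) = @_;
        $class->new (x => $x, y => $y)
    }
}

my $point = My::Point->new (x => 3, y => 4);
my $copy  = decode_cbor (encode_cbor ($point));

printf "%s at (%d, %d)\n", ref $copy, $copy->{x}, $copy->{y};
```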
519    
520     MAGIC HEADER
521 root 1.3 There is no way to distinguish CBOR from other formats programmatically.
522     To make it easier to distinguish CBOR from other formats, the CBOR
523     specification has a special "magic string" that can be prepended to any
524 root 1.9 CBOR string without changing its meaning.
525 root 1.3
526     This string is available as $CBOR::XS::MAGIC. This module does not
527 root 1.9 prepend this string to the CBOR data it generates, but it will ignore it
528 root 1.3 if present, so users can prepend this string as a "file type" indicator
529     as required.
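
A brief sketch of prepending the magic string by hand and decoding the
result. The payload is arbitrary.

```perl
use strict;
use warnings;
use CBOR::XS;

my $plain  = encode_cbor [1, 2, 3];
my $marked = $CBOR::XS::MAGIC . $plain;    # prepend the "file type" marker

printf "magic header: %s\n", unpack "H*", $CBOR::XS::MAGIC;

# A cheap check a reader could perform before decoding.
my $has_magic = substr ($marked, 0, length $CBOR::XS::MAGIC) eq $CBOR::XS::MAGIC;
print "data starts with the magic header\n" if $has_magic;

# The decoder ignores the header, so both forms decode to the same value.
my $decoded = decode_cbor $marked;
print "decoded: @$decoded\n";
```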
530    
531 root 1.7 THE CBOR::XS::Tagged CLASS
532     CBOR has the concept of tagged values - any CBOR value can be tagged
533     with a numeric 64 bit tag; tag numbers are centrally administered.
534    
535     "CBOR::XS" handles a few tags internally when en- or decoding. You can
536     also create tags yourself by encoding "CBOR::XS::Tagged" objects, and
537     the decoder will create "CBOR::XS::Tagged" objects itself when it hits
538     an unknown tag.
539    
540     These objects are simply blessed array references - the first member of
541     the array being the numerical tag, the second being the value.
542    
543     You can interact with "CBOR::XS::Tagged" objects in the following ways:
544    
545     $tagged = CBOR::XS::tag $tag, $value
546     This function(!) creates a new "CBOR::XS::Tagged" object using the
547     given $tag (0..2**64-1) to tag the given $value (which can be any
548     Perl value that can be encoded in CBOR, including serialisable Perl
549     objects and "CBOR::XS::Tagged" objects).
550    
551     $tagged->[0]
552     $tagged->[0] = $new_tag
553     $tag = $tagged->tag
554     $new_tag = $tagged->tag ($new_tag)
555     Access/mutate the tag.
556    
557     $tagged->[1]
558     $tagged->[1] = $new_value
559     $value = $tagged->value
560     $new_value = $tagged->value ($new_value)
561     Access/mutate the tagged value.
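
As a sketch of the above, the following creates a tagged value, sends
it through the encoder and decoder, and inspects the result through
the array interface. The tag number is the same arbitrary one used in
the filter example and is assumed to have no filter registered; the
"tag" and "value" methods listed above should give the same results.

```perl
use strict;
use warnings;
use CBOR::XS;

my $tagged = CBOR::XS::tag (1347375694, "payload");
print "created a ", ref $tagged, " object\n";

# Tags without a registered filter come back as CBOR::XS::Tagged objects.
my $decoded = decode_cbor (encode_cbor ($tagged));

printf "class: %s\n", ref $decoded;
printf "tag:   %d\n", $decoded->[0];
printf "value: %s\n", $decoded->[1];

# Members can be changed in place before re-encoding.
$decoded->[1] = "new payload";
my $reencoded = encode_cbor $decoded;
```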
562    
563     EXAMPLES
564     Here are some examples of "CBOR::XS::Tagged" uses to tag objects.
565    
566     You can look up CBOR tag values and meanings in the IANA registry at
567     <http://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml>.
568    
569     Prepend a magic header ($CBOR::XS::MAGIC):
570    
571     my $cbor = encode_cbor CBOR::XS::tag 55799, $value;
572     # same as:
573     my $cbor = $CBOR::XS::MAGIC . encode_cbor $value;
574    
575     Serialise some URIs and a regex in an array:
576    
577     my $cbor = encode_cbor [
578     (CBOR::XS::tag 32, "http://www.nethype.de/"),
579     (CBOR::XS::tag 32, "http://software.schmorp.de/"),
580     (CBOR::XS::tag 35, "^[Pp][Ee][Rr][lL]\$"),
581     ];
582    
583     Wrap CBOR data in CBOR:
584    
585     my $cbor_cbor = encode_cbor
586     CBOR::XS::tag 24,
587     encode_cbor [1, 2, 3];
588    
589 root 1.9 TAG HANDLING AND EXTENSIONS
590     This section describes how this module handles specific tagged values
591     and extensions. If a tag is not mentioned here and no additional filters
592     are provided for it, then the default handling applies (creating a
593     CBOR::XS::Tagged object on decoding, and only encoding the tag when
594     explicitly requested).
595    
596     Tags not handled specifically are currently converted into a
597     CBOR::XS::Tagged object, which is simply a blessed array reference
598     consisting of the numeric tag value followed by the (decoded) CBOR
599     value.
600    
601     Future versions of this module reserve the right to special case
602     additional tags (such as base64url).
603    
604     ENFORCED TAGS
605     These tags are always handled when decoding, and their handling cannot
606     be overridden by the user.
607    
608 root 1.10 26 (perl-object, <http://cbor.schmorp.de/perl-object>)
609 root 1.9 These tags are automatically created (and decoded) for serialisable
610     objects using the "FREEZE/THAW" methods (the Types::Serialiser object
611     serialisation protocol). See "OBJECT SERIALISATION" for details.
612    
613 root 1.11 28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>)
614     These tags are automatically decoded when encountered (as long as
615     they do not result in a cyclic data structure, see "allow_cycles"),
616     resulting in shared values in the decoded object. They are only
617     encoded, however, when "allow_sharing" is enabled.
618    
619     Not all shared values can be successfully decoded: values that
620     reference themselves will *currently* decode as "undef" (this is not
621     the same as a reference pointing to itself, which will be
622     represented as a value that contains an indirect reference to itself
623     - these will be decoded properly).
624    
625     Note that considerably more shared value data structures can be
626     decoded than will be encoded - currently, only values pointed to by
627     references will be shared, others will not. While non-reference
628     shared values can be generated in Perl with some effort, they were
629     considered too unimportant to be supported in the encoder. The
630     decoder, however, will decode these values as shared values.
631 root 1.9
632 root 1.10 256, 25 (stringref-namespace, stringref,
633 root 1.9 <http://cbor.schmorp.de/stringref>)
634     These tags are automatically decoded when encountered. They are only
635 root 1.10 encoded, however, when "pack_strings" is enabled.
636 root 1.9
637     22098 (indirection, <http://cbor.schmorp.de/indirection>)
638     This tag is automatically generated when a reference is encountered
639     (with the exception of hash and array references). It is converted to
640     a reference when decoding.
641    
642     55799 (self-describe CBOR, RFC 7049)
643     This value is not generated on encoding (unless explicitly requested
644     by the user), and is simply ignored when decoding.
645    
646     NON-ENFORCED TAGS
647     These tags have default filters provided when decoding. Their handling
648     can be overridden by changing the %CBOR::XS::FILTER entry for the tag, or
649     by providing a custom "filter" callback when decoding.
650    
651     When they result in decoding into a specific Perl class, the module
652     usually provides a corresponding "TO_CBOR" method as well.
653    
654     When any of these need to load additional modules that are not part of
655     the perl core distribution (e.g. URI), it is (currently) up to the user
656     to provide these modules. The decoding usually fails with an exception
657     if the required module cannot be loaded.
658    
659     2, 3 (positive/negative bignum)
660     These tags are decoded into Math::BigInt objects. The corresponding
661     "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal
662     CBOR integers, and others into positive/negative CBOR bignums.
663    
664     4, 5 (decimal fraction/bigfloat)
665     Both decimal fractions and bigfloats are decoded into Math::BigFloat
666     objects. The corresponding "Math::BigFloat::TO_CBOR" method *always*
667     encodes into a decimal fraction.
668    
669     CBOR cannot represent bigfloats with *very* large exponents -
670     conversion of such big float objects is undefined.
671    
672     Also, NaN and infinities are not encoded properly.
673    
674     21, 22, 23 (expected later JSON conversion)
675     CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore
676     these tags.
677    
678     32 (URI)
679     These objects decode into URI objects. The corresponding
680     "URI::TO_CBOR" method again results in a CBOR URI value.
681    
682 root 1.5 CBOR and JSON
683 root 1.4 CBOR is supposed to implement a superset of the JSON data model, and is,
684     with some coercion, able to represent all JSON texts (something that
685     other "binary JSON" formats such as BSON generally do not support).
686    
687     CBOR implements some extra hints and support for JSON interoperability,
688     and the spec offers further guidance for conversion between CBOR and
689     JSON. None of this is currently implemented in CBOR::XS, and the guidelines
690     in the spec do not result in correct round-tripping of data. If JSON
691     interoperability is improved in the future, then the goal will be to
692     ensure that decoded JSON data will round-trip encoding and decoding to
693     CBOR intact.
694 root 1.2
695     SECURITY CONSIDERATIONS
696     When you are using CBOR in a protocol, talking to untrusted, potentially
697     hostile creatures requires relatively few measures.
698    
699     First of all, your CBOR decoder should be secure, that is, should not
700     have any buffer overflows. Obviously, this module should ensure that and
701     I am trying hard to make that true, but you never know.
702    
703     Second, you need to avoid resource-starving attacks. That means you
704     should limit the size of CBOR data you accept, or make sure that when
705     your resources run out, that's just fine (e.g. by using a separate
706     process that can crash safely). The size of a CBOR string in octets is
707     usually a good indication of the size of the resources required to
708     decode it into a Perl structure. While CBOR::XS can check the size of
709     the CBOR text, it might be too late when you already have it in memory,
710     so you might want to check the size before you accept the string.
711    
712     Third, CBOR::XS recurses using the C stack when decoding objects and
713     arrays. The C stack is a limited resource: for instance, on my amd64
714     machine with 8MB of stack size I can decode around 180k nested arrays
715     but only 14k nested CBOR objects (due to perl itself recursing deeply on
716     croak to free the temporary). If that is exceeded, the program crashes.
717     To be conservative, the default nesting limit is set to 512. If your
718     process has a smaller stack, you should adjust this setting accordingly
719     with the "max_depth" method.
720    
721     Something else could bomb you, too, that I forgot to think of. In that
722     case, you get to keep the pieces. I am always open to hints, though...
723    
724     Also keep in mind that CBOR::XS might leak contents of your Perl data
725     structures in its error messages, so when you serialise sensitive
726     information you might want to make sure that exceptions thrown by
727     CBOR::XS will not end up in front of untrusted eyes.
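
The measures above could be combined roughly as follows. This is a sketch under stated assumptions, not a recommended configuration: the two limits and the wrapper name "decode_untrusted" are hypothetical, and the warning is assumed to end up in a private log. Ideally the size check happens in the transport layer, before the whole string is in memory.

```perl
use strict;
use warnings;
use CBOR::XS;

my $MAX_OCTETS = 64 * 1024;    # hypothetical limit, tune per protocol
my $MAX_DEPTH  = 32;           # hypothetical, well below the default of 512

my $coder = CBOR::XS->new->max_size($MAX_OCTETS)->max_depth($MAX_DEPTH);

# Hypothetical wrapper for input received from an untrusted peer.
sub decode_untrusted {
    my ($octets) = @_;

    # Reject oversized input before any decoding work is done.
    die "CBOR input rejected\n" if length $octets > $MAX_OCTETS;

    my $value;
    unless (eval { $value = $coder->decode($octets); 1 }) {
        # $@ may quote parts of the data: keep it away from the peer.
        warn "CBOR decode error: $@";
        die "CBOR input rejected\n";
    }
    return $value;
}

my $value = decode_untrusted(encode_cbor([1, 2, 3]));
print scalar(@$value), " elements\n";
```

The peer only ever sees the generic "rejected" message, while the detailed exception text stays on the local side.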
728    
729     CBOR IMPLEMENTATION NOTES
730     This section contains some random implementation notes. They do not
731     describe guaranteed behaviour, but merely behaviour as implemented
732     right now.
733    
734     64 bit integers are only properly decoded when Perl was built with 64
735     bit support.
736    
737     Strings and arrays are encoded with a definite length. Hashes as well,
738     unless they are tied (or otherwise magical).
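
To see what "definite length" looks like, the octets of a small array can be dumped in hex. In RFC 7049 the initial octet 0x83 means "array of exactly 3 items", whereas an indefinite-length array would start with 0x9f and end with the break octet 0xff. As with the rest of this section, this shows current behaviour rather than a guarantee.

```perl
use strict;
use warnings;
use CBOR::XS;

# Encode a plain (non-magical) array and show its octets in hex.
my $hex = unpack "H*", encode_cbor([1, 2, 3]);

print "$hex\n";    # 83010203: array header 0x83, then the items 1, 2, 3
```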
739    
740     Only the double data type is supported for NV data types - when Perl
741     uses long double to represent floating point values, they might not be
742     encoded properly. Half precision types are accepted, but not encoded.
743    
744     Strict mode and canonical mode are not implemented.
745    
746 root 1.11 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
747     On perls that were built without 64 bit integer support (these are rare
748     nowadays, even on 32 bit architectures), support for any kind of 64 bit
749     integer in CBOR is very limited - most likely, these 64 bit values will
750     be truncated, corrupted, or otherwise not decoded correctly. This also
751     includes string, array and map sizes that are stored as 64 bit integers.
752    
753 root 1.2 THREADS
754     This module is *not* guaranteed to be thread safe and there are no plans
755     to change this until Perl gets thread support (as opposed to the
756     horribly slow so-called "threads" which are simply slow and bloated
757     process simulations - use fork, it's *much* faster, cheaper, better).
758    
759     (It might actually work, but you have been warned).
760    
761     BUGS
762     While the goal of this module is to be correct, that unfortunately does
763     not mean it's bug-free, only that I think its design is bug-free. If you
764     keep reporting bugs they will be fixed swiftly, though.
765    
766     Please refrain from using rt.cpan.org or any other bug reporting
767     service. I put the contact address into my modules for a reason.
768    
769     SEE ALSO
770     The JSON and JSON::XS modules that do similar, but human-readable,
771     serialisation.
772    
773 root 1.5 The Types::Serialiser module provides the data model for true, false and
774     error values.
775    
776 root 1.2 AUTHOR
777     Marc Lehmann <schmorp@schmorp.de>
778     http://home.schmorp.de/
779