… | |
… | |
44 | about 20% smaller than the same data encoded as (compact) JSON or |
44 | about 20% smaller than the same data encoded as (compact) JSON or |
45 | Storable. |
45 | Storable. |
46 | |
46 | |
47 | In addition to the core CBOR data format, this module implements a |
47 | In addition to the core CBOR data format, this module implements a |
48 | number of extensions, to support cyclic and shared data structures (see |
48 | number of extensions, to support cyclic and shared data structures (see |
49 | "allow_sharing"), string deduplication (see "pack_strings") and scalar |
49 | "allow_sharing" and "allow_cycles"), string deduplication (see |
50 | references (always enabled). |
50 | "pack_strings") and scalar references (always enabled). |
51 | |
51 | |
52 | The primary goal of this module is to be *correct* and the secondary |
52 | The primary goal of this module is to be *correct* and the secondary |
53 | goal is to be *fast*. To reach the latter goal it was written in C. |
53 | goal is to be *fast*. To reach the latter goal it was written in C. |
54 | |
54 | |
55 | See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and |
55 | See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and |
… | |
… | |
141 | instead will emit a reference to the earlier value. |
141 | instead will emit a reference to the earlier value. |
142 | |
142 | |
143 | This means that such values will only be encoded once, and will not |
143 | This means that such values will only be encoded once, and will not |
144 | result in a deep cloning of the value on decode, in decoders |
144 | result in a deep cloning of the value on decode, in decoders |
145 | supporting the value sharing extension. This also makes it possible |
145 | supporting the value sharing extension. This also makes it possible |
146 | to encode cyclic data structures. |
146 | to encode cyclic data structures (which need "allow_cycles" to be |
147 | enabled to be decoded by this module). |
147 | |
148 | |
148 | It is recommended to leave it off unless you know your communication |
149 | It is recommended to leave it off unless you know your communication |
149 | partner supports the value sharing extensions to CBOR |
150 | partner supports the value sharing extensions to CBOR |
150 | (<http://cbor.schmorp.de/value-sharing>), as without decoder |
151 | (<http://cbor.schmorp.de/value-sharing>), as without decoder |
151 | support, the resulting data structure might be unusable. |
152 | support, the resulting data structure might be unusable. |
152 | |
153 | |
153 | Detecting shared values incurs a runtime overhead when values are |
154 | Detecting shared values incurs a runtime overhead when values are |
154 | encoded that have a reference counter larger than one, and might |
155 | encoded that have a reference counter larger than one, and might |
155 | unnecessarily increase the encoded size, as potentially shared |
156 | unnecessarily increase the encoded size, as potentially shared |
156 | values are encoded as sharable whether or not they are actually |
157 | values are encoded as shareable whether or not they are actually |
157 | shared. |
158 | shared. |
158 | |
159 | |
159 | At the moment, only targets of references can be shared (e.g. |
160 | At the moment, only targets of references can be shared (e.g. |
160 | scalars, arrays or hashes pointed to by a reference). Weirder |
161 | scalars, arrays or hashes pointed to by a reference). Weirder |
161 | constructs, such as an array with multiple "copies" of the *same* |
162 | constructs, such as an array with multiple "copies" of the *same* |
… | |
… | |
165 | If $enable is false (the default), then "encode" will encode shared |
166 | If $enable is false (the default), then "encode" will encode shared |
166 | data structures repeatedly, unsharing them in the process. Cyclic |
167 | data structures repeatedly, unsharing them in the process. Cyclic |
167 | data structures cannot be encoded in this mode. |
168 | data structures cannot be encoded in this mode. |
168 | |
169 | |
169 | This option does not affect "decode" in any way - shared values and |
170 | This option does not affect "decode" in any way - shared values and |
171 | references will always be decoded properly if present. |
172 | |
173 | $cbor = $cbor->allow_cycles ([$enable]) |
174 | $enabled = $cbor->get_allow_cycles |
175 | If $enable is true (or missing), then "decode" will happily decode |
176 | self-referential (cyclic) data structures. By default these will not |
177 | be decoded, as they need manual cleanup to avoid memory leaks, so |
178 | code that isn't prepared for this will not leak memory. |
179 | |
180 | If $enable is false (the default), then "decode" will throw an error |
181 | when it encounters a self-referential/cyclic data structure. |
182 | |
183 | This option does not affect "encode" in any way - shared values and |
170 | references will always be decoded properly if present. |
184 | references will always be decoded properly if present. |
171 | |
185 | |
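The interplay of "allow_sharing" and "allow_cycles" described above can be sketched roughly as follows. This is only an illustration, assuming a CBOR::XS version that already provides "allow_cycles"; the variable names and the sample hash are made up for the example.

```perl
use strict;
use warnings;
use CBOR::XS;

# A hash that holds a reference back to itself (a cycle).
my $node = { name => "root" };
$node->{self} = $node;

# allow_sharing is needed to encode the cycle, allow_cycles to decode it.
my $cbor = CBOR::XS->new->allow_sharing->allow_cycles;
my $data = $cbor->encode ($node);
my $copy = $cbor->decode ($data);

# A decoder without allow_cycles is expected to refuse the same data.
my $strict   = CBOR::XS->new;
my $rejected = !eval { $strict->decode ($data); 1 };

# Break both cycles by hand so that neither structure leaks.
delete $node->{self};
delete $copy->{self};
```

The two "delete" calls are the manual cleanup the text refers to: as long as the hash references itself, its reference count never drops to zero.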
172 | $cbor = $cbor->pack_strings ([$enable]) |
186 | $cbor = $cbor->pack_strings ([$enable]) |
173 | $enabled = $cbor->get_pack_strings |
187 | $enabled = $cbor->get_pack_strings |
174 | If $enable is true (or missing), then "encode" will try not to |
188 | If $enable is true (or missing), then "encode" will try not to |
… | |
… | |
186 | the standard CBOR way. |
200 | the standard CBOR way. |
187 | |
201 | |
188 | This option does not affect "decode" in any way - string references |
202 | This option does not affect "decode" in any way - string references |
189 | will always be decoded properly if present. |
203 | will always be decoded properly if present. |
190 | |
204 | |
205 | $cbor = $cbor->validate_utf8 ([$enable]) |
206 | $enabled = $cbor->get_validate_utf8 |
207 | If $enable is true (or missing), then "decode" will validate that |
208 | elements (text strings) containing UTF-8 data in fact contain valid |
209 | UTF-8 data (instead of blindly accepting it). This validation |
210 | obviously takes extra time during decoding. |
211 | |
212 | The concept of "valid UTF-8" used is perl's concept, which is a |
213 | superset of the official UTF-8. |
214 | |
215 | If $enable is false (the default), then "decode" will blindly accept |
216 | UTF-8 data, marking them as valid UTF-8 in the resulting data |
217 | structure regardless of whether that's true or not. |
218 | |
219 | Perl isn't too happy about corrupted UTF-8 in strings, but should |
220 | generally not crash or do similarly evil things. Extensions might |
221 | not be so forgiving, so it's recommended to turn on this setting if you |
222 | receive untrusted CBOR. |
223 | |
224 | This option does not affect "encode" in any way - strings that are |
225 | supposedly valid UTF-8 will simply be dumped into the resulting CBOR |
226 | string without checking whether that is, in fact, true or not. |
227 | |
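A minimal sketch of what "validate_utf8" is meant to catch, assuming a CBOR::XS version that provides the option. The three input bytes are hand-built for the example: 0x62 introduces a CBOR text string of length two, and the two bytes that follow are deliberately not valid UTF-8.

```perl
use strict;
use warnings;
use CBOR::XS;

# 0x62 starts a two-byte CBOR text string; "\xc3\x28" is not valid UTF-8.
my $untrusted = "\x62\xc3\x28";

# With validation enabled, decode is expected to throw on the bad string.
my $checked  = CBOR::XS->new->validate_utf8;
my $accepted = eval { $checked->decode ($untrusted); 1 };

print $accepted ? "accepted\n" : "rejected: $@";
```

Wrapping "decode" in "eval" keeps a malformed message from terminating the program, which is usually what you want when the CBOR comes from an untrusted peer.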
191 | $cbor = $cbor->filter ([$cb->($tag, $value)]) |
228 | $cbor = $cbor->filter ([$cb->($tag, $value)]) |
192 | $cb_or_undef = $cbor->get_filter |
229 | $cb_or_undef = $cbor->get_filter |
193 | Sets or replaces the tagged value decoding filter (when $cb is |
230 | Sets or replaces the tagged value decoding filter (when $cb is |
194 | specified) or clears the filter (if no argument or "undef" is |
231 | specified) or clears the filter (if no argument or "undef" is |
195 | provided). |
232 | provided). |
… | |
… | |
396 | the IEEE double format will be used. Perls that use formats other |
433 | the IEEE double format will be used. Perls that use formats other |
397 | than IEEE double to represent numerical values are supported, but |
434 | than IEEE double to represent numerical values are supported, but |
398 | might suffer loss of precision. |
435 | might suffer loss of precision. |
399 | |
436 | |
400 | OBJECT SERIALISATION |
437 | OBJECT SERIALISATION |
438 | This module implements both a CBOR-specific and the generic |
439 | Types::Serialiser object serialisation protocol. The following |
440 | subsections explain both methods. |
441 | |
442 | ENCODING |
401 | This module knows two ways to serialise a Perl object: The CBOR-specific |
443 | This module knows two ways to serialise a Perl object: The CBOR-specific |
402 | way, and the generic way. |
444 | way, and the generic way. |
403 | |
445 | |
404 | Whenever the encoder encounters a Perl object that it cnanot serialise |
446 | Whenever the encoder encounters a Perl object that it cannot serialise |
405 | directly (most of them), it will first look up the "TO_CBOR" method on |
447 | directly (most of them), it will first look up the "TO_CBOR" method on |
406 | it. |
448 | it. |
407 | |
449 | |
408 | If it has a "TO_CBOR" method, it will call it with the object as only |
450 | If it has a "TO_CBOR" method, it will call it with the object as only |
409 | argument, and expects exactly one return value, which it will then |
451 | argument, and expects exactly one return value, which it will then |
… | |
… | |
414 | "CBOR" as the second argument, to distinguish it from other serialisers. |
456 | "CBOR" as the second argument, to distinguish it from other serialisers. |
415 | |
457 | |
416 | The "FREEZE" method can return any number of values (i.e. zero or more). |
458 | The "FREEZE" method can return any number of values (i.e. zero or more). |
417 | These will be encoded as a CBOR perl object, together with the classname. |
459 | These will be encoded as a CBOR perl object, together with the classname. |
418 | |
460 | |
461 | These methods *MUST NOT* change the data structure that is being |
462 | serialised. Failure to comply with this can result in memory corruption - |
463 | and worse. |
464 | |
419 | If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail |
465 | If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail |
420 | with an error. |
466 | with an error. |
421 | |
467 | |
468 | DECODING |
422 | Objects encoded via "TO_CBOR" cannot be automatically decoded, but |
469 | Objects encoded via "TO_CBOR" cannot (normally) be automatically |
423 | objects encoded via "FREEZE" can be decoded using the following |
470 | decoded, but objects encoded via "FREEZE" can be decoded using the |
424 | protocol: |
471 | following protocol: |
425 | |
472 | |
426 | When an encoded CBOR perl object is encountered by the decoder, it will |
473 | When an encoded CBOR perl object is encountered by the decoder, it will |
427 | look up the "THAW" method, by using the stored classname, and will fail |
474 | look up the "THAW" method, by using the stored classname, and will fail |
428 | if the method cannot be found. |
475 | if the method cannot be found. |
429 | |
476 | |
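The "FREEZE"/"THAW" round trip can be sketched as below. The class My::Point and its fields are hypothetical, and the argument order of "THAW" (class name, the string "CBOR", then the frozen values) is taken from the Types::Serialiser protocol rather than from the text above, so treat it as an assumption.

```perl
use strict;
use warnings;

package My::Point;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { x => $args{x}, y => $args{y} }, $class;
}

# Called by the encoder; the second argument is the string "CBOR".
# Returns the values to store, without modifying the object.
sub FREEZE {
    my ($self, $serialiser) = @_;
    return ($self->{x}, $self->{y});
}

# Called by the decoder with the class name, "CBOR" and the frozen values.
sub THAW {
    my ($class, $serialiser, $x, $y) = @_;
    return $class->new (x => $x, y => $y);
}

package main;

use CBOR::XS;

my $data = encode_cbor (My::Point->new (x => 1, y => 2));
my $copy = decode_cbor ($data);
```

Note that "FREEZE" only reads from the object, in line with the rule that these methods must not change the data structure being serialised.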
… | |
… | |
584 | 26 (perl-object, <http://cbor.schmorp.de/perl-object>) |
631 | 26 (perl-object, <http://cbor.schmorp.de/perl-object>) |
585 | These tags are automatically created (and decoded) for serialisable |
632 | These tags are automatically created (and decoded) for serialisable |
586 | objects using the "FREEZE/THAW" methods (the Types::Serialiser object |
633 | objects using the "FREEZE/THAW" methods (the Types::Serialiser object |
587 | serialisation protocol). See "OBJECT SERIALISATION" for details. |
634 | serialisation protocol). See "OBJECT SERIALISATION" for details. |
588 | |
635 | |
589 | 28, 29 (sharable, sharedref, <http://cbor.schmorp.de/value-sharing>) |
636 | 28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>) |
590 | These tags are automatically decoded when encountered, resulting in |
637 | These tags are automatically decoded when encountered (and they do |
638 | not result in a cyclic data structure, see "allow_cycles"), |
591 | shared values in the decoded object. They are only encoded, however, |
639 | resulting in shared values in the decoded object. They are only |
592 | when "allow_sharable" is enabled. |
640 | encoded, however, when "allow_sharing" is enabled. |
641 | |
642 | Not all shared values can be successfully decoded: values that |
643 | reference themselves will *currently* decode as "undef" (this is not |
644 | the same as a reference pointing to itself, which will be |
645 | represented as a value that contains an indirect reference to itself |
646 | - these will be decoded properly). |
647 | |
648 | Note that considerably more shared value data structures can be |
649 | decoded than will be encoded - currently, only values pointed to by |
650 | references will be shared, others will not. While non-reference |
651 | shared values can be generated in Perl with some effort, they were |
652 | considered too unimportant to be supported in the encoder. The |
653 | decoder, however, will decode these values as shared values. |
593 | |
654 | |
594 | 256, 25 (stringref-namespace, stringref, |
655 | 256, 25 (stringref-namespace, stringref, |
595 | <http://cbor.schmorp.de/stringref>) |
656 | <http://cbor.schmorp.de/stringref>) |
596 | These tags are automatically decoded when encountered. They are only |
657 | These tags are automatically decoded when encountered. They are only |
597 | encoded, however, when "pack_strings" is enabled. |
658 | encoded, however, when "pack_strings" is enabled. |
… | |
… | |
615 | |
676 | |
616 | When any of these need to load additional modules that are not part of |
677 | When any of these need to load additional modules that are not part of |
617 | the perl core distribution (e.g. URI), it is (currently) up to the user |
678 | the perl core distribution (e.g. URI), it is (currently) up to the user |
618 | to provide these modules. The decoding usually fails with an exception |
679 | to provide these modules. The decoding usually fails with an exception |
619 | if the required module cannot be loaded. |
680 | if the required module cannot be loaded. |
681 | |
682 | 0, 1 (date/time string, seconds since the epoch) |
683 | These tags are decoded into Time::Piece objects. The corresponding |
684 | "Time::Piece::TO_CBOR" method always encodes into tag 1 values |
685 | currently. |
686 | |
687 | The Time::Piece API is generally surprisingly bad, and fractional |
688 | seconds are only accidentally kept intact, so watch out. On the plus |
689 | side, the module comes with perl since 5.10, which has to count for |
690 | something. |
620 | |
691 | |
621 | 2, 3 (positive/negative bignum) |
692 | 2, 3 (positive/negative bignum) |
622 | These tags are decoded into Math::BigInt objects. The corresponding |
693 | These tags are decoded into Math::BigInt objects. The corresponding |
623 | "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal |
694 | "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal |
624 | CBOR integers, and others into positive/negative CBOR bignums. |
695 | CBOR integers, and others into positive/negative CBOR bignums. |
… | |
… | |
703 | uses long double to represent floating point values, they might not be |
774 | uses long double to represent floating point values, they might not be |
704 | encoded properly. Half precision types are accepted, but not encoded. |
775 | encoded properly. Half precision types are accepted, but not encoded. |
705 | |
776 | |
706 | Strict mode and canonical mode are not implemented. |
777 | Strict mode and canonical mode are not implemented. |
707 | |
778 | |
779 | LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT |
780 | On perls that were built without 64 bit integer support (these are rare |
781 | nowadays, even on 32 bit architectures), support for any kind of 64 bit |
782 | integer in CBOR is very limited - most likely, these 64 bit values will |
783 | be truncated, corrupted, or otherwise not decoded correctly. This also |
784 | includes string, array and map sizes that are stored as 64 bit integers. |
785 | |
708 | THREADS |
786 | THREADS |
709 | This module is *not* guaranteed to be thread safe and there are no plans |
787 | This module is *not* guaranteed to be thread safe and there are no plans |
710 | to change this until Perl gets thread support (as opposed to the |
788 | to change this until Perl gets thread support (as opposed to the |
711 | horribly slow so-called "threads" which are simply slow and bloated |
789 | horribly slow so-called "threads" which are simply slow and bloated |
712 | process simulations - use fork, it's *much* faster, cheaper, better). |
790 | process simulations - use fork, it's *much* faster, cheaper, better). |