… | |
… | |
64 | |
64 | |
65 | package CBOR::XS; |
65 | package CBOR::XS; |
66 | |
66 | |
67 | use common::sense; |
67 | use common::sense; |
68 | |
68 | |
69 | our $VERSION = 1.71; |
69 | our $VERSION = 1.87; |
70 | our @ISA = qw(Exporter); |
70 | our @ISA = qw(Exporter); |
71 | |
71 | |
72 | our @EXPORT = qw(encode_cbor decode_cbor); |
72 | our @EXPORT = qw(encode_cbor decode_cbor); |
73 | |
73 | |
74 | use Exporter; |
74 | use Exporter; |
… | |
… | |
121 | but configures the coder object to be safe to use with untrusted |
121 | but configures the coder object to be safe to use with untrusted |
122 | data. Currently, this is equivalent to: |
122 | data. Currently, this is equivalent to: |
123 | |
123 | |
124 | my $cbor = CBOR::XS |
124 | my $cbor = CBOR::XS |
125 | ->new |
125 | ->new |
|
|
126 | ->validate_utf8 |
126 | ->forbid_objects |
127 | ->forbid_objects |
127 | ->filter (\&CBOR::XS::safe_filter) |
128 | ->filter (\&CBOR::XS::safe_filter) |
128 | ->max_size (1e8); |
129 | ->max_size (1e8); |
129 | |
130 | |
130 | But is more future proof (it is better to crash because of a change than |
131 | But is more future proof (it is better to crash because of a change than |
… | |
… | |
133 | =cut |
134 | =cut |
134 | |
135 | |
135 | sub new_safe { |
136 | sub new_safe { |
136 | CBOR::XS |
137 | CBOR::XS |
137 | ->new |
138 | ->new |
|
|
139 | ->validate_utf8 |
138 | ->forbid_objects |
140 | ->forbid_objects |
139 | ->filter (\&CBOR::XS::safe_filter) |
141 | ->filter (\&CBOR::XS::safe_filter) |
140 | ->max_size (1e8) |
142 | ->max_size (1e8) |
141 | } |
143 | } |
142 | |
144 | |
… | |
… | |
214 | communication partner supports the value sharing extensions to CBOR |
216 | communication partner supports the value sharing extensions to CBOR |
215 | (L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the |
217 | (L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the |
216 | resulting data structure might be unusable. |
218 | resulting data structure might be unusable. |
217 | |
219 | |
218 | Detecting shared values incurs a runtime overhead when values are encoded |
220 | Detecting shared values incurs a runtime overhead when values are encoded |
219 | that have a reference counter large than one, and might unnecessarily |
221 | that have a reference counter larger than one, and might unnecessarily |
220 | increase the encoded size, as potentially shared values are encoded as |
222 | increase the encoded size, as potentially shared values are encoded as |
221 | shareable whether or not they are actually shared. |
223 | shareable whether or not they are actually shared. |
222 | |
224 | |
223 | At the moment, only targets of references can be shared (e.g. scalars, |
225 | At the moment, only targets of references can be shared (e.g. scalars, |
224 | arrays or hashes pointed to by a reference). Weirder constructs, such as |
226 | arrays or hashes pointed to by a reference). Weirder constructs, such as |
… | |
… | |
243 | isn't prepared for this will not leak memory. |
245 | isn't prepared for this will not leak memory. |
244 | |
246 | |
245 | If C<$enable> is false (the default), then C<decode> will throw an error |
247 | If C<$enable> is false (the default), then C<decode> will throw an error |
246 | when it encounters a self-referential/cyclic data structure. |
248 | when it encounters a self-referential/cyclic data structure. |
247 | |
249 | |
248 | FUTURE DIRECTION: the motivation behind this option is to avoid I<real> |
250 | This option does not affect C<encode> in any way - shared values and |
249 | cycles - future versions of this module might chose to decode cyclic data |
251 | references will always be encoded properly if present. |
250 | structures using weak references when this option is off, instead of |
252 | |
251 | throwing an error. |
253 | =item $cbor = $cbor->allow_weak_cycles ([$enable]) |
|
|
254 | |
|
|
255 | =item $enabled = $cbor->get_allow_weak_cycles |
|
|
256 | |
|
|
257 | This works like C<allow_cycles> in that it allows the resulting data |
|
|
258 | structures to contain cycles, but unlike C<allow_cycles>, those cyclic |
|
|
259 | references will be weak. That means that code that recursively walks |
|
|
260 | the data structure must be prepared for cycles, but at least no special |
|
|
261 | precautions must be implemented to free these data structures. |
|
|
262 | |
|
|
263 | Only those references leading to actual cycles will be weakened - other |
|
|
264 | references, e.g. when the same hash or array is referenced multiple times |
|
|
265 | in an array, will be normal references. |
252 | |
266 | |
253 | This option does not affect C<encode> in any way - shared values and |
267 | This option does not affect C<encode> in any way - shared values and |
254 | references will always be encoded properly if present. |
268 | references will always be encoded properly if present. |
255 | |
269 | |
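The difference between a normal and a weak cycle can be illustrated without CBOR::XS at all. The following is a minimal core-Perl sketch, assuming only C<Scalar::Util>; the helper name C<build_weak_cycle> is hypothetical and not part of the module. It builds by hand the kind of structure that C<allow_weak_cycles> is described as producing: the cycle is still visible to code that walks the data, but the cycle-closing reference is weak.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical helper (not part of CBOR::XS): build a self-referential
# hash whose cycle-closing reference is weak.
sub build_weak_cycle {
    my $node = { name => "root" };

    $node->{self} = $node;    # strong cycle: never freed on its own
    weaken $node->{self};     # weaken only the reference closing the cycle

    return $node;
}

my $node = build_weak_cycle();

# Code walking the structure still encounters the cycle ...
print "cyclic\n" if $node->{self} == $node;

# ... but the back-reference no longer keeps the hash alive.
print "weak\n" if isweak $node->{self};
```

Once the last strong reference (here C<$node>) goes away, the hash is freed without any explicit cycle-breaking code.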
256 | =item $cbor = $cbor->forbid_objects ([$enable]) |
270 | =item $cbor = $cbor->forbid_objects ([$enable]) |
… | |
… | |
330 | strings as CBOR byte strings. |
344 | strings as CBOR byte strings. |
331 | |
345 | |
332 | This option does not affect C<decode> in any way. |
346 | This option does not affect C<decode> in any way. |
333 | |
347 | |
334 | This option has similar advantages and disadvantages as C<text_keys>. In |
348 | This option has similar advantages and disadvantages as C<text_keys>. In |
335 | addition, this option effectively removes the ability to encode byte |
349 | addition, this option effectively removes the ability to automatically |
336 | strings, which might break some C<FREEZE> and C<TO_CBOR> methods that rely |
350 | encode byte strings, which might break some C<FREEZE> and C<TO_CBOR> |
337 | on this, such as bignum encoding, so this option is mainly useful for very |
351 | methods that rely on this. |
338 | simple data. |
352 | |
|
|
353 | A workaround is to use explicit type casts, which are unaffected by this option. |
339 | |
354 | |
340 | =item $cbor = $cbor->validate_utf8 ([$enable]) |
355 | =item $cbor = $cbor->validate_utf8 ([$enable]) |
341 | |
356 | |
342 | =item $enabled = $cbor->get_validate_utf8 |
357 | =item $enabled = $cbor->get_validate_utf8 |
343 | |
358 | |
… | |
… | |
431 | &{ $my_filter{$_[0]} or return } |
446 | &{ $my_filter{$_[0]} or return } |
432 | }); |
447 | }); |
433 | |
448 | |
434 | |
449 | |
435 | Example: use the safe filter function (see L<SECURITY CONSIDERATIONS> for |
450 | Example: use the safe filter function (see L<SECURITY CONSIDERATIONS> for |
436 | more considerations on security). |
451 | more considerations regarding security). |
437 | |
452 | |
438 | CBOR::XS->new->filter (\&CBOR::XS::safe_filter)->decode ($cbor_data); |
453 | CBOR::XS->new->filter (\&CBOR::XS::safe_filter)->decode ($cbor_data); |
439 | |
454 | |
440 | =item $cbor_data = $cbor->encode ($perl_scalar) |
455 | =item $cbor_data = $cbor->encode ($perl_scalar) |
441 | |
456 | |
… | |
… | |
470 | Perl data structure in memory at one time, it does allow you to parse a |
485 | Perl data structure in memory at one time, it does allow you to parse a |
471 | CBOR stream incrementally, similar to using "decode_prefix" to see |
486 | CBOR stream incrementally, similar to using "decode_prefix" to see |
472 | if a full CBOR object is available, but is much more efficient. |
487 | if a full CBOR object is available, but is much more efficient. |
473 | |
488 | |
474 | It basically works by parsing as much of a CBOR string as possible - if |
489 | It basically works by parsing as much of a CBOR string as possible - if |
475 | the CBOR data is not complete yet, the pasrer will remember where it was, |
490 | the CBOR data is not complete yet, the parser will remember where it was, |
476 | to be able to restart when more data has been accumulated. Once enough |
491 | to be able to restart when more data has been accumulated. Once enough |
477 | data is available to either decode a complete CBOR value or raise an |
492 | data is available to either decode a complete CBOR value or raise an |
478 | error, a real decode will be attempted. |
493 | error, a real decode will be attempted. |
479 | |
494 | |
480 | A typical use case would be a network protocol that consists of sending |
495 | A typical use case would be a network protocol that consists of sending |
… | |
… | |
632 | create such objects. |
647 | create such objects. |
633 | |
648 | |
634 | =item Types::Serialiser::true, Types::Serialiser::false, Types::Serialiser::error |
649 | =item Types::Serialiser::true, Types::Serialiser::false, Types::Serialiser::error |
635 | |
650 | |
636 | These special values become CBOR true, CBOR false and CBOR undefined |
651 | These special values become CBOR true, CBOR false and CBOR undefined |
637 | values, respectively. You can also use C<\1>, C<\0> and C<\undef> directly |
652 | values, respectively. |
638 | if you want. |
|
|
639 | |
653 | |
640 | =item other blessed objects |
654 | =item other blessed objects |
641 | |
655 | |
642 | Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See |
656 | Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See |
643 | L<TAG HANDLING AND EXTENSIONS> for specific classes handled by this |
657 | L<TAG HANDLING AND EXTENSIONS> for specific classes handled by this |
… | |
… | |
668 | "$x"; # stringified |
682 | "$x"; # stringified |
669 | $x .= ""; # another, more awkward way to stringify |
683 | $x .= ""; # another, more awkward way to stringify |
670 | print $x; # perl does it for you, too, quite often |
684 | print $x; # perl does it for you, too, quite often |
671 | |
685 | |
672 | You can force whether a string is encoded as byte or text string by using |
686 | You can force whether a string is encoded as byte or text string by using |
673 | C<utf8::upgrade> and C<utf8::downgrade> (if C<text_strings> is disabled): |
687 | C<utf8::upgrade> and C<utf8::downgrade> (if C<text_strings> is disabled). |
674 | |
688 | |
675 | utf8::upgrade $x; # encode $x as text string |
689 | utf8::upgrade $x; # encode $x as text string |
676 | utf8::downgrade $x; # encode $x as byte string |
690 | utf8::downgrade $x; # encode $x as byte string |
|
|
691 | |
|
|
692 | More options are available, see L<TYPE CASTS>, below, and the C<text_keys> |
|
|
693 | and C<text_strings> options. |
677 | |
694 | |
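What C<utf8::upgrade> and C<utf8::downgrade> change is Perl's internal representation of the string, not its value. The sketch below uses only core Perl to show the internal flag flipping while the string stays equal to itself; the helper name C<flag_states> is hypothetical, and the assumption that the encoder chooses between text and byte strings based on this flag is inferred from the text above rather than shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: report the internal UTF-8 flag after an upgrade
# and after a downgrade, plus whether the string value changed.
sub flag_states {
    my $x = "caf\xe9";    # four characters, all below 256

    utf8::upgrade($x);    # switch to the internal UTF-8 representation
    my $after_upgrade = utf8::is_utf8($x) ? 1 : 0;

    utf8::downgrade($x);  # switch back to the native byte representation
    my $after_downgrade = utf8::is_utf8($x) ? 1 : 0;

    # The string value itself is the same throughout.
    my $same = $x eq "caf\xe9" ? 1 : 0;

    return ($after_upgrade, $after_downgrade, $same);
}

print join (" ", flag_states ()), "\n";
```

Note that C<utf8::downgrade> only succeeds when every character fits into a single byte, which is the case in this example.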
678 | Perl doesn't define what operations up- and downgrade strings, so if the |
695 | Perl doesn't define what operations up- and downgrade strings, so if the |
679 | difference between byte and text is important, you should up- or downgrade |
696 | difference between byte and text is important, you should up- or downgrade |
680 | your string as late as possible before encoding. You can also force the |
697 | your string as late as possible before encoding. You can also force the |
681 | use of CBOR text strings by using C<text_keys> or C<text_strings>. |
698 | use of CBOR text strings by using C<text_keys> or C<text_strings>. |
… | |
… | |
696 | format will be used. Perls that use formats other than IEEE double to |
713 | format will be used. Perls that use formats other than IEEE double to |
697 | represent numerical values are supported, but might suffer loss of |
714 | represent numerical values are supported, but might suffer loss of |
698 | precision. |
715 | precision. |
699 | |
716 | |
700 | =back |
717 | =back |
|
|
718 | |
|
|
719 | =head2 TYPE CASTS |
|
|
720 | |
|
|
721 | B<EXPERIMENTAL>: As an experimental extension, C<CBOR::XS> allows you to |
|
|
722 | force specific CBOR types to be used when encoding. That allows you to |
|
|
723 | encode types not normally accessible (e.g. half floats) as well as force |
|
|
724 | string types even when C<text_strings> is in effect. |
|
|
725 | |
|
|
726 | Type forcing is done by calling a special "cast" function which keeps a |
|
|
727 | copy of the value and returns a new value that can be handed over to any |
|
|
728 | CBOR encoder function. |
|
|
729 | |
|
|
730 | The following casts are currently available (all of which are unary |
|
|
731 | operators, that is, have a prototype of C<$>): |
|
|
732 | |
|
|
733 | =over |
|
|
734 | |
|
|
735 | =item CBOR::XS::as_int $value |
|
|
736 | |
|
|
737 | Forces the value to be encoded as some form of (basic, not bignum) integer |
|
|
738 | type. |
|
|
739 | |
|
|
740 | =item CBOR::XS::as_text $value |
|
|
741 | |
|
|
742 | Forces the value to be encoded as a (UTF-8) text value. |
|
|
743 | |
|
|
744 | =item CBOR::XS::as_bytes $value |
|
|
745 | |
|
|
746 | Forces the value to be encoded as a (binary) string value. |
|
|
747 | |
|
|
748 | Example: encode a perl string as binary even though C<text_strings> is in |
|
|
749 | effect. |
|
|
750 | |
|
|
751 | CBOR::XS->new->text_strings->encode ([4, "text", CBOR::XS::as_bytes "bytevalue"]); |
|
|
752 | |
|
|
753 | =item CBOR::XS::as_bool $value |
|
|
754 | |
|
|
755 | Converts a Perl boolean (which can be any kind of scalar) into a CBOR |
|
|
756 | boolean. Strictly the same, but shorter to write, than: |
|
|
757 | |
|
|
758 | $value ? Types::Serialiser::true : Types::Serialiser::false |
|
|
759 | |
|
|
760 | =item CBOR::XS::as_float16 $value |
|
|
761 | |
|
|
762 | Forces half-float (IEEE 754 binary16) encoding of the given value. |
|
|
763 | |
|
|
764 | =item CBOR::XS::as_float32 $value |
|
|
765 | |
|
|
766 | Forces single-float (IEEE 754 binary32) encoding of the given value. |
|
|
767 | |
|
|
768 | =item CBOR::XS::as_float64 $value |
|
|
769 | |
|
|
770 | Forces double-float (IEEE 754 binary64) encoding of the given value. |
|
|
771 | |
|
|
772 | =item CBOR::XS::as_cbor $cbor_text |
|
|
773 | |
|
|
774 | Not a type cast per se, this function forces the argument to be encoded |
|
|
775 | as-is. This can be used to embed pre-encoded CBOR data. |
|
|
776 | |
|
|
777 | Note that no checking on the validity of the C<$cbor_text> is done - it's |
|
|
778 | the caller's responsibility to correctly encode values. |
|
|
779 | |
|
|
780 | =item CBOR::XS::as_map [key => value...] |
|
|
781 | |
|
|
782 | Treat the array reference as key value pairs and output a CBOR map. This |
|
|
783 | allows you to generate CBOR maps with arbitrary key types (or, if you |
|
|
784 | don't care about semantics, duplicate keys or pairs in a custom order), |
|
|
785 | which is otherwise hard to do with Perl. |
|
|
786 | |
|
|
787 | The single argument must be an array reference with an even number of |
|
|
788 | elements. |
|
|
789 | |
|
|
790 | Note that only the reference to the array is copied, the array itself is |
|
|
791 | not. Modifications done to the array before calling an encoding function |
|
|
792 | will be reflected in the encoded output. |
|
|
793 | |
|
|
794 | Example: encode a CBOR map with a string and an integer as keys. |
|
|
795 | |
|
|
796 | encode_cbor CBOR::XS::as_map [string => "value", 5 => "value"] |
|
|
797 | |
|
|
798 | =back |
|
|
799 | |
|
|
800 | =cut |
|
|
801 | |
|
|
802 | sub CBOR::XS::as_cbor ($) { bless [$_[0], 0, undef], CBOR::XS::Tagged:: } |
|
|
803 | sub CBOR::XS::as_int ($) { bless [$_[0], 1, undef], CBOR::XS::Tagged:: } |
|
|
804 | sub CBOR::XS::as_bytes ($) { bless [$_[0], 2, undef], CBOR::XS::Tagged:: } |
|
|
805 | sub CBOR::XS::as_text ($) { bless [$_[0], 3, undef], CBOR::XS::Tagged:: } |
|
|
806 | sub CBOR::XS::as_float16 ($) { bless [$_[0], 4, undef], CBOR::XS::Tagged:: } |
|
|
807 | sub CBOR::XS::as_float32 ($) { bless [$_[0], 5, undef], CBOR::XS::Tagged:: } |
|
|
808 | sub CBOR::XS::as_float64 ($) { bless [$_[0], 6, undef], CBOR::XS::Tagged:: } |
|
|
809 | |
|
|
810 | sub CBOR::XS::as_bool ($) { $_[0] ? $Types::Serialiser::true : $Types::Serialiser::false } |
|
|
811 | |
|
|
812 | sub CBOR::XS::as_map ($) { |
|
|
813 | ARRAY:: eq ref $_[0] |
|
|
814 | and $#{ $_[0] } & 1 |
|
|
815 | or do { require Carp; Carp::croak ("CBOR::XS::as_map only accepts array references with an even number of elements, caught") }; |
|
|
816 | |
|
|
817 | bless [$_[0], 7, undef], CBOR::XS::Tagged:: |
|
|
818 | } |
701 | |
819 | |
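The argument check in C<as_map> relies on C<$#{ $_[0] }> being the I<last index> of the array: an array with an even number of elements has an odd last index, so its lowest bit is set. The following is a standalone sketch of that test in core Perl; the helper name C<has_even_elements> is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the validation in as_map: true only for
# array references holding an even number of elements.
sub has_even_elements {
    my ($ref) = @_;

    # Anything that is not a plain array reference is rejected first, so
    # the array dereference below is never attempted on other values.
    return 0 unless "ARRAY" eq ref $ref;

    # Last index is (count - 1), so an even count means an odd last index.
    # An empty array has last index -1, whose lowest bit is also set.
    return ($#$ref & 1) ? 1 : 0;
}

print has_even_elements ([a => 1, b => 2]) ? "ok\n" : "rejected\n";
print has_even_elements ([a => 1, "b"])    ? "ok\n" : "rejected\n";
```

Under this check an empty array reference is accepted as well, which corresponds to an empty map.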
702 | =head2 OBJECT SERIALISATION |
820 | =head2 OBJECT SERIALISATION |
703 | |
821 | |
704 | This module implements both a CBOR-specific and the generic |
822 | This module implements both a CBOR-specific and the generic |
705 | L<Types::Serialiser> object serialisation protocol. The following |
823 | L<Types::Serialiser> object serialisation protocol. The following |
… | |
… | |
1228 | =head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT |
1346 | =head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT |
1229 | |
1347 | |
1230 | On perls that were built without 64 bit integer support (these are rare |
1348 | On perls that were built without 64 bit integer support (these are rare |
1231 | nowadays, even on 32 bit architectures, as all major Perl distributions |
1349 | nowadays, even on 32 bit architectures, as all major Perl distributions |
1232 | are built with 64 bit integer support), support for any kind of 64 bit |
1350 | are built with 64 bit integer support), support for any kind of 64 bit |
1233 | integer in CBOR is very limited - most likely, these 64 bit values will |
1351 | value in CBOR is very limited - most likely, these 64 bit values will |
1234 | be truncated, corrupted, or otherwise not decoded correctly. This also |
1352 | be truncated, corrupted, or otherwise not decoded correctly. This also |
1235 | includes string, array and map sizes that are stored as 64 bit integers. |
1353 | includes string, float, array and map sizes that are stored as 64 bit |
|
|
1354 | integers. |
1236 | |
1355 | |
1237 | |
1356 | |
1238 | =head1 THREADS |
1357 | =head1 THREADS |
1239 | |
1358 | |
1240 | This module is I<not> guaranteed to be thread safe and there are no |
1359 | This module is I<not> guaranteed to be thread safe and there are no |