… | |
… | |
205 | the standard CBOR way. |
205 | the standard CBOR way. |
206 | |
206 | |
207 | This option does not affect "decode" in any way - string references |
207 | This option does not affect "decode" in any way - string references |
208 | will always be decoded properly if present. |
208 | will always be decoded properly if present. |
209 | |
209 | |
|
|
210 | $cbor = $cbor->text_keys ([$enable]) |
|
|
211 | $enabled = $cbor->get_text_keys |
|
|
212 | If $enable is true (or missing), then "encode" will encode all perl |
|
|
213 | hash keys as CBOR text strings/UTF-8 strings, upgrading them as |
|
|
214 | needed. |
|
|
215 | |
|
|
216 | If $enable is false (the default), then "encode" will encode hash |
|
|
217 | keys normally - upgraded perl strings (strings internally encoded as |
|
|
218 | UTF-8) as CBOR text strings, and downgraded perl strings as CBOR |
|
|
219 | byte strings. |
|
|
220 | |
|
|
221 | This option does not affect "decode" in any way. |
|
|
222 | |
|
|
223 | This option is useful for interoperability with CBOR decoders that |
|
|
224 | don't treat byte strings as a form of text. It is especially useful |
|
|
225 | as Perl gives very little control over hash keys. |
|
|
226 | |
|
|
227 | Enabling this option can be slow, as all downgraded hash keys that |
|
|
228 | are encoded need to be scanned and converted to UTF-8. |
|
|
229 | |
|
|
230 | $cbor = $cbor->text_strings ([$enable]) |
|
|
231 | $enabled = $cbor->get_text_strings |
|
|
232 | This option works similarly to "text_keys", above, but works on all |
|
|
233 | strings (including hash keys), so "text_keys" has no further effect |
|
|
234 | after enabling "text_strings". |
|
|
235 | |
|
|
236 | If $enable is true (or missing), then "encode" will encode all perl |
|
|
237 | strings as CBOR text strings/UTF-8 strings, upgrading them as |
|
|
238 | needed. |
|
|
239 | |
|
|
240 | If $enable is false (the default), then "encode" will encode strings |
|
|
241 | normally (but see "text_keys") - upgraded perl strings (strings |
|
|
242 | internally encoded as UTF-8) as CBOR text strings, and downgraded |
|
|
243 | perl strings as CBOR byte strings. |
|
|
244 | |
|
|
245 | This option does not affect "decode" in any way. |
|
|
246 | |
|
|
247 | This option has similar advantages and disadvantages as "text_keys". |
|
|
248 | In addition, this option effectively removes the ability to encode |
|
|
249 | byte strings, which might break some "FREEZE" and "TO_CBOR" methods |
|
|
250 | that rely on this, such as bignum encoding, so this option is mainly |
|
|
251 | useful for very simple data. |
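The difference between the default behaviour, "text_keys" and "text_strings" is easiest to see in the encoded bytes. The following is a minimal sketch, assuming a CBOR::XS version that provides both options; the sample hash and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use CBOR::XS;

# A hash with a plain ASCII key and value; both are downgraded perl strings.
my $data = { id => "abc" };

my $plain     = CBOR::XS->new;                # defaults: both options off
my $text_keys = CBOR::XS->new->text_keys;     # only hash keys are forced to text
my $text_all  = CBOR::XS->new->text_strings;  # all strings are forced to text

# Hex dumps of the three encodings, keyed by the option used.
my %hex = (
   plain        => unpack ("H*", $plain->encode ($data)),
   text_keys    => unpack ("H*", $text_keys->encode ($data)),
   text_strings => unpack ("H*", $text_all->encode ($data)),
);

printf "%-12s %s\n", $_, $hex{$_} for sort keys %hex;
```

In the hex dumps, a CBOR byte string header starts with the nibble 4 and a text string header with the nibble 6, so the key header changes from 42 to 62 under "text_keys", and the value header additionally changes from 43 to 63 under "text_strings".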
|
|
252 | |
210 | $cbor = $cbor->validate_utf8 ([$enable]) |
253 | $cbor = $cbor->validate_utf8 ([$enable]) |
211 | $enabled = $cbor->get_validate_utf8 |
254 | $enabled = $cbor->get_validate_utf8 |
212 | If $enable is true (or missing), then "decode" will validate that |
255 | If $enable is true (or missing), then "decode" will validate that |
213 | elements (text strings) containing UTF-8 data in fact contain valid |
256 | elements (text strings) containing UTF-8 data in fact contain valid |
214 | UTF-8 data (instead of blindly accepting it). This validation |
257 | UTF-8 data (instead of blindly accepting it). This validation |
… | |
… | |
217 | The concept of "valid UTF-8" used is perl's concept, which is a |
260 | The concept of "valid UTF-8" used is perl's concept, which is a |
218 | superset of the official UTF-8. |
261 | superset of the official UTF-8. |
219 | |
262 | |
220 | If $enable is false (the default), then "decode" will blindly accept |
263 | If $enable is false (the default), then "decode" will blindly accept |
221 | UTF-8 data, marking them as valid UTF-8 in the resulting data |
264 | UTF-8 data, marking them as valid UTF-8 in the resulting data |
222 | structure regardless of whether thats true or not. |
265 | structure regardless of whether that's true or not. |
223 | |
266 | |
224 | Perl isn't too happy about corrupted UTF-8 in strings, but should |
267 | Perl isn't too happy about corrupted UTF-8 in strings, but should |
225 | generally not crash or do similarly evil things. Extensions might be |
268 | generally not crash or do similarly evil things. Extensions might be |
226 | not so forgiving, so it's recommended to turn on this setting if you |
269 | not so forgiving, so it's recommended to turn on this setting if you |
227 | receive untrusted CBOR. |
270 | receive untrusted CBOR. |
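As a rough illustration of the recommendation above, the sketch below feeds a deliberately malformed text string to a decoder with "validate_utf8" enabled. The input bytes are hand-built for this example (a one-byte CBOR text string whose payload is a lone continuation byte), and the exact error message is not relied upon.

```perl
use strict;
use warnings;
use CBOR::XS;

# 0x61 = text string of length 1, 0x80 = lone continuation byte (invalid UTF-8).
my $bad = "\x61\x80";

my $strict = CBOR::XS->new->validate_utf8;

# With validation enabled, decode is expected to throw instead of
# returning a string that is wrongly flagged as valid UTF-8.
my $ok = eval { $strict->decode ($bad); 1 };

print $ok ? "accepted\n" : "rejected: $@";
```

Wrapping "decode" in "eval" keeps the exception from propagating, which also makes it easier to avoid showing the error text to untrusted parties.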
… | |
… | |
409 | |
452 | |
410 | hash references |
453 | hash references |
411 | Perl hash references become CBOR maps. As there is no inherent |
454 | Perl hash references become CBOR maps. As there is no inherent |
412 | ordering in hash keys (or CBOR maps), they will usually be encoded |
455 | ordering in hash keys (or CBOR maps), they will usually be encoded |
413 | in a pseudo-random order. This order can be different each time a |
456 | in a pseudo-random order. This order can be different each time a |
414 | hahs is encoded. |
457 | hash is encoded. |
415 | |
458 | |
416 | Currently, tied hashes will use the indefinite-length format, while |
459 | Currently, tied hashes will use the indefinite-length format, while |
417 | normal hashes will use the fixed-length format. |
460 | normal hashes will use the fixed-length format. |
418 | |
461 | |
419 | array references |
462 | array references |
… | |
… | |
468 | my $x = 3.1; # some variable containing a number |
511 | my $x = 3.1; # some variable containing a number |
469 | "$x"; # stringified |
512 | "$x"; # stringified |
470 | $x .= ""; # another, more awkward way to stringify |
513 | $x .= ""; # another, more awkward way to stringify |
471 | print $x; # perl does it for you, too, quite often |
514 | print $x; # perl does it for you, too, quite often |
472 | |
515 | |
473 | You can force whether a string ie encoded as byte or text string by |
516 | You can force whether a string is encoded as byte or text string by |
474 | using "utf8::upgrade" and "utf8::downgrade"): |
517 | using "utf8::upgrade" and "utf8::downgrade" (if "text_strings" is |
|
|
518 | disabled): |
475 | |
519 | |
476 | utf8::upgrade $x; # encode $x as text string |
520 | utf8::upgrade $x; # encode $x as text string |
477 | utf8::downgrade $x; # encode $x as byte string |
521 | utf8::downgrade $x; # encode $x as byte string |
478 | |
522 | |
479 | Perl doesn't define what operations up- and downgrade strings, so if |
523 | Perl doesn't define what operations up- and downgrade strings, so if |
480 | the difference between byte and text is important, you should up- or |
524 | the difference between byte and text is important, you should up- or |
481 | downgrade your string as late as possible before encoding. |
525 | downgrade your string as late as possible before encoding. You can |
|
|
526 | also force the use of CBOR text strings by using "text_keys" or |
|
|
527 | "text_strings". |
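A short sketch of the up-/downgrade approach, assuming "text_strings" is disabled (the default) and using the exported "encode_cbor" function; the sample string is arbitrary.

```perl
use strict;
use warnings;
use CBOR::XS;

my $x = "caf\xe9";          # four characters, the last one is U+00E9

utf8::upgrade $x;           # internal UTF-8 representation
my $as_text = encode_cbor $x;    # CBOR text string (major type 3)

utf8::downgrade $x;         # back to one byte per character
my $as_bytes = encode_cbor $x;   # CBOR byte string (major type 2)

printf "text:  %s\n", unpack "H*", $as_text;
printf "bytes: %s\n", unpack "H*", $as_bytes;
```

The text form carries five payload bytes because U+00E9 takes two bytes in UTF-8, while the byte form carries four. Doing the up- or downgrade immediately before encoding avoids surprises from intermediate Perl operations.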
482 | |
528 | |
483 | You can force the type to be a CBOR number by numifying it: |
529 | You can force the type to be a CBOR number by numifying it: |
484 | |
530 | |
485 | my $x = "3"; # some variable containing a string |
531 | my $x = "3"; # some variable containing a string |
486 | $x += 0; # numify it, ensuring it will be dumped as a number |
532 | $x += 0; # numify it, ensuring it will be dumped as a number |
… | |
… | |
581 | "$self" # encode url string |
627 | "$self" # encode url string |
582 | } |
628 | } |
583 | |
629 | |
584 | sub URI::THAW { |
630 | sub URI::THAW { |
585 | my ($class, $serialiser, $uri) = @_; |
631 | my ($class, $serialiser, $uri) = @_; |
586 | |
|
|
587 | $class->new ($uri) |
632 | $class->new ($uri) |
588 | } |
633 | } |
589 | |
634 | |
590 | Unlike "TO_CBOR", multiple values can be returned by "FREEZE". For |
635 | Unlike "TO_CBOR", multiple values can be returned by "FREEZE". For |
591 | example, a "FREEZE" method that returns "type", "id" and "variant" |
636 | example, a "FREEZE" method that returns "type", "id" and "variant" |
… | |
… | |
687 | Future versions of this module reserve the right to special case |
732 | Future versions of this module reserve the right to special case |
688 | additional tags (such as base64url). |
733 | additional tags (such as base64url). |
689 | |
734 | |
690 | ENFORCED TAGS |
735 | ENFORCED TAGS |
691 | These tags are always handled when decoding, and their handling cannot |
736 | These tags are always handled when decoding, and their handling cannot |
692 | be overriden by the user. |
737 | be overridden by the user. |
693 | |
738 | |
694 | 26 (perl-object, <http://cbor.schmorp.de/perl-object>) |
739 | 26 (perl-object, <http://cbor.schmorp.de/perl-object>) |
695 | These tags are automatically created (and decoded) for serialisable |
740 | These tags are automatically created (and decoded) for serialisable |
696 | objects using the "FREEZE/THAW" methods (the Types::Serialiser object |
741 | objects using the "FREEZE/THAW" methods (the Types::Serialiser object |
697 | serialisation protocol). See "OBJECT SERIALISATION" for details. |
742 | serialisation protocol). See "OBJECT SERIALISATION" for details. |
… | |
… | |
720 | These tags are automatically decoded when encountered. They are only |
765 | These tags are automatically decoded when encountered. They are only |
721 | encoded, however, when "pack_strings" is enabled. |
766 | encoded, however, when "pack_strings" is enabled. |
722 | |
767 | |
723 | 22098 (indirection, <http://cbor.schmorp.de/indirection>) |
768 | 22098 (indirection, <http://cbor.schmorp.de/indirection>) |
724 | This tag is automatically generated when a reference is encountered |
769 | This tag is automatically generated when a reference is encountered |
725 | (with the exception of hash and array refernces). It is converted to |
770 | (with the exception of hash and array references). It is converted |
726 | a reference when decoding. |
771 | to a reference when decoding. |
727 | |
772 | |
728 | 55799 (self-describe CBOR, RFC 7049) |
773 | 55799 (self-describe CBOR, RFC 7049) |
729 | This value is not generated on encoding (unless explicitly requested |
774 | This value is not generated on encoding (unless explicitly requested |
730 | by the user), and is simply ignored when decoding. |
775 | by the user), and is simply ignored when decoding. |
731 | |
776 | |
732 | NON-ENFORCED TAGS |
777 | NON-ENFORCED TAGS |
733 | These tags have default filters provided when decoding. Their handling |
778 | These tags have default filters provided when decoding. Their handling |
734 | can be overriden by changing the %CBOR::XS::FILTER entry for the tag, or |
779 | can be overridden by changing the %CBOR::XS::FILTER entry for the tag, |
735 | by providing a custom "filter" callback when decoding. |
780 | or by providing a custom "filter" callback when decoding. |
736 | |
781 | |
737 | When they result in decoding into a specific Perl class, the module |
782 | When they result in decoding into a specific Perl class, the module |
738 | usually provides a corresponding "TO_CBOR" method as well. |
783 | usually provides a corresponding "TO_CBOR" method as well. |
739 | |
784 | |
740 | When any of these need to load additional modules that are not part of |
785 | When any of these need to load additional modules that are not part of |
… | |
… | |
755 | 2, 3 (positive/negative bignum) |
800 | 2, 3 (positive/negative bignum) |
756 | These tags are decoded into Math::BigInt objects. The corresponding |
801 | These tags are decoded into Math::BigInt objects. The corresponding |
757 | "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal |
802 | "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal |
758 | CBOR integers, and others into positive/negative CBOR bignums. |
803 | CBOR integers, and others into positive/negative CBOR bignums. |
759 | |
804 | |
760 | 4, 5 (decimal fraction/bigfloat) |
805 | 4, 5, 264, 265 (decimal fraction/bigfloat) |
761 | Both decimal fractions and bigfloats are decoded into Math::BigFloat |
806 | Both decimal fractions and bigfloats are decoded into Math::BigFloat |
762 | objects. The corresponding "Math::BigFloat::TO_CBOR" method *always* |
807 | objects. The corresponding "Math::BigFloat::TO_CBOR" method *always* |
763 | encodes into a decimal fraction. |
808 | encodes into a decimal fraction (either tag 4 or 264). |
764 | |
809 | |
765 | CBOR cannot represent bigfloats with *very* large exponents - |
|
|
766 | conversion of such big float objects is undefined. |
|
|
767 | |
|
|
768 | Also, NaN and infinities are not encoded properly. |
810 | NaN and infinities are not encoded properly, as they cannot be |
|
|
811 | represented in CBOR. |
|
|
812 | |
|
|
813 | See "BIGNUM SECURITY CONSIDERATIONS" for more info. |
|
|
814 | |
|
|
815 | 30 (rational numbers) |
|
|
816 | These tags are decoded into Math::BigRat objects. The corresponding |
|
|
817 | "Math::BigRat::TO_CBOR" method encodes rational numbers with |
|
|
818 | denominator 1 via their numerator only, i.e., they become normal |
|
|
819 | integers or "bignums". |
|
|
820 | |
|
|
821 | See "BIGNUM SECURITY CONSIDERATIONS" for more info. |
769 | |
822 | |
770 | 21, 22, 23 (expected later JSON conversion) |
823 | 21, 22, 23 (expected later JSON conversion) |
771 | CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore |
824 | CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore |
772 | these tags. |
825 | these tags. |
773 | |
826 | |
… | |
… | |
820 | Also keep in mind that CBOR::XS might leak contents of your Perl data |
873 | Also keep in mind that CBOR::XS might leak contents of your Perl data |
821 | structures in its error messages, so when you serialise sensitive |
874 | structures in its error messages, so when you serialise sensitive |
822 | information you might want to make sure that exceptions thrown by |
875 | information you might want to make sure that exceptions thrown by |
823 | CBOR::XS will not end up in front of untrusted eyes. |
876 | CBOR::XS will not end up in front of untrusted eyes. |
824 | |
877 | |
|
|
878 | BIGNUM SECURITY CONSIDERATIONS |
|
|
879 | CBOR::XS provides a "TO_CBOR" method for both Math::BigInt and |
|
|
880 | Math::BigFloat that tries to encode the number in the simplest possible |
|
|
881 | way, that is, either a CBOR integer, a CBOR bigint/decimal fraction (tag |
|
|
882 | 4) or an arbitrary-exponent decimal fraction (tag 264). Rational numbers |
|
|
883 | (Math::BigRat, tag 30) can also contain bignums as members. |
|
|
884 | |
|
|
885 | CBOR::XS will also understand base-2 bigfloat or arbitrary-exponent |
|
|
886 | bigfloats (tags 5 and 265), but it will never generate these on its own. |
|
|
887 | |
|
|
888 | Using the built-in Math::BigInt::Calc support, encoding and decoding |
|
|
889 | decimal fractions is generally fast. Decoding bigints can be slow for |
|
|
890 | very big numbers (tens of thousands of digits, something that could |
|
|
891 | potentially be caught by limiting the size of CBOR texts), and decoding |
|
|
892 | bigfloats or arbitrary-exponent bigfloats can be *extremely* slow |
|
|
893 | (minutes, decades) for large exponents (roughly 40 bit and longer). |
|
|
894 | |
|
|
895 | Additionally, Math::BigInt can take advantage of other bignum libraries, |
|
|
896 | such as Math::GMP, which cannot handle big floats with large exponents, |
|
|
897 | and might simply abort or crash your program, due to their code quality. |
|
|
898 | |
|
|
899 | This can be a concern if you want to parse untrusted CBOR. If it is, you |
|
|
900 | might want to disable decoding of tag 2 (bigint) and 3 (negative bigint) |
|
|
901 | types. You should also disable types 5 and 265, as these can be slow |
|
|
902 | even without bigints. |
|
|
903 | |
|
|
904 | Disabling bigints will also partially or fully disable types that rely |
|
|
905 | on them, e.g. rational numbers that use bignums. |
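One possible way to disable the tags mentioned above is a custom "filter" callback that refuses them and defers everything else to the default filter. This is only a sketch: it assumes the "filter" method and "CBOR::XS::default_filter" behave as documented elsewhere in the module, and the %blocked table and the sample input are invented for illustration.

```perl
use strict;
use warnings;
use CBOR::XS;

# Tags to leave undecoded: bignums (2, 3) and bigfloats (5, 265).
my %blocked = map { $_ => 1 } 2, 3, 5, 265;

my $safe = CBOR::XS->new->filter (sub {
   my ($tag, $value) = @_;

   # Returning an empty list leaves the value as a CBOR::XS::Tagged object.
   return if $blocked{$tag};

   &CBOR::XS::default_filter   # all other tags: default handling, same @_
});

# Tag 2 (positive bignum) wrapping the one-byte byte string 0x01.
my $value = $safe->decode ("\xc2\x41\x01");

print ref $value, "\n";
```

With this filter no Math::BigInt or Math::BigFloat object is constructed for the blocked tags, so the application can inspect or reject the tagged value itself. Rational numbers (tag 30) that contain bignum members would then contain tagged objects instead, which matches the note about partially disabled types.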
|
|
906 | |
825 | CBOR IMPLEMENTATION NOTES |
907 | CBOR IMPLEMENTATION NOTES |
826 | This section contains some random implementation notes. They do not |
908 | This section contains some random implementation notes. They do not |
827 | describe guaranteed behaviour, but merely behaviour as-is implemented |
909 | describe guaranteed behaviour, but merely behaviour as-is implemented |
828 | right now. |
910 | right now. |
829 | |
911 | |