… | |
… | |
38 | with the added ability of supporting serialisation of Perl objects. (JSON |
38 | with the added ability of supporting serialisation of Perl objects. (JSON |
39 | often compresses better than CBOR though, so if you plan to compress the |
39 | often compresses better than CBOR though, so if you plan to compress the |
40 | data later and speed is less important you might want to compare both |
40 | data later and speed is less important you might want to compare both |
41 | formats first). |
41 | formats first). |
42 | |
42 | |
|
|
43 | The primary goal of this module is to be I<correct> and the secondary goal |
|
|
44 | is to be I<fast>. To reach the latter goal it was written in C. |
|
|
45 | |
43 | To give you a general idea about speed, with texts in the megabyte range, |
46 | To give you a general idea about speed, with texts in the megabyte range, |
44 | C<CBOR::XS> usually encodes roughly twice as fast as L<Storable> or |
47 | C<CBOR::XS> usually encodes roughly twice as fast as L<Storable> or |
45 | L<JSON::XS> and decodes about 15%-30% faster than those. The shorter the |
48 | L<JSON::XS> and decodes about 15%-30% faster than those. The shorter the |
46 | data, the worse L<Storable> performs in comparison. |
49 | data, the worse L<Storable> performs in comparison. |
47 | |
50 | |
… | |
… | |
51 | |
54 | |
52 | In addition to the core CBOR data format, this module implements a |
55 | In addition to the core CBOR data format, this module implements a |
53 | number of extensions, to support cyclic and shared data structures |
56 | number of extensions, to support cyclic and shared data structures |
54 | (see C<allow_sharing> and C<allow_cycles>), string deduplication (see |
57 | (see C<allow_sharing> and C<allow_cycles>), string deduplication (see |
55 | C<pack_strings>) and scalar references (always enabled). |
58 | C<pack_strings>) and scalar references (always enabled). |
56 | |
|
|
57 | The primary goal of this module is to be I<correct> and the secondary goal |
|
|
58 | is to be I<fast>. To reach the latter goal it was written in C. |
|
|
59 | |
59 | |
60 | See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and |
60 | See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and |
61 | vice versa. |
61 | vice versa. |
62 | |
62 | |
63 | =cut |
63 | =cut |
… | |
… | |
215 | (L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the |
215 | (L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the |
216 | resulting data structure might be unusable. |
216 | resulting data structure might be unusable. |
217 | |
217 | |
218 | Detecting shared values incurs a runtime overhead when values are encoded |
218 | Detecting shared values incurs a runtime overhead when values are encoded |
219 | that have a reference counter larger than one, and might unnecessarily |
219 | that have a reference counter larger than one, and might unnecessarily |
220 | increase the encoded size, as potentially shared values are encode as |
220 | increase the encoded size, as potentially shared values are encoded as |
221 | shareable whether or not they are actually shared. |
221 | shareable whether or not they are actually shared. |
222 | |
222 | |
223 | At the moment, only targets of references can be shared (e.g. scalars, |
223 | At the moment, only targets of references can be shared (e.g. scalars, |
224 | arrays or hashes pointed to by a reference). Weirder constructs, such as |
224 | arrays or hashes pointed to by a reference). Weirder constructs, such as |
225 | an array with multiple "copies" of the I<same> string, which are hard but |
225 | an array with multiple "copies" of the I<same> string, which are hard but |
… | |
… | |
453 | when there is trailing garbage after the CBOR string, it will silently |
453 | when there is trailing garbage after the CBOR string, it will silently |
454 | stop parsing there and return the number of characters consumed so far. |
454 | stop parsing there and return the number of characters consumed so far. |
455 | |
455 | |
456 | This is useful if your CBOR texts are not delimited by an outer protocol |
456 | This is useful if your CBOR texts are not delimited by an outer protocol |
457 | and you need to know where the first CBOR string ends and the next one |
457 | and you need to know where the first CBOR string ends and the next one |
458 | starts. |
458 | starts - CBOR strings are self-delimited, so it is possible to concatenate |
|
|
459 | CBOR strings without any delimiters or size fields and recover their data. |
459 | |
460 | |
460 | CBOR::XS->new->decode_prefix ("......") |
461 | CBOR::XS->new->decode_prefix ("......") |
461 | => ("...", 3) |
462 | => ("...", 3) |
462 | |
463 | |
463 | =back |
464 | =back |
… | |
… | |
1057 | |
1058 | |
1058 | |
1059 | |
1059 | =head1 SECURITY CONSIDERATIONS |
1060 | =head1 SECURITY CONSIDERATIONS |
1060 | |
1061 | |
1061 | Tl;dr... if you want to decode or encode CBOR from untrusted sources, you |
1062 | Tl;dr... if you want to decode or encode CBOR from untrusted sources, you |
1062 | should start with a coder object created via C<new_safe>: |
1063 | should start with a coder object created via C<new_safe> (which implements |
|
|
1064 | the mitigations explained below): |
1063 | |
1065 | |
1064 | my $coder = CBOR::XS->new_safe; |
1066 | my $coder = CBOR::XS->new_safe; |
1065 | |
1067 | |
1066 | my $data = $coder->decode ($cbor_text); |
1068 | my $data = $coder->decode ($cbor_text); |
1067 | my $cbor = $coder->encode ($data); |
1069 | my $cbor = $coder->encode ($data); |
… | |
… | |
1089 | even if all your C<THAW> methods are secure, encoding data structures from |
1091 | even if all your C<THAW> methods are secure, encoding data structures from |
1090 | untrusted sources can invoke those and trigger bugs in those. |
1092 | untrusted sources can invoke those and trigger bugs in those. |
1091 | |
1093 | |
1092 | So, if you are not sure about the security of all the modules you |
1094 | So, if you are not sure about the security of all the modules you |
1093 | have loaded (you shouldn't), you should disable this part using |
1095 | have loaded (you shouldn't), you should disable this part using |
1094 | C<forbid_objects>. |
1096 | C<forbid_objects> or using C<new_safe>. |
1095 | |
1097 | |
1096 | =item CBOR can be extended with tags that call library code |
1098 | =item CBOR can be extended with tags that call library code |
1097 | |
1099 | |
1098 | CBOR can be extended with tags, and C<CBOR::XS> has a registry of |
1100 | CBOR can be extended with tags, and C<CBOR::XS> has a registry of |
1099 | conversion functions for many existing tags that can be extended via |
1101 | conversion functions for many existing tags that can be extended via |
1100 | third-party modules (see the C<filter> method). |
1102 | third-party modules (see the C<filter> method). |
1101 | |
1103 | |
1102 | If you don't trust these, you should configure the "safe" filter function, |
1104 | If you don't trust these, you should configure the "safe" filter function, |
1103 | C<CBOR::XS::safe_filter>, which by default only includes conversion |
1105 | C<CBOR::XS::safe_filter> (C<new_safe> does this), which by default only |
1104 | functions that are considered "safe" by the author (but again, they can be |
1106 | includes conversion functions that are considered "safe" by the author |
1105 | extended by third party modules). |
1107 | (but again, they can be extended by third party modules). |
1106 | |
1108 | |
1107 | Depending on your level of paranoia, you can use the "safe" filter: |
1109 | Depending on your level of paranoia, you can use the "safe" filter: |
1108 | |
1110 | |
1109 | $cbor->filter (\&CBOR::XS::safe_filter); |
1111 | $cbor->filter (\&CBOR::XS::safe_filter); |
1110 | |
1112 | |
… | |
… | |
1125 | the size of CBOR data you accept, or make sure that when your resources |
1127 | the size of CBOR data you accept, or make sure that when your resources |
1126 | run out, that's just fine (e.g. by using a separate process that can |
1128 | run out, that's just fine (e.g. by using a separate process that can |
1127 | crash safely). The size of a CBOR string in octets is usually a good |
1129 | crash safely). The size of a CBOR string in octets is usually a good |
1128 | indication of the size of the resources required to decode it into a Perl |
1130 | indication of the size of the resources required to decode it into a Perl |
1129 | structure. While CBOR::XS can check the size of the CBOR text (using |
1131 | structure. While CBOR::XS can check the size of the CBOR text (using |
1130 | C<max_size>), it might be too late when you already have it in memory, so |
1132 | C<max_size> - done by C<new_safe>), it might be too late when you already |
1131 | you might want to check the size before you accept the string. |
1133 | have it in memory, so you might want to check the size before you accept |
|
|
1134 | the string. |
1132 | |
1135 | |
1133 | As for encoding, it is possible to construct data structures that are |
1136 | As for encoding, it is possible to construct data structures that are |
1134 | relatively small but result in large CBOR texts (for example by having an |
1137 | relatively small but result in large CBOR texts (for example by having an |
1135 | array full of references to the same big data structure, which will all be |
1138 | array full of references to the same big data structure, which will all be |
1136 | deep-cloned during encoding by default). This is rarely an actual issue |
1139 | deep-cloned during encoding by default). This is rarely an actual issue |
… | |
… | |
1149 | method. |
1152 | method. |
1150 | |
1153 | |
1151 | =item Resource-starving attacks: CPU en-/decoding complexity |
1154 | =item Resource-starving attacks: CPU en-/decoding complexity |
1152 | |
1155 | |
1153 | CBOR::XS will use the L<Math::BigInt>, L<Math::BigFloat> and |
1156 | CBOR::XS will use the L<Math::BigInt>, L<Math::BigFloat> and |
1154 | L<Math::BigRat> libraries to represent encode/decode bignums. These can |
1157 | L<Math::BigRat> libraries to represent encode/decode bignums. These can be |
1155 | be very slow (as in, centuries of CPU time) and can even crash your |
1158 | very slow (as in, centuries of CPU time) and can even crash your program |
1156 | program (and are generally not very trustworthy). See the next section for |
1159 | (and are generally not very trustworthy). See the next section on bignum |
1157 | details. |
1160 | security for details. |
1158 | |
1161 | |
1159 | =item Data breaches: leaking information in error messages |
1162 | =item Data breaches: leaking information in error messages |
1160 | |
1163 | |
1161 | CBOR::XS might leak contents of your Perl data structures in its error |
1164 | CBOR::XS might leak contents of your Perl data structures in its error |
1162 | messages, so when you serialise sensitive information you might want to |
1165 | messages, so when you serialise sensitive information you might want to |