… | |
… | |
79 | The mutators for flags all return the CBOR object again and thus |
79 | The mutators for flags all return the CBOR object again and thus |
80 | calls can be chained: |
80 | calls can be chained: |
81 | |
81 | |
82 | my $cbor = CBOR::XS->new->encode ({a => [1,2]}); |
82 | my $cbor = CBOR::XS->new->encode ({a => [1,2]}); |
83 | |
83 | |
|
|
84 | $cbor = new_safe CBOR::XS |
|
|
85 | Create a new, safe/secure CBOR::XS object. This is similar to "new", |
|
|
86 | but configures the coder object to be safe to use with untrusted |
|
|
87 | data. Currently, this is equivalent to: |
|
|
88 | |
|
|
89 | my $cbor = CBOR::XS |
|
|
90 | ->new |
|
|
91 | ->forbid_objects |
|
|
92 | ->filter (\&CBOR::XS::safe_filter) |
|
|
93 | ->max_size (1e8); |
|
|
94 | |
|
|
95 | But is more future proof (it is better to crash because of a change |
|
|
96 | than to be exploited in other ways). |
|
|
97 | |
84 | $cbor = $cbor->max_depth ([$maximum_nesting_depth]) |
98 | $cbor = $cbor->max_depth ([$maximum_nesting_depth]) |
85 | $max_depth = $cbor->get_max_depth |
99 | $max_depth = $cbor->get_max_depth |
86 | Sets the maximum nesting level (default 512) accepted while encoding |
100 | Sets the maximum nesting level (default 512) accepted while encoding |
87 | or decoding. If a higher nesting level is detected in CBOR data or a |
101 | or decoding. If a higher nesting level is detected in CBOR data or a |
88 | Perl data structure, then the encoder and decoder will stop and |
102 | Perl data structure, then the encoder and decoder will stop and |
… | |
… | |
101 | |
115 | |
102 | Note that nesting is implemented by recursion in C. The default |
116 | Note that nesting is implemented by recursion in C. The default |
103 | value has been chosen to be as large as typical operating systems |
117 | value has been chosen to be as large as typical operating systems |
104 | allow without crashing. |
118 | allow without crashing. |
105 | |
119 | |
106 | See SECURITY CONSIDERATIONS, below, for more info on why this is |
120 | See "SECURITY CONSIDERATIONS", below, for more info on why this is |
107 | useful. |
121 | useful. |
108 | |
122 | |
109 | $cbor = $cbor->max_size ([$maximum_string_size]) |
123 | $cbor = $cbor->max_size ([$maximum_string_size]) |
110 | $max_size = $cbor->get_max_size |
124 | $max_size = $cbor->get_max_size |
111 | Set the maximum length a CBOR string may have (in bytes) where |
125 | Set the maximum length a CBOR string may have (in bytes) where |
… | |
… | |
115 | exception. This setting has no effect on "encode" (yet). |
129 | exception. This setting has no effect on "encode" (yet). |
116 | |
130 | |
117 | If no argument is given, the limit check will be deactivated (same |
131 | If no argument is given, the limit check will be deactivated (same |
118 | as when 0 is specified). |
132 | as when 0 is specified). |
119 | |
133 | |
120 | See SECURITY CONSIDERATIONS, below, for more info on why this is |
134 | See "SECURITY CONSIDERATIONS", below, for more info on why this is |
121 | useful. |
135 | useful. |
122 | |
136 | |
123 | $cbor = $cbor->allow_unknown ([$enable]) |
137 | $cbor = $cbor->allow_unknown ([$enable]) |
124 | $enabled = $cbor->get_allow_unknown |
138 | $enabled = $cbor->get_allow_unknown |
125 | If $enable is true (or missing), then "encode" will *not* throw an |
139 | If $enable is true (or missing), then "encode" will *not* throw an |
… | |
… | |
141 | instead will emit a reference to the earlier value. |
155 | instead will emit a reference to the earlier value. |
142 | |
156 | |
143 | This means that such values will only be encoded once, and will not |
157 | This means that such values will only be encoded once, and will not |
144 | result in a deep cloning of the value on decode, in decoders |
158 | result in a deep cloning of the value on decode, in decoders |
145 | supporting the value sharing extension. This also makes it possible |
159 | supporting the value sharing extension. This also makes it possible |
146 | to encode cyclic data structures (which need "allow_cycles" to ne |
160 | to encode cyclic data structures (which need "allow_cycles" to be |
147 | enabled to be decoded by this module). |
161 | enabled to be decoded by this module). |
148 | |
162 | |
149 | It is recommended to leave it off unless you know your communication |
163 | It is recommended to leave it off unless you know your communication |
150 | partner supports the value sharing extensions to CBOR |
164 | partner supports the value sharing extensions to CBOR |
151 | (<http://cbor.schmorp.de/value-sharing>), as without decoder |
165 | (<http://cbor.schmorp.de/value-sharing>), as without decoder |
… | |
… | |
185 | cyclic data structures using weak references when this option is |
199 | cyclic data structures using weak references when this option is |
186 | off, instead of throwing an error. |
200 | off, instead of throwing an error. |
187 | |
201 | |
188 | This option does not affect "encode" in any way - shared values and |
202 | This option does not affect "encode" in any way - shared values and |
189 | references will always be encoded properly if present. |
203 | references will always be encoded properly if present. |
|
|
204 | |
|
|
205 | $cbor = $cbor->forbid_objects ([$enable]) |
|
|
206 | $enabled = $cbor->get_forbid_objects |
|
|
207 | Disables the use of the object serialiser protocol. |
|
|
208 | |
|
|
209 | If $enable is true (or missing), then "encode" will throw an |
|
|
210 | exception when it encounters perl objects that would be encoded |
|
|
211 | using the perl-object tag (26). When "decode" encounters such tags, |
|
|
212 | it will fall back to the general filter/tagged logic as if this were |
|
|
213 | an unknown tag (by default resulting in a "CBOR::XS::Tagged" |
|
|
214 | object). |
|
|
215 | |
|
|
216 | If $enable is false (the default), then "encode" will use the |
|
|
217 | Types::Serialiser object serialisation protocol to serialise objects |
|
|
218 | into perl-object tags, and "decode" will do the same to decode such |
|
|
219 | tags. |
|
|
220 | |
|
|
221 | See "SECURITY CONSIDERATIONS", below, for more info on why |
|
|
222 | forbidding this protocol can be useful. |
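A minimal sketch of the behaviour described above, assuming a CBOR::XS version that provides "forbid_objects". The class My::Point is hypothetical and exists only so that there is an object with a "FREEZE" method to encode:

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical class, for illustration only.
package My::Point {
    sub new { my ($class, %args) = @_; bless {%args}, $class }

    # Types::Serialiser protocol: return the values THAW would receive.
    sub FREEZE { my ($self, $serialiser) = @_; %$self }
}

my $strict = CBOR::XS->new->forbid_objects;

# With forbid_objects enabled, encoding the object should throw
# instead of emitting a perl-object tag.
my $ok = eval { $strict->encode (My::Point->new (x => 1)); 1 };

print $ok ? "encoded\n" : "refused: $@";
```

Without the "forbid_objects" call, the same "encode" call would be expected to serialise the object via its "FREEZE" method.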
190 | |
223 | |
191 | $cbor = $cbor->pack_strings ([$enable]) |
224 | $cbor = $cbor->pack_strings ([$enable]) |
192 | $enabled = $cbor->get_pack_strings |
225 | $enabled = $cbor->get_pack_strings |
193 | If $enable is true (or missing), then "encode" will try not to |
226 | If $enable is true (or missing), then "encode" will try not to |
194 | encode the same string twice, but will instead encode a reference to |
227 | encode the same string twice, but will instead encode a reference to |
… | |
… | |
297 | When the filter is cleared (the default state), the default filter |
330 | When the filter is cleared (the default state), the default filter |
298 | function, "CBOR::XS::default_filter", is used. This function simply |
331 | function, "CBOR::XS::default_filter", is used. This function simply |
299 | looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists |
332 | looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists |
300 | it must be a code reference that is called with tag and value, and |
333 | it must be a code reference that is called with tag and value, and |
301 | is responsible for decoding the value. If no entry exists, it |
334 | is responsible for decoding the value. If no entry exists, it |
302 | returns no values. |
335 | returns no values. "CBOR::XS" provides a number of default filter |
|
|
336 | functions already, and the %CBOR::XS::FILTER hash can be freely |
|
|
337 | extended with more. |
|
|
338 | |
|
|
339 | "CBOR::XS" additionally provides an alternative filter function that |
|
|
340 | is supposed to be safe to use with untrusted data (which the default |
|
|
341 | filter might not), called "CBOR::XS::safe_filter", which works the |
|
|
342 | same as the "default_filter" but uses the %CBOR::XS::SAFE_FILTER |
|
|
343 | variable instead. It is prepopulated with the tag decoding functions |
|
|
344 | that are deemed safe (basically the same as %CBOR::XS::FILTER |
|
|
345 | without all the bignum tags), and can be extended by user code as |
|
|
346 | well, although, obviously, one should be very careful about adding |
|
|
347 | decoding functions here, since the expectation is that they are safe |
|
|
348 | to use on untrusted data, after all. |
303 | |
349 | |
304 | Example: decode all tags not handled internally into |
350 | Example: decode all tags not handled internally into |
305 | "CBOR::XS::Tagged" objects, with no other special handling (useful |
351 | "CBOR::XS::Tagged" objects, with no other special handling (useful |
306 | when working with potentially "unsafe" CBOR data). |
352 | when working with potentially "unsafe" CBOR data). |
307 | |
353 | |
… | |
… | |
313 | $CBOR::XS::FILTER{1347375694} = sub { |
359 | $CBOR::XS::FILTER{1347375694} = sub { |
314 | my ($tag, $value) = @_; |
360 | my ($tag, $value) = @_; |
315 | |
361 | |
316 | "tag 1347375694 value $value" |
362 | "tag 1347375694 value $value" |
317 | }; |
363 | }; |
|
|
364 | |
|
|
365 | Example: provide your own filter function that looks up tags in your |
|
|
366 | own hash: |
|
|
367 | |
|
|
368 | my %my_filter = ( |
|
|
369 | 998347484 => sub { |
|
|
370 | my ($tag, $value) = @_; |
|
|
371 | |
|
|
372 | "tag 998347484 value $value" |
|
|
373 | }, |
|
|
374 | ); |
|
|
375 | |
|
|
376 | my $coder = CBOR::XS->new->filter (sub { |
|
|
377 | &{ $my_filter{$_[0]} or return } |
|
|
378 | }); |
|
|
379 | |
|
|
380 | Example: use the safe filter function (see "SECURITY CONSIDERATIONS" |
|
|
381 | for more considerations on security). |
|
|
382 | |
|
|
383 | CBOR::XS->new->filter (\&CBOR::XS::safe_filter)->decode ($cbor_data); |
318 | |
384 | |
319 | $cbor_data = $cbor->encode ($perl_scalar) |
385 | $cbor_data = $cbor->encode ($perl_scalar) |
320 | Converts the given Perl data structure (a scalar value) to its CBOR |
386 | Converts the given Perl data structure (a scalar value) to its CBOR |
321 | representation. |
387 | representation. |
322 | |
388 | |
… | |
… | |
389 | $cbor->incr_reset |
455 | $cbor->incr_reset |
390 | Resets the incremental decoder. This throws away any saved state, so |
456 | Resets the incremental decoder. This throws away any saved state, so |
391 | that subsequent calls to "incr_parse" or "incr_parse_multiple" start |
457 | that subsequent calls to "incr_parse" or "incr_parse_multiple" start |
392 | to parse a new CBOR value from the beginning of the $buffer again. |
458 | to parse a new CBOR value from the beginning of the $buffer again. |
393 | |
459 | |
394 | This method can be caled at any time, but it *must* be called if you |
460 | This method can be called at any time, but it *must* be called if |
395 | want to change your $buffer or there was a decoding error and you |
461 | you want to change your $buffer or there was a decoding error and |
396 | want to reuse the $cbor object for future incremental parsings. |
462 | you want to reuse the $cbor object for future incremental parsings. |
397 | |
463 | |
398 | MAPPING |
464 | MAPPING |
399 | This section describes how CBOR::XS maps Perl values to CBOR values and |
465 | This section describes how CBOR::XS maps Perl values to CBOR values and |
400 | vice versa. These mappings are designed to "do the right thing" in most |
466 | vice versa. These mappings are designed to "do the right thing" in most |
401 | circumstances automatically, preserving round-tripping characteristics |
467 | circumstances automatically, preserving round-tripping characteristics |
… | |
… | |
840 | interoperability is improved in the future, then the goal will be to |
906 | interoperability is improved in the future, then the goal will be to |
841 | ensure that decoded JSON data will round-trip encoding and decoding to |
907 | ensure that decoded JSON data will round-trip encoding and decoding to |
842 | CBOR intact. |
908 | CBOR intact. |
843 | |
909 | |
844 | SECURITY CONSIDERATIONS |
910 | SECURITY CONSIDERATIONS |
845 | When you are using CBOR in a protocol, talking to untrusted potentially |
911 | Tl;dr... if you want to decode or encode CBOR from untrusted sources, |
846 | hostile creatures requires relatively few measures. |
912 | you should start with a coder object created via "new_safe": |
847 | |
913 | |
|
|
914 | my $coder = CBOR::XS->new_safe; |
|
|
915 | |
|
|
916 | my $data = $coder->decode ($cbor_text); |
|
|
917 | my $cbor = $coder->encode ($data); |
|
|
918 | |
|
|
919 | Longer version: When you are using CBOR in a protocol, talking to |
|
|
920 | untrusted potentially hostile creatures requires some thought: |
|
|
921 | |
|
|
922 | Security of the CBOR decoder itself |
848 | First of all, your CBOR decoder should be secure, that is, should not |
923 | First and foremost, your CBOR decoder should be secure, that is, |
849 | have any buffer overflows. Obviously, this module should ensure that and |
924 | should not have any buffer overflows or similar bugs that could |
|
|
925 | potentially be exploited. Obviously, this module should ensure that |
850 | I am trying hard on making that true, but you never know. |
926 | and I am trying hard to make that true, but you never know. |
851 | |
927 | |
|
|
928 | CBOR::XS can invoke almost arbitrary callbacks during decoding |
|
|
929 | CBOR::XS supports object serialisation - decoding CBOR can cause |
|
|
930 | calls to *any* "THAW" method in *any* package that exists in your |
|
|
931 | process (that is, CBOR::XS will not try to load modules, but any |
|
|
932 | existing "THAW" method or function can be called, so they all have |
|
|
933 | to be secure). |
|
|
934 | |
|
|
935 | Less obviously, it will also invoke "TO_CBOR" and "FREEZE" methods - |
|
|
936 | even if all your "THAW" methods are secure, encoding data structures |
|
|
937 | from untrusted sources can invoke them and trigger bugs in them. |
|
|
938 | |
|
|
939 | So, if you are not sure about the security of all the modules you |
|
|
940 | have loaded (you shouldn't), you should disable this part using |
|
|
941 | "forbid_objects". |
|
|
942 | |
|
|
943 | CBOR can be extended with tags that call library code |
|
|
944 | CBOR can be extended with tags, and "CBOR::XS" has a registry of |
|
|
945 | conversion functions for many existing tags that can be extended via |
|
|
946 | third-party modules (see the "filter" method). |
|
|
947 | |
|
|
948 | If you don't trust these, you should configure the "safe" filter |
|
|
949 | function, "CBOR::XS::safe_filter", which by default only includes |
|
|
950 | conversion functions that are considered "safe" by the author (but |
|
|
951 | again, they can be extended by third party modules). |
|
|
952 | |
|
|
953 | Depending on your level of paranoia, you can use the "safe" filter: |
|
|
954 | |
|
|
955 | $cbor->filter (\&CBOR::XS::safe_filter); |
|
|
956 | |
|
|
957 | ... your own filter... |
|
|
958 | |
|
|
959 | $cbor->filter (sub { ... do your stuffs here ... }); |
|
|
960 | |
|
|
961 | ... or even no filter at all, disabling all tag decoding: |
|
|
962 | |
|
|
963 | $cbor->filter (sub { }); |
|
|
964 | |
|
|
965 | This is never a problem for encoding, as the tag mechanism only |
|
|
966 | exists in CBOR texts. |
|
|
967 | |
|
|
968 | Resource-starving attacks: object memory usage |
852 | Second, you need to avoid resource-starving attacks. That means you |
969 | You need to avoid resource-starving attacks. That means you should |
853 | should limit the size of CBOR data you accept, or make sure then when |
970 | limit the size of CBOR data you accept, or make sure that when your |
854 | your resources run out, that's just fine (e.g. by using a separate |
971 | resources run out, that's just fine (e.g. by using a separate |
855 | process that can crash safely). The size of a CBOR string in octets is |
972 | process that can crash safely). The size of a CBOR string in octets |
856 | usually a good indication of the size of the resources required to |
973 | is usually a good indication of the size of the resources required |
857 | decode it into a Perl structure. While CBOR::XS can check the size of |
974 | to decode it into a Perl structure. While CBOR::XS can check the |
858 | the CBOR text, it might be too late when you already have it in memory, |
975 | size of the CBOR text (using "max_size"), it might be too late when |
859 | so you might want to check the size before you accept the string. |
976 | you already have it in memory, so you might want to check the size |
|
|
977 | before you accept the string. |
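One way to apply this advice is to reject oversized input before handing it to the decoder. The sketch below is only an illustration: the helper name decode_limited and the limit of one million octets are assumptions, and it relies on a CBOR::XS version that provides "new_safe":

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical application-level limit, in octets.
my $MAX_BYTES = 1_000_000;

# Hypothetical helper: refuse oversized input, then decode with a
# coder configured for untrusted data.
sub decode_limited {
    my ($octets) = @_;

    die "CBOR input too large\n"
        if length ($octets) > $MAX_BYTES;

    CBOR::XS->new_safe->decode ($octets)
}
```

In a network service the check would ideally happen even earlier, while reading from the socket, so that the oversized string is never held in memory at all.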
860 | |
978 | |
|
|
979 | As for encoding, it is possible to construct data structures that |
|
|
980 | are relatively small but result in large CBOR texts (for example by |
|
|
981 | having an array full of references to the same big data structure, |
|
|
982 | which will all be deep-cloned during encoding by default). This is |
|
|
983 | rarely an actual issue (and the worst case is still just running out |
|
|
984 | of memory), but you can reduce this risk by using "allow_sharing". |
|
|
985 | |
|
|
986 | Resource-starving attacks: stack overflows |
861 | Third, CBOR::XS recurses using the C stack when decoding objects and |
987 | CBOR::XS recurses using the C stack when decoding objects and |
862 | arrays. The C stack is a limited resource: for instance, on my amd64 |
988 | arrays. The C stack is a limited resource: for instance, on my amd64 |
863 | machine with 8MB of stack size I can decode around 180k nested arrays |
989 | machine with 8MB of stack size I can decode around 180k nested |
864 | but only 14k nested CBOR objects (due to perl itself recursing deeply on |
990 | arrays but only 14k nested CBOR objects (due to perl itself |
865 | croak to free the temporary). If that is exceeded, the program crashes. |
991 | recursing deeply on croak to free the temporary). If that is |
866 | To be conservative, the default nesting limit is set to 512. If your |
992 | exceeded, the program crashes. To be conservative, the default |
867 | process has a smaller stack, you should adjust this setting accordingly |
993 | nesting limit is set to 512. If your process has a smaller stack, |
868 | with the "max_depth" method. |
994 | you should adjust this setting accordingly with the "max_depth" |
|
|
995 | method. |
869 | |
996 | |
|
|
997 | Resource-starving attacks: CPU en-/decoding complexity |
|
|
998 | CBOR::XS will use the Math::BigInt, Math::BigFloat and Math::BigRat |
|
|
999 | libraries to represent, encode and decode bignums. These can be very slow |
|
|
1000 | (as in, centuries of CPU time) and can even crash your program (and |
|
|
1001 | are generally not very trustworthy). See the next section for |
|
|
1002 | details. |
|
|
1003 | |
|
|
1004 | Data breaches: leaking information in error messages |
|
|
1005 | CBOR::XS might leak contents of your Perl data structures in its |
|
|
1006 | error messages, so when you serialise sensitive information you |
|
|
1007 | might want to make sure that exceptions thrown by CBOR::XS will not |
|
|
1008 | end up in front of untrusted eyes. |
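A possible way to keep such details away from untrusted peers is to catch the exception, log it privately, and rethrow a generic message. This is a sketch under assumptions: the helper name decode_quietly and the wording of the generic message are hypothetical, and logging via "warn" stands in for whatever private log the application uses:

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical helper: decode, but never let the detailed CBOR::XS
# error text propagate to the caller.
sub decode_quietly {
    my ($coder, $octets) = @_;

    my $data;
    my $ok = eval { $data = $coder->decode ($octets); 1 };

    unless ($ok) {
        # Keep the detailed message in a private log only.
        warn "CBOR decode failed: $@";
        die "malformed CBOR input\n";
    }

    $data
}
```

The same wrapping can be applied to "encode", which is where the contents of your own data structures are most likely to show up in a message.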
|
|
1009 | |
|
|
1010 | Something else... |
870 | Something else could bomb you, too, that I forgot to think of. In that |
1011 | Something else could bomb you, too, that I forgot to think of. In |
871 | case, you get to keep the pieces. I am always open for hints, though... |
1012 | that case, you get to keep the pieces. I am always open for hints, |
872 | |
1013 | though... |
873 | Also keep in mind that CBOR::XS might leak contents of your Perl data |
|
|
874 | structures in its error messages, so when you serialise sensitive |
|
|
875 | information you might want to make sure that exceptions thrown by |
|
|
876 | CBOR::XS will not end up in front of untrusted eyes. |
|
|
877 | |
1014 | |
878 | BIGNUM SECURITY CONSIDERATIONS |
1015 | BIGNUM SECURITY CONSIDERATIONS |
879 | CBOR::XS provides a "TO_CBOR" method for both Math::BigInt and |
1016 | CBOR::XS provides a "TO_CBOR" method for both Math::BigInt and |
880 | Math::BigFloat that tries to encode the number in the simplest possible |
1017 | Math::BigFloat that tries to encode the number in the simplest possible |
881 | way, that is, either a CBOR integer, a CBOR bigint/decimal fraction (tag |
1018 | way, that is, either a CBOR integer, a CBOR bigint/decimal fraction (tag |