--- CBOR-XS/README 2016/04/27 09:40:18 1.17 +++ CBOR-XS/README 2016/12/07 14:14:30 1.18 @@ -81,6 +81,20 @@ my $cbor = CBOR::XS->new->encode ({a => [1,2]}); + $cbor = new_safe CBOR::XS + Create a new, safe/secure CBOR::XS object. This is similar to "new", + but configures the coder object to be safe to use with untrusted + data. Currently, this is equivalent to: + + my $cbor = CBOR::XS + ->new + ->forbid_objects + ->filter (\&CBOR::XS::safe_filter) + ->max_size (1e8); + + But is more future proof (it is better to crash because of a change + than to be exploited in other ways). + $cbor = $cbor->max_depth ([$maximum_nesting_depth]) $max_depth = $cbor->get_max_depth Sets the maximum nesting level (default 512) accepted while encoding @@ -103,7 +117,7 @@ value has been chosen to be as large as typical operating systems allow without crashing. - See SECURITY CONSIDERATIONS, below, for more info on why this is + See "SECURITY CONSIDERATIONS", below, for more info on why this is useful. $cbor = $cbor->max_size ([$maximum_string_size]) @@ -117,7 +131,7 @@ If no argument is given, the limit check will be deactivated (same as when 0 is specified). - See SECURITY CONSIDERATIONS, below, for more info on why this is + See "SECURITY CONSIDERATIONS", below, for more info on why this is useful. $cbor = $cbor->allow_unknown ([$enable]) @@ -143,7 +157,7 @@ This means that such values will only be encoded once, and will not result in a deep cloning of the value on decode, in decoders supporting the value sharing extension. This also makes it possible - to encode cyclic data structures (which need "allow_cycles" to ne + to encode cyclic data structures (which need "allow_cycles" to be enabled to be decoded by this module). It is recommended to leave it off unless you know your communication @@ -188,6 +202,25 @@ This option does not affect "encode" in any way - shared values and references will always be encoded properly if present. 
+ $cbor = $cbor->forbid_objects ([$enable]) $enabled = $cbor->get_forbid_objects Disables the use of the object serialiser protocol. + + If $enable is true (or missing), then "encode" will throw an + exception when it encounters perl objects that would be encoded + using the perl-object tag (26). When "decode" encounters such tags, + it will fall back to the general filter/tagged logic as if this were + an unknown tag (by default resulting in a "CBOR::XS::Tagged" + object). + + If $enable is false (the default), then "encode" will use the + Types::Serialiser object serialisation protocol to serialise objects + into perl-object tags, and "decode" will do the same to decode such + tags. + + See "SECURITY CONSIDERATIONS", below, for more info on why + forbidding this protocol can be useful. + $cbor = $cbor->pack_strings ([$enable]) $enabled = $cbor->get_pack_strings If $enable is true (or missing), then "encode" will try not to @@ -299,7 +332,20 @@ looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists it must be a code reference that is called with tag and value, and is responsible for decoding the value. If no entry exists, it - returns no values. + returns no values. "CBOR::XS" provides a number of default filter + functions already, and the %CBOR::XS::FILTER hash can be freely + extended with more. + + "CBOR::XS" additionally provides an alternative filter function that + is supposed to be safe to use with untrusted data (which the default + filter might not be), called "CBOR::XS::safe_filter", which works the + same as the "default_filter" but uses the %CBOR::XS::SAFE_FILTER + variable instead.
It is prepopulated with the tag decoding functions + that are deemed safe (basically the same as %CBOR::XS::FILTER + without all the bignum tags), and can be extended by user code as + well, although, obviously, one should be very careful about adding + decoding functions here, since the expectation is that they are safe + to use on untrusted data, after all. Example: decode all tags not handled internally into "CBOR::XS::Tagged" objects, with no other special handling (useful @@ -316,6 +362,26 @@ "tag 1347375694 value $value" }; + Example: provide your own filter function that looks up tags in your + own hash: + + my %my_filter = ( + 998347484 => sub { + my ($tag, $value) = @_; + + "tag 998347484 value $value" + }, + ); + + my $coder = CBOR::XS->new->filter (sub { + &{ $my_filter{$_[0]} or return } + }); + + Example: use the safe filter function (see "SECURITY CONSIDERATIONS" + for more considerations on security). + + CBOR::XS->new->filter (\&CBOR::XS::safe_filter)->decode ($cbor_data); + $cbor_data = $cbor->encode ($perl_scalar) Converts the given Perl data structure (a scalar value) to its CBOR representation. @@ -391,9 +457,9 @@ that subsequent calls to "incr_parse" or "incr_parse_multiple" start to parse a new CBOR value from the beginning of the $buffer again. - This method can be caled at any time, but it *must* be called if you - want to change your $buffer or there was a decoding error and you - want to reuse the $cbor object for future incremental parsings. + This method can be called at any time, but it *must* be called if + you want to change your $buffer or there was a decoding error and + you want to reuse the $cbor object for future incremental parsings. MAPPING This section describes how CBOR::XS maps Perl values to CBOR values and @@ -842,38 +908,109 @@ CBOR intact. SECURITY CONSIDERATIONS - When you are using CBOR in a protocol, talking to untrusted potentially - hostile creatures requires relatively few measures. + Tl;dr...
if you want to decode or encode CBOR from untrusted sources, + you should start with a coder object created via "new_safe": + + my $coder = CBOR::XS->new_safe; + + my $data = $coder->decode ($cbor_text); + my $cbor = $coder->encode ($data); - First of all, your CBOR decoder should be secure, that is, should not - have any buffer overflows. Obviously, this module should ensure that and - I am trying hard on making that true, but you never know. - - Second, you need to avoid resource-starving attacks. That means you - should limit the size of CBOR data you accept, or make sure then when - your resources run out, that's just fine (e.g. by using a separate - process that can crash safely). The size of a CBOR string in octets is - usually a good indication of the size of the resources required to - decode it into a Perl structure. While CBOR::XS can check the size of - the CBOR text, it might be too late when you already have it in memory, - so you might want to check the size before you accept the string. - - Third, CBOR::XS recurses using the C stack when decoding objects and - arrays. The C stack is a limited resource: for instance, on my amd64 - machine with 8MB of stack size I can decode around 180k nested arrays - but only 14k nested CBOR objects (due to perl itself recursing deeply on - croak to free the temporary). If that is exceeded, the program crashes. - To be conservative, the default nesting limit is set to 512. If your - process has a smaller stack, you should adjust this setting accordingly - with the "max_depth" method. - - Something else could bomb you, too, that I forgot to think of. In that - case, you get to keep the pieces. I am always open for hints, though... - - Also keep in mind that CBOR::XS might leak contents of your Perl data - structures in its error messages, so when you serialise sensitive - information you might want to make sure that exceptions thrown by - CBOR::XS will not end up in front of untrusted eyes. 
+ Longer version: When you are using CBOR in a protocol, talking to + untrusted potentially hostile creatures requires some thought: + + Security of the CBOR decoder itself + First and foremost, your CBOR decoder should be secure, that is, + should not have any buffer overflows or similar bugs that could + potentially be exploited. Obviously, this module should ensure that + and I am trying hard to make that true, but you never know. + + CBOR::XS can invoke almost arbitrary callbacks during decoding + CBOR::XS supports object serialisation - decoding CBOR can cause + calls to *any* "THAW" method in *any* package that exists in your + process (that is, CBOR::XS will not try to load modules, but any + existing "THAW" method or function can be called, so they all have + to be secure). + + Less obviously, it will also invoke "TO_CBOR" and "FREEZE" methods - + even if all your "THAW" methods are secure, encoding data structures + from untrusted sources can invoke those and trigger bugs in those. + + So, if you are not sure about the security of all the modules you + have loaded (you shouldn't be), you should disable this part using + "forbid_objects". + + CBOR can be extended with tags that call library code + CBOR can be extended with tags, and "CBOR::XS" has a registry of + conversion functions for many existing tags that can be extended via + third-party modules (see the "filter" method). + + If you don't trust these, you should configure the "safe" filter + function, "CBOR::XS::safe_filter", which by default only includes + conversion functions that are considered "safe" by the author (but + again, they can be extended by third party modules). + + Depending on your level of paranoia, you can use the "safe" filter: + + $cbor->filter (\&CBOR::XS::safe_filter); + + ... your own filter... + + $cbor->filter (sub { ... do your stuffs here ... }); + + ...
or even no filter at all, disabling all tag decoding: + + $cbor->filter (sub { }); + + This is never a problem for encoding, as the tag mechanism only + exists in CBOR texts. + + Resource-starving attacks: object memory usage + You need to avoid resource-starving attacks. That means you should + limit the size of CBOR data you accept, or make sure that when your + resources run out, that's just fine (e.g. by using a separate + process that can crash safely). The size of a CBOR string in octets + is usually a good indication of the size of the resources required + to decode it into a Perl structure. While CBOR::XS can check the + size of the CBOR text (using "max_size"), it might be too late when + you already have it in memory, so you might want to check the size + before you accept the string. + + As for encoding, it is possible to construct data structures that + are relatively small but result in large CBOR texts (for example by + having an array full of references to the same big data structure, + which will all be deep-cloned during encoding by default). This is + rarely an actual issue (and the worst case is still just running out + of memory), but you can reduce this risk by using "allow_sharing". + + Resource-starving attacks: stack overflows + CBOR::XS recurses using the C stack when decoding objects and + arrays. The C stack is a limited resource: for instance, on my amd64 + machine with 8MB of stack size I can decode around 180k nested + arrays but only 14k nested CBOR objects (due to perl itself + recursing deeply on croak to free the temporary). If that is + exceeded, the program crashes. To be conservative, the default + nesting limit is set to 512. If your process has a smaller stack, + you should adjust this setting accordingly with the "max_depth" + method. + + Resource-starving attacks: CPU en-/decoding complexity + CBOR::XS will use the Math::BigInt, Math::BigFloat and Math::BigRat + libraries to represent and encode/decode bignums.
These can be very slow + (as in, centuries of CPU time) and can even crash your program (and + are generally not very trustworthy). See the next section for + details. + + Data breaches: leaking information in error messages + CBOR::XS might leak contents of your Perl data structures in its + error messages, so when you serialise sensitive information you + might want to make sure that exceptions thrown by CBOR::XS will not + end up in front of untrusted eyes. + + Something else... + Something else could bomb you, too, that I forgot to think of. In + that case, you get to keep the pieces. I am always open for hints, + though... BIGNUM SECURITY CONSIDERATIONS CBOR::XS provides a "TO_CBOR" method for both Math::BigInt and