--- JSON-XS/XS.pm 2016/09/07 17:14:56 1.159 +++ JSON-XS/XS.pm 2018/11/15 20:49:12 1.169 @@ -42,8 +42,8 @@ overridden) with no overhead due to emulation (by inheriting constructor and methods). If JSON::XS is not available, it will fall back to the compatible JSON::PP module as backend, so using JSON instead of JSON::XS -gives you a portable JSON API that can be fast when you need and doesn't -require a C compiler when that is a problem. +gives you a portable JSON API that can be fast when you need it and +doesn't require a C compiler when that is a problem. As this is the n-th-something JSON module on CPAN, what was the reason to write yet another JSON module? While it seems there are many JSON @@ -103,7 +103,7 @@ use common::sense; -our $VERSION = 3.02; +our $VERSION = 3.04; our @ISA = qw(Exporter); our @EXPORT = qw(encode_json decode_json); @@ -133,8 +133,8 @@ =item $perl_scalar = decode_json $json_text -The opposite of C: expects an UTF-8 (binary) string and tries -to parse that as an UTF-8 encoded JSON text, returning the resulting +The opposite of C: expects a UTF-8 (binary) string and tries +to parse that as a UTF-8 encoded JSON text, returning the resulting reference. Croaks on error. This function call is functionally identical to: @@ -204,7 +204,9 @@ =item $json = new JSON::XS Creates a new JSON::XS object that can be used to de/encode JSON -strings. All boolean flags described below are by default I. +strings. All boolean flags described below are by default I +(with the exception of C, which defaults to I since +version C<4.0>). The mutators for flags all return the JSON object again and thus calls can be chained: @@ -272,7 +274,7 @@ If C<$enable> is true (or missing), then the C method will encode the JSON result into UTF-8, as required by many protocols, while the -C method expects to be handled an UTF-8-encoded string. Please +C method expects to be handed a UTF-8-encoded string. 
Please note that UTF-8-encoded strings do not contain any characters outside the range C<0..255>, they are thus useful for bytewise/binary I/O. In future versions, enabling this option might enable autodetection of the UTF-16 @@ -367,7 +369,7 @@ If C<$enable> is true (or missing), then C will accept some extensions to normal JSON syntax (see below). C will not be -affected in anyway. I. I suggest only to use this option to parse application-specific files written by humans (configuration files, resource files etc.) @@ -443,6 +445,9 @@ =item $enabled = $json->get_allow_nonref +Unlike other boolean options, this option is enabled by default beginning +with version C<4.0>. See L for the gory details. + If C<$enable> is true (or missing), then the C method can convert a non-reference into its corresponding string, number or null JSON value, which is an extension to RFC4627. Likewise, C will accept those JSON @@ -453,11 +458,11 @@ or array. Likewise, C will croak if given something that is not a JSON object or array. -Example, encode a Perl scalar as JSON value with enabled C, -resulting in an invalid JSON text: +Example, encode a Perl scalar as JSON value without enabled C, +resulting in an error: - JSON::XS->new->allow_nonref->encode ("Hello, World!") - => "Hello, World!" + JSON::XS->new->allow_nonref (0)->encode ("Hello, World!") + => hash- or arrayref expected... =item $json = $json->allow_unknown ([$enable]) @@ -517,7 +522,7 @@ =item $json = $json->allow_tags ([$enable]) -=item $enabled = $json->allow_tags +=item $enabled = $json->get_allow_tags See L for details. @@ -536,13 +541,13 @@ =item $json = $json->filter_json_object ([$coderef->($hashref)]) When C<$coderef> is specified, it will be called from C each -time it decodes a JSON object. The only argument is a reference to the -newly-created hash. If the code references returns a single scalar (which -need not be a reference), this value (i.e. 
a copy of that scalar to avoid -aliasing) is inserted into the deserialised data structure. If it returns -an empty list (NOTE: I C, which is a valid scalar), the -original deserialised hash will be inserted. This setting can slow down -decoding considerably. +time it decodes a JSON object. The only argument is a reference to +the newly-created hash. If the code reference returns a single scalar +(which need not be a reference), this value (or rather a copy of it) is +inserted into the deserialised data structure. If it returns an empty +list (NOTE: I C, which is a valid scalar), the original +deserialised hash will be inserted. This setting can slow down decoding +considerably. When C<$coderef> is omitted or undefined, any existing callback will be removed and C will not change the deserialised hash in any @@ -771,6 +776,10 @@ real world conditions). As a special exception, you can also call this method before having parsed anything. +That means you can only use this function to look at or manipulate text +before or after complete JSON objects, not while the parser is in the +middle of parsing a JSON object. + This function is useful in two cases: a) finding the trailing text after a JSON object or b) parsing multiple JSON objects separated by non-JSON text (such as commas). @@ -1287,7 +1296,7 @@ that. The C flag therefore switches between two modes: disabled means you -will get a Unicode string in Perl, enabled means you get an UTF-8 encoded +will get a Unicode string in Perl, enabled means you get a UTF-8 encoded octet/binary string in Perl. =item C or C flags enabled @@ -1565,45 +1574,46 @@ security right). -=head1 "OLD" VS. "NEW" JSON (RFC 4627 VS. RFC 7159) +=head2 "OLD" VS. "NEW" JSON (RFC 4627 VS. 
RFC 7159) -TL;DR: Due to security concerns, JSON::XS will not allow scalar data in -JSON texts by default - you need to create your own JSON::XS object and -enable C: - - - my $json = JSON::XS->new->allow_nonref; - - $text = $json->encode ($data); - $data = $json->decode ($text); - -The long version: JSON being an important and supposedly stable format, -the IETF standardised it as RFC 4627 in 2006. Unfortunately, the inventor -of JSON, Dougles Crockford, unilaterally changed the definition of JSON in -javascript. Rather than create a fork, the IETF decided to standardise the -new syntax (apparently, so Iw as told, without finding it very amusing). - -The biggest difference between thed original JSON and the new JSON is that -the new JSON supports scalars (anything other than arrays and objects) at -the toplevel of a JSON text. While this is strictly backwards compatible -to older versions, it breaks a number of protocols that relied on sending -JSON back-to-back, and is a minor security concern. - -For example, imagine you have two banks communicating, and on one side, -trhe JSON coder gets upgraded. Two messages, such as C<10> and C<1000> -might then be confused to mean C<101000>, something that couldn't happen -in the original JSON, because niether of these messages would be valid -JSON. - -If one side accepts these messages, then an upgrade in the coder on either -side could result in this becoming exploitable. - -This module has always allowed these messages as an optional extension, by -default disabled. The security concerns are the reason why the default is -still disabled, but future versions might/will likely upgrade to the newer -RFC as default format, so you are advised to check your implementation -and/or override the default with C<< ->allow_nonref (0) >> to ensure that -future versions are safe. +JSON originally required JSON texts to represent an array or object - +scalar values were explicitly not allowed. 
This has changed, and versions +of JSON::XS beginning with C<4.0> reflect this by allowing scalar values +by default. + +One reason why one might not want this is that this removes a fundamental +property of JSON texts, namely that they are self-delimited and +self-contained, or in other words, you could take any number of "old" +JSON texts and paste them together, and the result would be unambiguously +parseable: + + [1,3]{"k":5}[][null] # four JSON texts, without doubt + +By allowing scalars, this property is lost: in the following example, is +this one JSON text (the number 12) or two JSON texts (the numbers 1 and +2): + + 12 # could be 12, or 1 and 2 + +Another lost property of "old" JSON is that no lookahead is required to +know the end of a JSON text, i.e. the JSON text definitely ended at the +last C<]> or C<}> character, there was no need to read extra characters. + +For example, a viable network protocol with "old" JSON was to simply +exchange JSON texts without delimiter. For "new" JSON, you have to use a +suitable delimiter (such as a newline) after every JSON text or ensure you +never encode/decode scalar values. + +Most protocols do work by only transferring arrays or objects, and the +easiest way to avoid problems with the "new" JSON definition is to +explicitly disallow scalar values in your encoder and decoder: + + $json_coder = JSON::XS->new->allow_nonref (0) + +This is a somewhat unhappy situation, and the blame can fully be put on +JSON's inventor, Douglas Crockford, who unilaterally changed the format +in 2006 without consulting the IETF, forcing the IETF to either fork the +format or go with it (as I was told, the IETF wasn't amused). =head1 INTEROPERABILITY WITH OTHER MODULES @@ -1694,14 +1704,11 @@ will change. 
-=head1 THREADS - -This module is I<not> guaranteed to be thread safe and there are no -plans to change this until Perl gets thread support (as opposed to the -horribly slow so-called "threads" which are simply slow and bloated -process simulations - use fork, it's I faster, cheaper, better). +=head1 (I-)THREADS -(It might actually work, but you have been warned). +This module is I<not> guaranteed to be ithread (or MULTIPLICITY-) safe +and there are no plans to change this. Note that perl's builtin so-called +threads/ithreads are officially deprecated and should not be used. =head1 THE PERILS OF SETLOCALE