--- JSON-XS/README 2013/10/29 15:55:49 1.39
+++ JSON-XS/README 2016/02/26 21:46:45 1.40
@@ -354,6 +354,16 @@
                     # neither this one...
               ]
 
+        *   literal ASCII TAB characters in strings
+
+            Literal ASCII TAB characters are now allowed in strings (and
+            treated as "\t").
+
+              [
+                 "Hello\tWorld",
+                 "Hello<TAB>World", # literal <TAB> would not normally be allowed
+              ]
+
     $json = $json->canonical ([$enable])
     $enabled = $json->get_canonical
         If $enable is true (or missing), then the "encode" method will
@@ -624,7 +634,7 @@
         protocol and you need to know where the JSON text ends.
 
            JSON::XS->new->decode_prefix ("[1] the tail")
-           => ([], 3)
+           => ([1], 3)
 
 INCREMENTAL PARSING
     In some cases, there is the need for incremental parsing of JSON texts.
@@ -1431,12 +1441,126 @@
     deal with it, as major browser developers care only for features, not
     about getting security right).
 
+"OLD" VS. "NEW" JSON (RFC 4627 VS. RFC 7159)
+    TL;DR: Due to security concerns, JSON::XS will not allow scalar data in
+    JSON texts by default - you need to create your own JSON::XS object and
+    enable "allow_nonref":
+
+       my $json = JSON::XS->new->allow_nonref;
+
+       $text = $json->encode ($data);
+       $data = $json->decode ($text);
+
+    The long version: JSON being an important and supposedly stable format,
+    the IETF standardised it as RFC 4627 in 2006. Unfortunately, the
+    inventor of JSON, Douglas Crockford, unilaterally changed the definition
+    of JSON in javascript. Rather than create a fork, the IETF decided to
+    standardise the new syntax (apparently, so I was told, without finding
+    it very amusing).
+
+    The biggest difference between the original JSON and the new JSON is
+    that the new JSON supports scalars (anything other than arrays and
+    objects) at the toplevel of a JSON text. While this is strictly
+    backwards compatible to older versions, it breaks a number of protocols
+    that relied on sending JSON back-to-back, and is a minor security
+    concern.
+
+    For example, imagine you have two banks communicating, and on one side,
+    the JSON coder gets upgraded. Two messages, such as 10 and 1000 might
+    then be confused to mean 101000, something that couldn't happen in the
+    original JSON, because neither of these messages would be valid JSON.
+
+    If one side accepts these messages, then an upgrade in the coder on
+    either side could result in this becoming exploitable.
+
+    This module has always allowed these messages as an optional extension,
+    by default disabled. The security concerns are the reason why the
+    default is still disabled, but future versions might/will likely upgrade
+    to the newer RFC as default format, so you are advised to check your
+    implementation and/or override the default with "->allow_nonref (0)" to
+    ensure that future versions are safe.
+
 INTEROPERABILITY WITH OTHER MODULES
     "JSON::XS" uses the Types::Serialiser module to provide boolean
     constants. That means that the JSON true and false values will be
     comaptible to true and false values of iother modules that do the same,
     such as JSON::PP and CBOR::XS.
 
+INTEROPERABILITY WITH OTHER JSON DECODERS
+    As long as you only serialise data that can be directly expressed in
+    JSON, "JSON::XS" is incapable of generating invalid JSON output (modulo
+    bugs, but "JSON::XS" has found more bugs in the official JSON testsuite
+    (1) than the official JSON testsuite has found in "JSON::XS" (0)).
+
+    When you have trouble decoding JSON generated by this module using other
+    decoders, then it is very likely that you have an encoding mismatch or
+    the other decoder is broken.
+
+    When decoding, "JSON::XS" is strict by default and will likely catch all
+    errors. There are currently two settings that change this: "relaxed"
+    makes "JSON::XS" accept (but not generate) some non-standard extensions,
+    and "allow_tags" will allow you to encode and decode Perl objects, at
+    the cost of not outputting valid JSON anymore.
+
+  TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS
+    When you use "allow_tags" to use the extended (and also nonstandard and
+    invalid) JSON syntax for serialised objects, and you still want to
+    decode the generated text with a standard JSON decoder, you can run a
+    regex to replace the tagged syntax by standard JSON arrays (it only
+    works for "normal" package names without commas, newlines or single
+    colons). First, the readable Perl version:
+
+       # if your FREEZE methods return no values, you need this replace first:
+       $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;
+
+       # this works for non-empty constructor arg lists:
+       $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;
+
+    And here is a less readable version that is easy to adapt to other
+    languages:
+
+       $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g;
+
+    Here is an ECMAScript version (same regex):
+
+       json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,");
+
+    Since this syntax converts to standard JSON arrays, it might be hard to
+    distinguish serialised objects from normal arrays. You can prepend a
+    "magic number" as first array element to reduce chances of a collision:
+
+       $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g;
+
+    And after decoding the JSON text, you could walk the data structure
+    looking for arrays with a first element of
+    "XU1peReLzT4ggEllLanBYq4G9VzliwKF".
+
+    The same approach can be used to create the tagged format with another
+    encoder. First, you create an array with the magic string as first
+    member, the classname as second, and constructor arguments last, encode
+    it as part of your JSON structure, and then:
+
+       $json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
+
+    Again, this has some limitations - the magic string must not be encoded
+    with character escapes, and the constructor arguments must be non-empty.
+ +RFC7159 + Since this module was written, Google has written a new JSON RFC, RFC + 7159 (and RFC7158). Unfortunately, this RFC breaks compatibility with + both the original JSON specification on www.json.org and RFC4627. + + As far as I can see, you can get partial compatibility when parsing by + using "->allow_nonref". However, consider thew security implications of + doing so. + + I haven't decided yet when to break compatibility with RFC4627 by + default (and potentially leave applications insecure) and change the + default to follow RFC7159, but application authors are well advised to + call "->allow_nonref(0)" even if this is the current default, if they + cannot handle non-reference values, in preparation for the day when the4 + default will change. + THREADS This module is *not* guaranteed to be thread safe and there are no plans to change this until Perl gets thread support (as opposed to the