--- JSON-XS/XS.pm 2010/01/06 08:02:18 1.126 +++ JSON-XS/XS.pm 2013/10/29 00:21:10 1.149 @@ -66,10 +66,10 @@ =item * round-trip integrity When you serialise a perl data structure using only data types supported -by JSON, the deserialised data structure is identical on the Perl level. -(e.g. the string "2.0" doesn't suddenly become "2" just because it looks -like a number). There minor I exceptions to this, read the MAPPING -section below to learn about those. +by JSON and Perl, the deserialised data structure is identical on the Perl +level. (e.g. the string "2.0" doesn't suddenly become "2" just because +it looks like a number). There I minor exceptions to this, read the +MAPPING section below to learn about those. =item * strict checking of JSON correctness @@ -85,7 +85,7 @@ =item * simple to use This module has both a simple functional interface as well as an object -oriented interface interface. +oriented interface. =item * reasonably versatile output formats @@ -103,24 +103,16 @@ use common::sense; -our $VERSION = '2.27'; +our $VERSION = '3.0'; our @ISA = qw(Exporter); -our @EXPORT = qw(encode_json decode_json to_json from_json); - -sub to_json($) { - require Carp; - Carp::croak ("JSON::XS::to_json has been renamed to encode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call"); -} - -sub from_json($) { - require Carp; - Carp::croak ("JSON::XS::from_json has been renamed to decode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call"); -} +our @EXPORT = qw(encode_json decode_json); use Exporter; use XSLoader; +use Types::Serialiser (); + =head1 FUNCTIONAL INTERFACE The following convenience methods are provided by this module. They are @@ -151,15 +143,6 @@ Except being faster. -=item $is_boolean = JSON::XS::is_bool $scalar - -Returns true if the passed scalar represents either JSON::XS::true or -JSON::XS::false, two constants that act like C<1> and C<0>, respectively -and are used to represent JSON C and C values in Perl. - -See MAPPING, below, for more information on how JSON values are mapped to -Perl. - =back @@ -434,7 +417,8 @@ If C<$enable> is false, then the C method will output key-value pairs in the order Perl stores them (which will likely change between runs -of the same script). +of the same script, and can change even within the same run from 5.18 +onwards). This option is useful if you want the same data structure to be encoded as the same JSON text (given the same overall settings). If it is disabled, @@ -485,26 +469,28 @@ =item $enabled = $json->get_allow_blessed +See L for details. + If C<$enable> is true (or missing), then the C method will not -barf when it encounters a blessed reference. Instead, the value of the -B option will decide whether C (C -disabled or no C method found) or a representation of the -object (C enabled and C method found) is being -encoded. Has no effect on C. +barf when it encounters a blessed reference that it cannot convert +otherwise. Instead, a JSON C value is encoded instead of the object. If C<$enable> is false (the default), then C will throw an -exception when it encounters a blessed object. +exception when it encounters a blessed object that it cannot convert +otherwise. + +This setting has no effect on C. =item $json = $json->convert_blessed ([$enable]) =item $enabled = $json->get_convert_blessed +See L for details. + If C<$enable> is true (or missing), then C, upon encountering a blessed object, will check for the availability of the C method -on the object's class. If found, it will be called in scalar context -and the resulting scalar will be encoded instead of the object. If no -C method is found, the value of C will decide what -to do. +on the object's class. If found, it will be called in scalar context and +the resulting scalar will be encoded instead of the object. The C method may safely call die if it wants. If C returns other blessed objects, those will be handled in the same @@ -514,12 +500,28 @@ usually in upper case letters and to avoid collisions with any C function or method. -This setting does not yet influence C in any way, but in the -future, global hooks might get installed that influence C and are -enabled by this setting. +If C<$enable> is false (the default), then C will not consider +this type of conversion. + +This setting has no effect on C. + +=item $json = $json->allow_tags ([$enable]) -If C<$enable> is false, then the C setting will decide what -to do when a blessed object is found. +=item $enabled = $json->allow_tags + +See L for details. + +If C<$enable> is true (or missing), then C, upon encountering a +blessed object, will check for the availability of the C method on +the object's class. If found, it will be used to serialise the object into +a nonstandard tagged JSON value (that JSON decoders cannot decode). + +It also causes C to parse such tagged JSON values and deserialise +them via a call to the C method. + +If C<$enable> is false (the default), then C will not consider +this type of conversion, and tagged JSON values will cause a parse error +in C, as if tags were not part of the grammar. =item $json = $json->filter_json_object ([$coderef->($hashref)]) @@ -668,22 +670,14 @@ =item $json_text = $json->encode ($perl_scalar) -Converts the given Perl data structure (a simple scalar or a reference -to a hash or array) to its JSON representation. Simple scalars will be -converted into JSON string or number sequences, while references to arrays -become JSON arrays and references to hashes become JSON objects. Undefined -Perl values (e.g. C) become JSON C values. Neither C -nor C values will be generated. +Converts the given Perl value or data structure to its JSON +representation. Croaks on error. =item $perl_scalar = $json->decode ($json_text) The opposite of C: expects a JSON text and tries to parse it, returning the resulting simple scalar or reference. Croaks on error. -JSON numbers and strings become simple Perl scalars. JSON arrays become -Perl arrayrefs and JSON objects become Perl hashrefs. C becomes -C<1>, C becomes C<0> and C becomes C. - =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text) This works like the C method, but instead of raising an exception @@ -692,8 +686,7 @@ so far. This is useful if your JSON texts are not delimited by an outer protocol -(which is not the brightest thing to do in the first place) and you need -to know where the JSON text ends. +and you need to know where the JSON text ends. JSON::XS->new->decode_prefix ("[1] the tail") => ([], 3) @@ -715,8 +708,8 @@ JSON::XS will only attempt to parse the JSON text once it is sure it has enough text to get a decisive result, using a very simple but truly incremental parser. This means that it sometimes won't stop as -early as the full parser, for example, it doesn't detect parenthese -mismatches. The only thing it guarantees is that it starts decoding as +early as the full parser, for example, it doesn't detect mismatched +parentheses. The only thing it guarantees is that it starts decoding as soon as a syntactically valid JSON text has been seen. This means you need to set resource limits (e.g. C) to ensure the parser will stop parsing in the presence if syntax errors. @@ -742,7 +735,7 @@ exactly I JSON object. If that is successful, it will return this object, otherwise it will return C. If there is a parse error, this method will croak just as C would do (one can then use -C to skip the errornous part). This is the most common way of +C to skip the erroneous part). This is the most common way of using the method. And finally, in list context, it will try to extract as many objects @@ -753,6 +746,11 @@ case. Note that in this case, any previously-parsed JSON texts will be lost. +Example: Parse some JSON arrays/objects in a given string and return +them. + + my @objs = JSON::XS->new->incr_parse ("[5][7][1,2]"); + =item $lvalue_string = $json->incr_text This method returns the currently stored JSON fragment as an lvalue, that @@ -776,7 +774,7 @@ parse state. The difference to C is that only text until the parse error -occured is removed. +occurred is removed. =item $json->incr_reset @@ -792,10 +790,10 @@ =head2 LIMITATIONS All options that affect decoding are supported, except -C. The reason for this is that it cannot be made to -work sensibly: JSON objects and arrays are self-delimited, i.e. you can concatenate -them back to back and still decode them perfectly. This does not hold true -for JSON numbers, however. +C. The reason for this is that it cannot be made to work +sensibly: JSON objects and arrays are self-delimited, i.e. you can +concatenate them back to back and still decode them perfectly. This does +not hold true for JSON numbers, however. For example, is the string C<1> a single JSON number, or is it simply the start of C<12>? Or is C<12> a single JSON number, or the concatenation @@ -984,24 +982,45 @@ a numeric (floating point) value if that is possible without loss of precision. Otherwise it will preserve the number as a string value (in which case you lose roundtripping ability, as the JSON number will be -re-encoded toa JSON string). +re-encoded to a JSON string). Numbers containing a fractional or exponential part will always be represented as numeric (floating point) values, possibly at a loss of precision (in which case you might lose perfect roundtripping ability, but the JSON number will still be re-encoded as a JSON number). +Note that precision is not accuracy - binary floating point values cannot +represent most decimal fractions exactly, and when converting from and to +floating point, JSON::XS only guarantees precision up to but not including +the least significant bit. + =item true, false -These JSON atoms become C and C, -respectively. They are overloaded to act almost exactly like the numbers -C<1> and C<0>. You can check whether a scalar is a JSON boolean by using -the C function. +These JSON atoms become C and +C, respectively. They are overloaded to act +almost exactly like the numbers C<1> and C<0>. You can check whether +a scalar is a JSON boolean by using the C +function (after C, of course). =item null A JSON null atom becomes C in Perl. +=item shell-style comments (C<< # I >>) + +As a nonstandard extension to the JSON syntax that is enabled by the +C setting, shell-style comments are allowed. They can start +anywhere outside strings and go till the end of the line. + +=item tagged values (C<< (I)I >>). + +Another nonstandard extension to the JSON syntax, enabled with the +C setting, are tagged values. In this implementation, the +I must be a perl package/class name encoded as a JSON string, and the +I must be a JSON array encoding optional constructor arguments. + +See L, below, for details. + =back @@ -1015,15 +1034,13 @@ =item hash references -Perl hash references become JSON objects. As there is no inherent ordering -in hash keys (or JSON objects), they will usually be encoded in a -pseudo-random order that can change between runs of the same program but -stays generally the same within a single run of a program. JSON::XS can -optionally sort the hash keys (determined by the I flag), so -the same datastructure will serialise to the same JSON text (given same -settings and version of JSON::XS), but this incurs a runtime overhead -and is only rarely useful, e.g. when you want to compare some JSON text -against another for equality. +Perl hash references become JSON objects. As there is no inherent +ordering in hash keys (or JSON objects), they will usually be encoded +in a pseudo-random order. JSON::XS can optionally sort the hash keys +(determined by the I flag), so the same datastructure will +serialise to the same JSON text (given same settings and version of +JSON::XS), but this incurs a runtime overhead and is only rarely useful, +e.g. when you want to compare some JSON text against another for equality. =item array references @@ -1033,23 +1050,26 @@ Other unblessed references are generally not allowed and will cause an exception to be thrown, except for references to the integers C<0> and -C<1>, which get turned into C and C atoms in JSON. You can -also use C and C to improve readability. +C<1>, which get turned into C and C atoms in JSON. - encode_json [\0, JSON::XS::true] # yields [false,true] +Since C uses the boolean model from L, you +can also C and then use C +and C to improve readability. -=item JSON::XS::true, JSON::XS::false + use Types::Serialiser; + encode_json [\0, Types::Serialiser::true] # yields [false,true] -These special values become JSON true and JSON false values, -respectively. You can also use C<\1> and C<\0> directly if you want. +=item Types::Serialiser::true, Types::Serialiser::false + +These special values from the L module become JSON true +and JSON false values, respectively. You can also use C<\1> and C<\0> +directly if you want. =item blessed objects -Blessed objects are not directly representable in JSON. See the -C and C methods on various options on -how to deal with this: basically, you can choose between throwing an -exception, encoding the reference as if it weren't blessed, or provide -your own serialiser method. +Blessed objects are not directly representable in JSON, but C +allows various ways of handling objects. See L, +below, for details. =item simple scalars @@ -1087,8 +1107,117 @@ if you need this capability (but don't forget to explain why it's needed :). +Note that numerical precision has the same meaning as under Perl (so +binary to decimal conversion follows the same rules as in Perl, which +can differ to other languages). Also, your perl interpreter might expose +extensions to the floating point numbers of your platform, such as +infinities or NaN's - these cannot be represented in JSON, and it is an +error to pass those in. + +=back + +=head2 OBJECT SERIALISATION + +As JSON cannot directly represent Perl objects, you have to choose between +a pure JSON representation (without the ability to deserialise the object +automatically again), and a nonstandard extension to the JSON syntax, +tagged values. + +=head3 SERIALISATION + +What happens when C encounters a Perl object depends on the +C, C and C settings, which are +used in this order: + +=over 4 + +=item 1. C is enabled and the object has a C method. + +In this case, C uses the L object +serialisation protocol to create a tagged JSON value, using a nonstandard +extension to the JSON syntax. + +This works by invoking the C method on the object, with the first +argument being the object to serialise, and the second argument being the +constant string C to distinguish it from other serialisers. + +The C method can return any number of values (i.e. zero or +more). These values and the paclkage/classname of the object will then be +encoded as a tagged JSON value in the following format: + + ("classname")[FREEZE return values...] + +For example, the hypothetical C C method might use the +objects C and C members to encode the object: + + sub My::Object::FREEZE { + my ($self, $serialiser) = @_; + + ($self->{type}, $self->{id}) + } + +=item 2. C is enabled and the object has a C method. + +In this case, the C method of the object is invoked in scalar +context. It must return a single scalar that can be directly encoded into +JSON. This scalar replaces the object in the JSON text. + +For example, the following C method will convert all L +objects to JSON strings when serialised. The fatc that these values +originally were L objects is lost. + + sub URI::TO_JSON { + my ($uri) = @_; + $uri->as_string + } + +=item 3. C is enabled. + +The object will be serialised as a JSON null value. + +=item 4. none of the above + +If none of the settings are enabled or the respective methods are missing, +C throws an exception. + =back +=head3 DESERIALISATION + +For deserialisation there are only two cases to consider: either +nonstandard tagging was used, in which case C decides, +or objects cannot be automatically be deserialised, in which +case you can use postprocessing or the C or +C callbacks to get some real objects our of +your JSON. + +This section only considers the tagged value case: I a tagged JSON object +is encountered during decoding and C is disabled, a parse +error will result (as if tagged values were not part of the grammar). + +If C is enabled, C will look up the C method +of the package/classname used during serialisation (it will not attempt +to load the package as a Perl module). If there is no such method, the +decoding will fail with an error. + +Otherwise, the C method is invoked with the classname as first +argument, the constant string C as second argument, and all the +values from the JSON array (the values originally returned by the +C method) as remaining arguments. + +The method must then return the object. While technically you can return +any Perl scalar, you might have to enable the C setting to +make that work in all cases, so better return an actual blessed reference. + +As an example, let's implement a C function that regenerates the +C from the C example earlier: + + sub My::Object::THAW { + my ($class, $serialiser, $type, $id) = @_; + + $class->new (type => $type, id => $id) + } + =head1 ENCODING/CODESET FLAG NOTES @@ -1122,7 +1251,7 @@ When C is disabled (the default), then C/C generate and expect Unicode strings, that is, characters with high ordinal Unicode values (> 255) will be encoded as such characters, and likewise such -characters are decoded as-is, no canges to them will be done, except +characters are decoded as-is, no changes to them will be done, except "(re-)interpreting" them as Unicode codepoints or Unicode characters, respectively (to Perl, these are the same thing in strings unless you do funny/weird/dumb stuff). @@ -1240,7 +1369,7 @@ Another problem is that some javascript implementations reserve some property names for their own purposes (which probably makes them non-ECMAscript-compliant). For example, Iceweasel reserves the -C<__proto__> property name for it's own purposes. +C<__proto__> property name for its own purposes. If that is a problem, you could parse try to filter the resulting JSON output for these property strings, e.g.: @@ -1248,7 +1377,7 @@ $json =~ s/"__proto__"\s*:/"__proto__renamed":/g; This works because C<__proto__> is not valid outside of strings, so every -occurence of C<"__proto__"\s*:> must be a string used as property name. +occurrence of C<"__proto__"\s*:> must be a string used as property name. If you know of other incompatibilities, please let me know. @@ -1304,10 +1433,10 @@ real compatibility for many I and trying to silence people who point out that it isn't true. -Addendum/2009: the YAML 1.2 spec is still incomaptible with JSON, even -though the incompatibilities have been documented (and are known to -Brian) for many years and the spec makes explicit claims that YAML is a -superset of JSON. It would be so easy to fix, but apparently, bullying and +Addendum/2009: the YAML 1.2 spec is still incompatible with JSON, even +though the incompatibilities have been documented (and are known to Brian) +for many years and the spec makes explicit claims that YAML is a superset +of JSON. It would be so easy to fix, but apparently, bullying people and corrupting userdata is so much easier. =back @@ -1326,49 +1455,48 @@ {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7, - true, false]} + 1, 0]} It shows the number of encodes/decodes per second (JSON::XS uses the functional interface, while JSON::XS/2 uses the OO interface with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables -shrink). Higher is better: +shrink. JSON::DWIW/DS uses the deserialise function, while JSON::DWIW::FJ +uses the from_json method). Higher is better: - module | encode | decode | - -----------|------------|------------| - JSON 1.x | 4990.842 | 4088.813 | - JSON::DWIW | 51653.990 | 71575.154 | - JSON::PC | 65948.176 | 74631.744 | - JSON::PP | 8931.652 | 3817.168 | - JSON::Syck | 24877.248 | 27776.848 | - JSON::XS | 388361.481 | 227951.304 | - JSON::XS/2 | 227951.304 | 218453.333 | - JSON::XS/3 | 338250.323 | 218453.333 | - Storable | 16500.016 | 135300.129 | - -----------+------------+------------+ - -That is, JSON::XS is about five times faster than JSON::DWIW on encoding, -about three times faster on decoding, and over forty times faster -than JSON, even with pretty-printing and key sorting. It also compares -favourably to Storable for small amounts of data. + module | encode | decode | + --------------|------------|------------| + JSON::DWIW/DS | 86302.551 | 102300.098 | + JSON::DWIW/FJ | 86302.551 | 75983.768 | + JSON::PP | 15827.562 | 6638.658 | + JSON::Syck | 63358.066 | 47662.545 | + JSON::XS | 511500.488 | 511500.488 | + JSON::XS/2 | 291271.111 | 388361.481 | + JSON::XS/3 | 361577.931 | 361577.931 | + Storable | 66788.280 | 265462.278 | + --------------+------------+------------+ + +That is, JSON::XS is almost six times faster than JSON::DWIW on encoding, +about five times faster on decoding, and over thirty to seventy times +faster than JSON's pure perl implementation. It also compares favourably +to Storable for small amounts of data. Using a longer test string (roughly 18KB, generated from Yahoo! Locals search API (L). - module | encode | decode | - -----------|------------|------------| - JSON 1.x | 55.260 | 34.971 | - JSON::DWIW | 825.228 | 1082.513 | - JSON::PC | 3571.444 | 2394.829 | - JSON::PP | 210.987 | 32.574 | - JSON::Syck | 552.551 | 787.544 | - JSON::XS | 5780.463 | 4854.519 | - JSON::XS/2 | 3869.998 | 4798.975 | - JSON::XS/3 | 5862.880 | 4798.975 | - Storable | 4445.002 | 5235.027 | - -----------+------------+------------+ + module | encode | decode | + --------------|------------|------------| + JSON::DWIW/DS | 1647.927 | 2673.916 | + JSON::DWIW/FJ | 1630.249 | 2596.128 | + JSON::PP | 400.640 | 62.311 | + JSON::Syck | 1481.040 | 1524.869 | + JSON::XS | 20661.596 | 9541.183 | + JSON::XS/2 | 10683.403 | 9416.938 | + JSON::XS/3 | 20661.596 | 9400.054 | + Storable | 19765.806 | 10000.725 | + --------------+------------+------------+ Again, JSON::XS leads by far (except for Storable which non-surprisingly -decodes faster). +decodes a bit faster). On large strings containing lots of high Unicode characters, some modules (such as JSON::PC) seem to decode faster than JSON::XS, but the result @@ -1414,11 +1542,19 @@ If you are using JSON::XS to return packets to consumption by JavaScript scripts in a browser you should have a look at -L to see whether -you are vulnerable to some common attack vectors (which really are browser -design bugs, but it is still you who will have to deal with it, as major -browser developers care only for features, not about getting security -right). +L to +see whether you are vulnerable to some common attack vectors (which really +are browser design bugs, but it is still you who will have to deal with +it, as major browser developers care only for features, not about getting +security right). + + +=head1 INTEROPERABILITY WITH OTHER MODULES + +C uses the L module to provide boolean +constants. That means that the JSON true and false values will be +comaptible to true and false values of iother modules that do the same, +such as L and L. =head1 THREADS @@ -1431,6 +1567,24 @@ (It might actually work, but you have been warned). +=head1 THE PERILS OF SETLOCALE + +Sometimes people avoid the Perl locale support and directly call the +system's setlocale function with C. + +This breaks both perl and modules such as JSON::XS, as stringification of +numbers no longer works correctly (e.g. C<$x = 0.1; print "$x"+1> might +print C<1>, and JSON::XS might output illegal JSON as JSON::XS relies on +perl to stringify numbers). + +The solution is simple: don't call C, or use it for only those +categories you need, such as C or C. + +If you need C, you should enable it only around the code that +actually needs it (avoiding stringification of numbers), and restore it +afterwards. + + =head1 BUGS While the goal of this module is to be correct, that unfortunately does @@ -1442,29 +1596,18 @@ =cut -our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" }; -our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" }; - -sub true() { $true } -sub false() { $false } +BEGIN { + *true = \$Types::Serialiser::true; + *true = \&Types::Serialiser::true; + *false = \$Types::Serialiser::false; + *false = \&Types::Serialiser::false; + *is_bool = \&Types::Serialiser::is_bool; -sub is_bool($) { - UNIVERSAL::isa $_[0], "JSON::XS::Boolean" -# or UNIVERSAL::isa $_[0], "JSON::Literal" + *JSON::XS::Boolean:: = *Types::Serialiser::Boolean::; } XSLoader::load "JSON::XS", $VERSION; -package JSON::XS::Boolean; - -use overload - "0+" => sub { ${$_[0]} }, - "++" => sub { $_[0] = ${$_[0]} + 1 }, - "--" => sub { $_[0] = ${$_[0]} - 1 }, - fallback => 1; - -1; - =head1 SEE ALSO The F command line utility for quick experiments. @@ -1476,3 +1619,5 @@ =cut +1 +