--- JSON-XS/README 2007/03/24 19:42:14 1.6
+++ JSON-XS/README 2007/04/04 00:01:44 1.10
@@ -4,12 +4,17 @@
 SYNOPSIS
     use JSON::XS;

-    # exported functions, croak on error
+    # exported functions, they croak on error
+    # and expect/generate UTF-8
     $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
     $perl_hash_or_arrayref  = from_json $utf8_encoded_json_text;

-    # oo-interface
+    # objToJson and jsonToObj aliases to to_json and from_json
+    # are exported for compatibility with the JSON module,
+    # but should not be used in new code.
+
+    # OO-interface

     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
@@ -32,14 +37,15 @@
     vice versa.

 FEATURES
-    * correct handling of unicode issues
+    * correct unicode handling
       This module knows how to handle Unicode, and even documents how and
       when it does so.

     * round-trip integrity
       When you serialise a perl data structure using only datatypes
       supported by JSON, the deserialised data structure is identical on
-      the Perl level. (e.g. the string "2.0" doesn't suddenly become "2").
+      the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"
+      just because it looks like a number).

     * strict checking of JSON correctness
       There is no guessing, no generating of illegal JSON texts by
@@ -57,9 +63,10 @@
     * reasonably versatile output formats
       You can choose between the most compact guaranteed single-line
       format possible (nice for simple line-based protocols), a pure-ascii
-      format (for when your transport is not 8-bit clean), or a
-      pretty-printed format (for when you want to read that stuff). Or you
-      can combine those features in whatever way you like.
+      format (for when your transport is not 8-bit clean, still supports
+      the whole unicode range), or a pretty-printed format (for when you
+      want to read that stuff). Or you can combine those features in
+      whatever way you like.

 FUNCTIONAL INTERFACE
     The following convenience methods are provided by this module. They are
@@ -240,7 +247,12 @@
       many short strings. It will also try to downgrade any strings to
       octet-form if possible: perl stores strings internally either in an
       encoding called UTF-X or in octet-form. The latter cannot store
-      everything but uses less space in general.
+      everything but uses less space in general (and some buggy Perl or C
+      code might even rely on that internal representation being used).
+
+      The actual definition of what shrink does might change in future
+      versions, but it will always try to save space at the expense of
+      time.

       If $enable is true (or missing), the string returned by "encode" will
       be shrunk-to-fit, while all strings generated by "decode" will
@@ -254,6 +266,26 @@
       or floats internally (there is no difference on the Perl level),
       saving space.

+    $json = $json->max_depth ([$maximum_nesting_depth])
+        Sets the maximum nesting level (default 512) accepted while encoding
+        or decoding. If the JSON text or Perl data structure has an equal or
+        higher nesting level than this limit, then the encoder and decoder
+        will stop and croak at that point.
+
+        Nesting level is defined by the number of hash- or arrayrefs that
+        the encoder needs to traverse to reach a given point, or the number
+        of "{" or "[" characters without their matching closing parenthesis
+        crossed to reach a given character in a string.
+
+        Setting the maximum depth to one disallows any nesting, which
+        ensures that the object is only a single hash/object or array.
+
+        The argument to "max_depth" will be rounded up to the next highest
+        power of two.
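         For example, a decoder that accepts only shallow data could be set
         up as follows (an illustrative sketch; the limit of 4 and the
         variable $untrusted_json are arbitrary):

             use JSON::XS;

             # croak on texts that nest four or more levels deep
             my $coder = JSON::XS->new->max_depth (4);

             my $data = eval { $coder->decode ($untrusted_json) };
             if ($@) {
                 # over-deep input (and any other error) makes the decoder
                 # croak, so we end up here instead of building a huge
                 # structure
                 warn "rejecting JSON text: $@";
             }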
+
+        See SECURITY CONSIDERATIONS, below, for more info on why this is
+        useful.
+
     $json_text = $json->encode ($perl_scalar)
         Converts the given Perl data structure (a simple scalar or a
         reference to a hash or array) to its JSON representation. Simple
@@ -319,17 +351,28 @@
     hash references
         Perl hash references become JSON objects. As there is no inherent
-        ordering in hash keys, they will usually be encoded in a
-        pseudo-random order that can change between runs of the same program
-        but stays generally the same within a single run of a program.
-        JSON::XS can optionally sort the hash keys (determined by the
-        *canonical* flag), so the same datastructure will serialise to the
-        same JSON text (given same settings and version of JSON::XS), but
-        this incurs a runtime overhead.
+        ordering in hash keys (or JSON objects), they will usually be
+        encoded in a pseudo-random order that can change between runs of the
+        same program but stays generally the same within a single run of a
+        program. JSON::XS can optionally sort the hash keys (determined by
+        the *canonical* flag), so the same datastructure will serialise to
+        the same JSON text (given same settings and version of JSON::XS),
+        but this incurs a runtime overhead and is only rarely useful, e.g.
+        when you want to compare some JSON text against another for
+        equality.

     array references
         Perl array references become JSON arrays.

+    other references
+        Other unblessed references are generally not allowed and will cause
+        an exception to be thrown, except for references to the integers 0
+        and 1, which get turned into "false" and "true" atoms in JSON. You
+        can also use "JSON::XS::false" and "JSON::XS::true" to improve
+        readability.
+
+           to_json [\0,JSON::XS::true]      # yields [false,true]
+
     blessed objects
         Blessed objects are not allowed. JSON::XS currently tries to encode
         their underlying representation (hash- or arrayref), but this
@@ -370,9 +413,6 @@
         You can not currently output JSON booleans or force the type in
         other, less obscure, ways. Tell me if you need this capability.

-    circular data structures
-        Those will be encoded until memory or stackspace runs out.
-
 COMPARISON
     As already mentioned, this module was created because none of the
     existing JSON modules could be made to work correctly. First I will
@@ -459,22 +499,26 @@
     system.

     First comes a comparison between various modules using a very short JSON
-    string (83 bytes), showing the number of encodes/decodes per second
-    (JSON::XS is the functional interface, while JSON::XS/2 is the OO
-    interface with pretty-printing and hashkey sorting enabled). Higher is
-    better:
+    string:
+
+       {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
+
+    It shows the number of encodes/decodes per second (JSON::XS uses the
+    functional interface, while JSON::XS/2 uses the OO interface with
+    pretty-printing and hashkey sorting enabled).
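     The two JSON::XS rows roughly correspond to the following call styles
     (a sketch of the configurations only, not the actual benchmark script;
     $data stands for the decoded test structure):

         use JSON::XS;

         # "JSON::XS": plain functional interface
         my $json_text = to_json $data;
         my $copy      = from_json $json_text;

         # "JSON::XS/2": OO interface, pretty-printing and sorted
         # (canonical) hash keys enabled
         my $coder  = JSON::XS->new->pretty->canonical;
         my $pretty = $coder->encode ($data);
         my $copy2  = $coder->decode ($pretty);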
+    Higher is better:

     module     |     encode |     decode |
     -----------|------------|------------|
-    JSON       |      14006 |       6820 |
-    JSON::DWIW |     200937 |     120386 |
-    JSON::PC   |      85065 |     129366 |
-    JSON::Syck |      59898 |      44232 |
-    JSON::XS   |    1171478 |     342435 |
-    JSON::XS/2 |     730760 |     328714 |
+    JSON       |  11488.516 |   7823.035 |
+    JSON::DWIW |  94708.054 | 129094.260 |
+    JSON::PC   |  63884.157 | 128528.212 |
+    JSON::Syck |  34898.677 |  42096.911 |
+    JSON::XS   | 654027.064 | 396423.669 |
+    JSON::XS/2 | 371564.190 | 371725.613 |
     -----------+------------+------------+

-    That is, JSON::XS is 6 times faster than than JSON::DWIW and about 80
+    That is, JSON::XS is more than six times faster than JSON::DWIW on
+    encoding, more than three times faster on decoding, and about thirty
     times faster than JSON, even with pretty-printing and key sorting.

     Using a longer test string (roughly 18KB, generated from Yahoo! Locals
@@ -482,34 +526,55 @@
     module     |     encode |     decode |
     -----------|------------|------------|
-    JSON       |        673 |         38 |
-    JSON::DWIW |       5271 |        770 |
-    JSON::PC   |       9901 |       2491 |
-    JSON::Syck |       2360 |        786 |
-    JSON::XS   |      37398 |       3202 |
-    JSON::XS/2 |      13765 |       3153 |
+    JSON       |    273.023 |     44.674 |
+    JSON::DWIW |   1089.383 |   1145.704 |
+    JSON::PC   |   3097.419 |   2393.921 |
+    JSON::Syck |    514.060 |    843.053 |
+    JSON::XS   |   6479.668 |   3636.364 |
+    JSON::XS/2 |   3774.221 |   3599.124 |
     -----------+------------+------------+

-    Again, JSON::XS leads by far in the encoding case, while still beating
-    every other module in the decoding case.
+    Again, JSON::XS leads by far.

-    On large strings containing lots of unicode characters, some modules
-    (such as JSON::PC) decode faster than JSON::XS, but the result will be
-    broken due to missing unicode handling. Others refuse to decode or
-    encode properly, so it was impossible to prepare a fair comparison table
-    for that case.
-
-RESOURCE LIMITS
-    JSON::XS does not impose any limits on the size of JSON texts or Perl
-    values they represent - if your machine can handle it, JSON::XS will
-    encode or decode it. Future versions might optionally impose structure
-    depth and memory use resource limits.
+    On large strings containing lots of high unicode characters, some
+    modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
+    result will be broken due to missing (or wrong) unicode handling. Others
+    refuse to decode or encode properly, so it was impossible to prepare a
+    fair comparison table for that case.
+
+SECURITY CONSIDERATIONS
+    When you are using JSON in a protocol, talking to untrusted, potentially
+    hostile creatures requires relatively few measures.
+
+    First of all, your JSON decoder should be secure, that is, should not
+    have any buffer overflows. Obviously, this module should ensure that and
+    I am trying hard to make that true, but you never know.
+
+    Second, you need to avoid resource-starving attacks. That means you
+    should limit the size of JSON texts you accept, or make sure that when
+    your resources run out, that's just fine (e.g. by using a separate
+    process that can crash safely). The size of a JSON text in octets or
+    characters is usually a good indication of the size of the resources
+    required to decode it into a Perl structure.
+
+    Third, JSON::XS recurses using the C stack when decoding objects and
+    arrays. The C stack is a limited resource: for instance, on my amd64
+    machine with 8MB of stack size I can decode around 180k nested arrays
+    but only 14k nested JSON objects (due to perl itself recursing deeply
+    on croak to free the temporary).
+    If that is exceeded, the program crashes. To be conservative, the
+    default nesting limit is set to 512. If your process has a smaller
+    stack, you should adjust this setting accordingly with the "max_depth"
+    method.
+
+    And last but not least, something else could bomb you that I forgot to
+    think of. In that case, you get to keep the pieces. I am always open
+    for hints, though...

 BUGS
     While the goal of this module is to be correct, that unfortunately does
     not mean it's bug-free, only that I think its design is bug-free. It is
-    still very young and not well-tested. If you keep reporting bugs they
-    will be fixed swiftly, though.
+    still relatively early in its development. If you keep reporting bugs
+    they will be fixed swiftly, though.

 AUTHOR
     Marc Lehmann