--- JSON-XS/README 2007/03/29 02:45:49 1.9 +++ JSON-XS/README 2007/11/13 22:59:08 1.20 @@ -1,6 +1,10 @@ NAME JSON::XS - JSON serialising/deserialising, done correctly and fast + JSON::XS - 正しくて高速な JSON + シリアライザ/デシリアライザ + (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html) + SYNOPSIS use JSON::XS; @@ -10,10 +14,6 @@ $utf8_encoded_json_text = to_json $perl_hash_or_arrayref; $perl_hash_or_arrayref = from_json $utf8_encoded_json_text; - # objToJson and jsonToObj aliases to to_json and from_json - # are exported for compatibility to the JSON module, - # but should not be used in new code. - # OO-interface $coder = JSON::XS->new->ascii->pretty->allow_nonref; @@ -37,7 +37,7 @@ vice versa. FEATURES - * correct unicode handling + * correct Unicode handling This module knows how to handle Unicode, and even documents how and when it does so. @@ -61,21 +61,20 @@ interface. * reasonably versatile output formats - You can choose between the most compact guarenteed single-line + You can choose between the most compact guaranteed single-line format possible (nice for simple line-based protocols), a pure-ascii format (for when your transport is not 8-bit clean, still supports - the whole unicode range), or a pretty-printed format (for when you + the whole Unicode range), or a pretty-printed format (for when you want to read that stuff). Or you can combine those features in whatever way you like. FUNCTIONAL INTERFACE - The following convinience methods are provided by this module. They are + The following convenience methods are provided by this module. They are exported by default: $json_text = to_json $perl_scalar - Converts the given Perl data structure (a simple scalar or a - reference to a hash or array) to a UTF-8 encoded, binary string - (that is, the string contains octets only). Croaks on error. + Converts the given Perl data structure to a UTF-8 encoded, binary + string (that is, the string contains octets only). Croaks on error. 
This function call is functionally identical to: @@ -86,7 +85,7 @@ $perl_scalar = from_json $json_text The opposite of "to_json": expects an UTF-8 (binary) string and tries to parse that as an UTF-8 encoded JSON text, returning the - resulting simple scalar or reference. Croaks on error. + resulting reference. Croaks on error. This function call is functionally identical to: @@ -94,6 +93,54 @@ except being faster. + $is_boolean = JSON::XS::is_bool $scalar + Returns true if the passed scalar represents either JSON::XS::true + or JSON::XS::false, two constants that act like 1 and 0, + respectively and are used to represent JSON "true" and "false" + values in Perl. + + See MAPPING, below, for more information on how JSON values are + mapped to Perl. + +A FEW NOTES ON UNICODE AND PERL + Since this often leads to confusion, here are a few very clear words on + how Unicode works in Perl, modulo bugs. + + 1. Perl strings can store characters with ordinal values > 255. + This enables you to store Unicode characters as single characters in + a Perl string - very natural. + + 2. Perl does *not* associate an encoding with your strings. + Unless you force it to, e.g. when matching it against a regex, or + printing the scalar to a file, in which case Perl either interprets + your string as locale-encoded text, octets/binary, or as Unicode, + depending on various settings. In no case is an encoding stored + together with your data, it is *use* that decides encoding, not any + magical metadata. + + 3. The internal utf-8 flag has no meaning with regards to the encoding + of your string. + Just ignore that flag unless you debug a Perl bug, a module written + in XS or want to dive into the internals of perl. Otherwise it will + only confuse you, as, despite the name, it says nothing about how + your string is encoded. You can have Unicode strings with that flag + set, with that flag clear, and you can have binary data with that + flag set and that flag clear. 
Other possibilities exist, too. + + If you didn't know about that flag, so much the better: pretend it + doesn't exist. + + 4. A "Unicode String" is simply a string where each character can be + validly interpreted as a Unicode codepoint. + If you have UTF-8 encoded data, it is no longer a Unicode string, + but a Unicode string encoded in UTF-8, giving you a binary string. + + 5. A string containing "high" (> 255) character values is *not* a UTF-8 + string. + It's a fact. Learn to live with it. + + I hope this helps :) + OBJECT-ORIENTED INTERFACE The object oriented interface lets you configure your own encoding or decoding style, within the limits of supported formats. @@ -112,17 +159,49 @@ $json = $json->ascii ([$enable]) If $enable is true (or missing), then the "encode" method will not generate characters outside the code range 0..127 (which is ASCII). - Any unicode characters outside that range will be escaped using + Any Unicode characters outside that range will be escaped using either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLL - escape sequence, as per RFC4627. + escape sequence, as per RFC4627. The resulting encoded JSON text can + be treated as a native Unicode string, an ascii-encoded, + latin1-encoded or UTF-8 encoded string, or any other superset of + ASCII. If $enable is false, then the "encode" method will not escape - Unicode characters unless required by the JSON syntax. This results - in a faster and more compact format. + Unicode characters unless required by the JSON syntax or other + flags. This results in a faster and more compact format. + + The main use for this flag is to produce JSON texts that can be + transmitted over a 7-bit channel, as the encoded JSON texts will not + contain any 8 bit characters.
 JSON::XS->new->ascii (1)->encode ([chr 0x10401]) => ["\ud801\udc01"] + $json = $json->latin1 ([$enable]) + If $enable is true (or missing), then the "encode" method will + encode the resulting JSON text as latin1 (or iso-8859-1), escaping + any characters outside the code range 0..255. The resulting string + can be treated as a latin1-encoded JSON text or a native Unicode + string. The "decode" method will not be affected in any way by this + flag, as "decode" by default expects Unicode, which is a strict + superset of latin1. + + If $enable is false, then the "encode" method will not escape + Unicode characters unless required by the JSON syntax or other + flags. + + The main use for this flag is efficiently encoding binary data as + JSON text, as most octets will not be escaped, resulting in a + smaller encoded size. The disadvantage is that the resulting JSON + text is encoded in latin1 (and must correctly be treated as such + when storing and transferring), a rare encoding for JSON. It is + therefore most useful when you want to store data structures known + to contain binary data efficiently in files or databases, not when + talking to other JSON encoders/decoders. + + JSON::XS->new->latin1->encode (["\x{89}\x{abc}"]) + => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not) + $json = $json->utf8 ([$enable]) If $enable is true (or missing), then the "encode" method will encode the JSON result into UTF-8, as required by many protocols, @@ -134,8 +213,8 @@ described in RFC4627. If $enable is false, then the "encode" method will return the JSON - string as a (non-encoded) unicode string, while "decode" expects - thus a unicode string. Any decoding or encoding (e.g. to UTF-8 or + string as a (non-encoded) Unicode string, while "decode" expects + thus a Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs to be done yourself, e.g. using the Encode module.
Example, output UTF-16BE-encoded JSON: @@ -167,11 +246,11 @@ $json = $json->indent ([$enable]) If $enable is true (or missing), then the "encode" method will use a multiline format as output, putting every array member or - object/hash key-value pair into its own line, identing them + object/hash key-value pair into its own line, indenting them properly. If $enable is false, no newlines or indenting will be produced, and - the resulting JSON text is guarenteed not to contain any "newlines". + the resulting JSON text is guaranteed not to contain any "newlines". This setting has no effect when decoding JSON texts. @@ -205,6 +284,45 @@ {"key": "value"} + $json = $json->relaxed ([$enable]) + If $enable is true (or missing), then "decode" will accept some + extensions to normal JSON syntax (see below). "encode" will not be + affected in any way. *Be aware that this option makes you accept + invalid JSON texts as if they were valid!*. I suggest only to use + this option to parse application-specific files written by humans + (configuration files, resource files etc.) + + If $enable is false (the default), then "decode" will only accept + valid JSON texts. + + Currently accepted extensions are: + + * list items can have an end-comma + JSON *separates* array elements and key-value pairs with commas. + This can be annoying if you write JSON texts manually and want + to be able to quickly append elements, so this extension accepts + comma at the end of such items not just between them: + + [ + 1, + 2, <- this comma not normally allowed + ] + { + "k1": "v1", + "k2": "v2", <- this comma not normally allowed + } + + * shell-style '#'-comments + Whenever JSON allows whitespace, shell-style comments are + additionally allowed. They are terminated by the first + carriage-return or line-feed character, after which more + white-space and comments are allowed. + + [ + 1, # this comment not allowed in JSON + # neither this one...
+ ] + $json = $json->canonical ([$enable]) If $enable is true (or missing), then the "encode" method will output JSON objects by sorting their keys. This is adding a @@ -216,7 +334,7 @@ This option is useful if you want the same data structure to be encoded as the same JSON text (given the same overall settings). If - it is disabled, the same hash migh be encoded differently even if + it is disabled, the same hash might be encoded differently even if it contains the same data, as key-value pairs have no inherent ordering in Perl. @@ -239,6 +357,116 @@ JSON::XS->new->allow_nonref->encode ("Hello, World!") => "Hello, World!" + $json = $json->allow_blessed ([$enable]) + If $enable is true (or missing), then the "encode" method will not + barf when it encounters a blessed reference. Instead, the value of + the convert_blessed option will decide whether "null" + ("convert_blessed" disabled or no "TO_JSON" method found) or a + representation of the object ("convert_blessed" enabled and + "TO_JSON" method found) is being encoded. Has no effect on "decode". + + If $enable is false (the default), then "encode" will throw an + exception when it encounters a blessed object. + + $json = $json->convert_blessed ([$enable]) + If $enable is true (or missing), then "encode", upon encountering a + blessed object, will check for the availability of the "TO_JSON" + method on the object's class. If found, it will be called in scalar + context and the resulting scalar will be encoded instead of the + object. If no "TO_JSON" method is found, the value of + "allow_blessed" will decide what to do. + + The "TO_JSON" method may safely call die if it wants. If "TO_JSON" + returns other blessed objects, those will be handled in the same + way. "TO_JSON" must take care of not causing an endless recursion + cycle (== crash) in this case.
The name of "TO_JSON" was chosen + because other methods called by the Perl core (== not by the user of + the object) are usually in upper case letters and to avoid + collisions with the "to_json" function. + + This setting does not yet influence "decode" in any way, but in the + future, global hooks might get installed that influence "decode" and + are enabled by this setting. + + If $enable is false, then the "allow_blessed" setting will decide + what to do when a blessed object is found. + + $json = $json->filter_json_object ([$coderef->($hashref)]) + When $coderef is specified, it will be called from "decode" each + time it decodes a JSON object. The only argument is a reference to + the newly-created hash. If the code reference returns a single + scalar (which need not be a reference), this value (i.e. a copy of + that scalar to avoid aliasing) is inserted into the deserialised + data structure. If it returns an empty list (NOTE: *not* "undef", + which is a valid scalar), the original deserialised hash will be + inserted. This setting can slow down decoding considerably. + + When $coderef is omitted or undefined, any existing callback will be + removed and "decode" will not change the deserialised hash in any + way. + + Example, convert all JSON objects into the integer 5: + + my $js = JSON::XS->new->filter_json_object (sub { 5 }); + # returns [5] + $js->decode ('[{}]'); + # throws an exception because allow_nonref is not enabled + # so a lone 5 is not allowed. + $js->decode ('{"a":1, "b":2}'); + + $json = $json->filter_json_single_key_object ($key [=> + $coderef->($value)]) + Works remotely similar to "filter_json_object", but is only called + for JSON objects having a single key named $key. + + This $coderef is called before the one specified via + "filter_json_object", if any. It gets passed the single value in the + JSON object. If it returns a single value, it will be inserted into + the data structure.
If it returns nothing (not even "undef" but the + empty list), the callback from "filter_json_object" will be called + next, as if no single-key callback were specified. + + If $coderef is omitted or undefined, the corresponding callback will + be disabled. There can only ever be one callback for a given key. + + As this callback gets called less often than the + "filter_json_object" one, decoding speed will not usually suffer as + much. Therefore, single-key objects make excellent targets to + serialise Perl objects into, especially as single-key JSON objects + are as close to the type-tagged value concept as JSON gets (it's + basically an ID/VALUE tuple). Of course, JSON does not support this + in any way, so you need to make sure your data never looks like a + serialised Perl hash. + + Typical names for the single object key are "__class_whatever__", or + "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or even + things like "__class_md5sum(classname)__", to reduce the risk of + clashing with real hashes. + + Example, decode JSON objects of the form "{ "__widget__" => }" + into the corresponding $WIDGET{} object: + + # return whatever is in $WIDGET{5}: + JSON::XS + ->new + ->filter_json_single_key_object (__widget__ => sub { + $WIDGET{ $_[0] } + }) + ->decode ('{"__widget__": 5}') + + # this can be used with a TO_JSON method in some "widget" class + # for serialisation to json: + sub WidgetBase::TO_JSON { + my ($self) = @_; + + unless ($self->{id}) { + $self->{id} = ..get..some..id..; + $WIDGET{$self->{id}} = $self; + } + + { __widget__ => $self->{id} } + } + $json = $json->shrink ([$enable]) Perl usually over-allocates memory a bit when allocating space for strings. This flag optionally resizes strings generated by either @@ -267,10 +495,10 @@ saving space. $json = $json->max_depth ([$maximum_nesting_depth]) - Sets the maximum nesting level (default 4096) accepted while - encoding or decoding.
If the JSON text or Perl data structure has an - equal or higher nesting level then this limit, then the encoder and - decoder will stop and croak at that point. + Sets the maximum nesting level (default 512) accepted while encoding + or decoding. If the JSON text or Perl data structure has an equal or + higher nesting level than this limit, then the encoder and decoder + will stop and croak at that point. Nesting level is defined by number of hash- or arrayrefs that the encoder needs to traverse to reach a given point or the number of @@ -280,8 +508,24 @@ Setting the maximum depth to one disallows any nesting, so that ensures that the object is only a single hash/object or array. - The argument to "max_depth" will be rounded up to the next nearest - power of two. + The argument to "max_depth" will be rounded up to the next highest + power of two. If no argument is given, the highest possible setting + will be used, which is rarely useful. + + See SECURITY CONSIDERATIONS, below, for more info on why this is + useful. + + $json = $json->max_size ([$maximum_string_size]) + Set the maximum length a JSON text may have (in bytes) where + decoding is being attempted. The default is 0, meaning no limit. + When "decode" is called on a string longer than this number of + characters it will not attempt to decode the string but throw an + exception. This setting has no effect on "encode" (yet). + + The argument to "max_size" will be rounded up to the next highest + power of two (so may be more than requested). If no argument is + given, the limit check will be deactivated (same as when 0 is + specified). See SECURITY CONSIDERATIONS, below, for more info on why this is useful. @@ -303,6 +547,19 @@ become Perl arrayrefs and JSON objects become Perl hashrefs. "true" becomes 1, "false" becomes 0 and "null" becomes "undef".
+ ($perl_scalar, $characters) = $json->decode_prefix ($json_text) + This works like the "decode" method, but instead of raising an + exception when there is trailing garbage after the first JSON + object, it will silently stop parsing there and return the number of + characters consumed so far. + + This is useful if your JSON texts are not delimited by an outer + protocol (which is not the brightest thing to do in the first place) + and you need to know where the JSON text ends. + + JSON::XS->new->decode_prefix ("[1] the tail") + => ([1], 3) + MAPPING This section describes how JSON::XS maps Perl values to JSON values and vice versa. These mappings are designed to "do the right thing" in most @@ -310,14 +567,14 @@ (what you put in comes out as something equivalent). For the more enlightened: note that in the following descriptions, - lowercase *perl* refers to the Perl interpreter, while uppcercase *Perl* + lowercase *perl* refers to the Perl interpreter, while uppercase *Perl* refers to the abstract Perl language itself. JSON -> PERL object A JSON object becomes a reference to a hash in Perl. No ordering of - object keys is preserved (JSON does not preserver object key - ordering itself). + object keys is preserved (JSON does not preserve object key ordering + itself). array A JSON array becomes a reference to an array in Perl. @@ -328,18 +585,31 @@ so no manual decoding is necessary. number - A JSON number becomes either an integer or numeric (floating point) - scalar in perl, depending on its range and any fractional parts. On - the Perl level, there is no difference between those as Perl handles - all the conversion details, but an integer may take slightly less - memory and might represent more values exactly than (floating point) - numbers. + A JSON number becomes either an integer, numeric (floating point) or + string scalar in perl, depending on its range and any fractional + parts.
On the Perl level, there is no difference between those as + Perl handles all the conversion details, but an integer may take + slightly less memory and might represent more values exactly than + (floating point) numbers. + + If the number consists of digits only, JSON::XS will try to + represent it as an integer value. If that fails, it will try to + represent it as a numeric (floating point) value if that is possible + without loss of precision. Otherwise it will preserve the number as + a string value. + + Numbers containing a fractional or exponential part will always be + represented as numeric (floating point) values, possibly at a loss + of precision. + + This might create round-tripping problems as numbers might become + strings, but as Perl is typeless there is no other way to do it. true, false - These JSON atoms become 0, 1, respectively. Information is lost in - this process. Future versions might represent those values - differently, but they will be guarenteed to act like these integers - would normally in Perl. + These JSON atoms become "JSON::XS::true" and "JSON::XS::false", + respectively. They are overloaded to act almost exactly like the + numbers 1 and 0. You can check whether a scalar is a JSON boolean by + using the "JSON::XS::is_bool" function. null A JSON null atom becomes "undef" in Perl. @@ -373,6 +643,10 @@ to_json [\0,JSON::XS::true] # yields [false,true] + JSON::XS::true, JSON::XS::false + These special values become JSON true and JSON false values, + respectively. You can also use "\1" and "\0" directly if you want. + blessed objects Blessed objects are not allowed. 
JSON::XS currently tries to encode their underlying representation (hash- or arrayref), but this @@ -397,24 +671,21 @@ # undef becomes null to_json [undef] # yields [null] - You can force the type to be a string by stringifying it: + You can force the type to be a JSON string by stringifying it: my $x = 3.1; # some variable containing a number "$x"; # stringified $x .= ""; # another, more awkward way to stringify print $x; # perl does it for you, too, quite often - You can force the type to be a number by numifying it: + You can force the type to be a JSON number by numifying it: my $x = "3"; # some variable containing a string $x += 0; # numify it, ensuring it will be dumped as a number - $x *= 1; # same thing, the choise is yours. + $x *= 1; # same thing, the choice is yours. - You can not currently output JSON booleans or force the type in - other, less obscure, ways. Tell me if you need this capability. - - circular data structures - Those will be encoded until memory or stackspace runs out. + You can not currently force the type in other, less obscure, ways. + Tell me if you need this capability. COMPARISON As already mentioned, this module was created because none of the @@ -426,12 +697,12 @@ JSON 1.07 Slow (but very portable, as it is written in pure Perl). - Undocumented/buggy Unicode handling (how JSON handles unicode values - is undocumented. One can get far by feeding it unicode strings and - doing en-/decoding oneself, but unicode escapes are not working + Undocumented/buggy Unicode handling (how JSON handles Unicode values + is undocumented. One can get far by feeding it Unicode strings and + doing en-/decoding oneself, but Unicode escapes are not working properly). - No roundtripping (strings get clobbered if they look like numbers, + No round-tripping (strings get clobbered if they look like numbers, e.g. the string 2.0 will encode to 2.0 instead of "2.0", and that will decode into the number 2. 
@@ -440,7 +711,7 @@ Undocumented/buggy Unicode handling. - No roundtripping. + No round-tripping. Has problems handling many Perl values (e.g. regex results and other magic values will make it croak). @@ -460,12 +731,12 @@ preferably a way to generate ASCII-only JSON texts). Completely broken (and confusingly documented) Unicode handling - (unicode escapes are not working properly, you need to set + (Unicode escapes are not working properly, you need to set ImplicitUnicode to *different* values on en- and decoding to get symmetric behaviour). - No roundtripping (simple cases work, but this depends on wether the - scalar value was used in a numeric context or not). + No round-tripping (simple cases work, but this depends on whether + the scalar value was used in a numeric context or not). Dumping hashes may skip hash values depending on iterator state. @@ -474,7 +745,7 @@ Does not check input for validity (i.e. will accept non-JSON input and return "something" instead of raising an exception. This is a - security issue: imagine two banks transfering money between each + security issue: imagine two banks transferring money between each other using JSON. One bank might parse a given non-JSON request and deduct money, while the other might reject the transaction with a syntax error. While a good protocol will at least recover, that is @@ -483,65 +754,100 @@ JSON::DWIW 0.04 Very fast. Very natural. Very nice. - Undocumented unicode handling (but the best of the pack. Unicode + Undocumented Unicode handling (but the best of the pack. Unicode escapes still don't get parsed properly). Very inflexible. - No roundtripping. + No round-tripping. Does not generate valid JSON texts (key strings are often unquoted, empty keys result in nothing being output) Does not check input for validity. + JSON and YAML + You often hear that JSON is a subset (or a close subset) of YAML. This + is, however, a mass hysteria and very far from the truth. 
In general, + there is no way to configure JSON::XS to output a data structure as + valid YAML. + + If you really must use JSON::XS to generate YAML, you should use this + algorithm (subject to change in future versions): + + my $to_yaml = JSON::XS->new->utf8->space_after (1); + my $yaml = $to_yaml->encode ($ref) . "\n"; + + This will usually generate JSON texts that also parse as valid YAML. + Please note that YAML has hardcoded limits on (simple) object key + lengths that JSON doesn't have, so you should make sure that your hash + keys are noticeably shorter than the 1024 characters YAML allows. + + There might be other incompatibilities that I am not aware of. In + general you should not try to generate YAML with a JSON generator or + vice versa, or try to parse JSON with a YAML parser or vice versa: + chances are high that you will run into severe interoperability + problems. + SPEED It seems that JSON::XS is surprisingly fast, as shown in the following tables. They have been generated with the help of the "eg/bench" program in the JSON::XS distribution, to make it easy to compare on your own system. - First comes a comparison between various modules using a very short JSON - string: + First comes a comparison between various modules using a very short + single-line JSON string: - {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null} + {"method": "handleMessage", "params": ["user1", "we were just talking"], \ + "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} It shows the number of encodes/decodes per second (JSON::XS uses the functional interface, while JSON::XS/2 uses the OO interface with - pretty-printing and hashkey sorting enabled). Higher is better: + pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink). 
+ Higher is better: + Storable | 15779.925 | 14169.946 | + -----------+------------+------------+ module | encode | decode | -----------|------------|------------| - JSON | 11488.516 | 7823.035 | - JSON::DWIW | 94708.054 | 129094.260 | - JSON::PC | 63884.157 | 128528.212 | - JSON::Syck | 34898.677 | 42096.911 | - JSON::XS | 654027.064 | 396423.669 | - JSON::XS/2 | 371564.190 | 371725.613 | + JSON | 4990.842 | 4088.813 | + JSON::DWIW | 51653.990 | 71575.154 | + JSON::PC | 65948.176 | 74631.744 | + JSON::PP | 8931.652 | 3817.168 | + JSON::Syck | 24877.248 | 27776.848 | + JSON::XS | 388361.481 | 227951.304 | + JSON::XS/2 | 227951.304 | 218453.333 | + JSON::XS/3 | 338250.323 | 218453.333 | + Storable | 16500.016 | 135300.129 | -----------+------------+------------+ - That is, JSON::XS is more than six times faster than JSON::DWIW on - encoding, more than three times faster on decoding, and about thirty - times faster than JSON, even with pretty-printing and key sorting. + That is, JSON::XS is about five times faster than JSON::DWIW on + encoding, about three times faster on decoding, and over forty times + faster than JSON, even with pretty-printing and key sorting. It also + compares favourably to Storable for small amounts of data. Using a longer test string (roughly 18KB, generated from Yahoo! 
Locals search API (http://nanoref.com/yahooapis/mgPdGg)): module | encode | decode | -----------|------------|------------| - JSON | 273.023 | 44.674 | - JSON::DWIW | 1089.383 | 1145.704 | - JSON::PC | 3097.419 | 2393.921 | - JSON::Syck | 514.060 | 843.053 | - JSON::XS | 6479.668 | 3636.364 | - JSON::XS/2 | 3774.221 | 3599.124 | + JSON | 55.260 | 34.971 | + JSON::DWIW | 825.228 | 1082.513 | + JSON::PC | 3571.444 | 2394.829 | + JSON::PP | 210.987 | 32.574 | + JSON::Syck | 552.551 | 787.544 | + JSON::XS | 5780.463 | 4854.519 | + JSON::XS/2 | 3869.998 | 4798.975 | + JSON::XS/3 | 5862.880 | 4798.975 | + Storable | 4445.002 | 5235.027 | -----------+------------+------------+ - Again, JSON::XS leads by far. + Again, JSON::XS leads by far (except for Storable which non-surprisingly + decodes faster). - On large strings containing lots of high unicode characters, some + On large strings containing lots of high Unicode characters, some modules (such as JSON::PC) seem to decode faster than JSON::XS, but the - result will be broken due to missing (or wrong) unicode handling. Others + result will be broken due to missing (or wrong) Unicode handling. Others refuse to decode or encode properly, so it was impossible to prepare a fair comparison table for that case. @@ -555,29 +861,52 @@ Second, you need to avoid resource-starving attacks. That means you should limit the size of JSON texts you accept, or make sure that when - your resources run out, thats just fine (e.g. by using a separate + your resources run out, that's just fine (e.g. by using a separate process that can crash safely). The size of a JSON text in octets or characters is usually a good indication of the size of the resources - required to decode it into a Perl structure. + required to decode it into a Perl structure. While JSON::XS can check + the size of the JSON text, it might be too late when you already have it + in memory, so you might want to check the size before you accept the + string.
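The size and depth limits discussed above can be combined roughly as follows. This is only a minimal sketch, not code from the JSON::XS distribution: the helper name decode_untrusted and the limits of 64KB and 32 nesting levels are assumptions made up for illustration, and you should pick values that fit your own protocol. Only the "max_size", "max_depth", "utf8" and "decode" methods described in this README are used.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limits, chosen for illustration only.
my $MAX_BYTES = 64 * 1024;
my $MAX_DEPTH = 32;

# One coder object carrying both decoder-side limits.
my $coder = JSON::XS->new->utf8->max_size ($MAX_BYTES)->max_depth ($MAX_DEPTH);

# decode_untrusted is a hypothetical helper, not a JSON::XS function.
# It returns a hash- or arrayref on success and nothing otherwise.
sub decode_untrusted {
   my ($octets) = @_;

   # cheap length check first, before any parsing work is done
   return if length $octets > $MAX_BYTES;

   # decode croaks on invalid JSON and when a limit is exceeded,
   # so trap the exception instead of letting it propagate
   my $data = eval { $coder->decode ($octets) };

   (ref $data eq 'HASH' || ref $data eq 'ARRAY') ? $data : ();
}
```

Ideally the length check happens even earlier, while the text is still being read from the network, so that an oversized request never ends up in memory in the first place.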
Third, JSON::XS recurses using the C stack when decoding objects and arrays. The C stack is a limited resource: for instance, on my amd64 machine with 8MB of stack size I can decode around 180k nested arrays - but only 14k nested JSON objects. If that is exceeded, the program - crashes. Thats why the default nesting limit is set to 4096. If your + but only 14k nested JSON objects (due to perl itself recursing deeply on + croak to free the temporary). If that is exceeded, the program crashes. + To be conservative, the default nesting limit is set to 512. If your process has a smaller stack, you should adjust this setting accordingly with the "max_depth" method. And last but least, something else could bomb you that I forgot to think - of. In that case, you get to keep the pieces. I am alway sopen for + of. In that case, you get to keep the pieces. I am always open for hints, though... + If you are using JSON::XS to return packets for consumption by JavaScript + scripts in a browser you should have a look at + to see whether + you are vulnerable to some common attack vectors (which really are + browser design bugs, but it is still you who will have to deal with it, + as major browser developers care only for features, not about doing + security right). + +THREADS + This module is *not* guaranteed to be thread safe and there are no plans + to change this until Perl gets thread support (as opposed to the + horribly slow so-called "threads" which are simply slow and bloated + process simulations - use fork, it's *much* faster, cheaper, better). + + (It might actually work, but you have been warned). + BUGS While the goal of this module is to be correct, that unfortunately does not mean it's bug-free, only that I think its design is bug-free. It is still relatively early in its development. If you keep reporting bugs they will be fixed swiftly, though. + Please refrain from using rt.cpan.org or any other bug reporting + service.
I put the contact address into my modules for a reason. + AUTHOR Marc Lehmann http://home.schmorp.de/