--- JSON-XS/README 2007/03/29 02:45:49 1.9 +++ JSON-XS/README 2007/06/11 03:45:26 1.13 @@ -114,15 +114,47 @@ generate characters outside the code range 0..127 (which is ASCII). Any unicode characters outside that range will be escaped using either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLLL - escape sequence, as per RFC4627. + escape sequence, as per RFC4627. The resulting encoded JSON text can + be treated as a native unicode string, an ascii-encoded, + latin1-encoded or UTF-8 encoded string, or any other superset of + ASCII. If $enable is false, then the "encode" method will not escape - Unicode characters unless required by the JSON syntax. This results - in a faster and more compact format. + Unicode characters unless required by the JSON syntax or other + flags. This results in a faster and more compact format. + + The main use for this flag is to produce JSON texts that can be + transmitted over a 7-bit channel, as the encoded JSON texts will not + contain any 8 bit characters. JSON::XS->new->ascii (1)->encode ([chr 0x10401]) => ["\ud801\udc01"] + $json = $json->latin1 ([$enable]) + If $enable is true (or missing), then the "encode" method will + encode the resulting JSON text as latin1 (or iso-8859-1), escaping + any characters outside the code range 0..255. The resulting string + can be treated as a latin1-encoded JSON text or a native unicode + string. The "decode" method will not be affected in any way by this + flag, as "decode" by default expects unicode, which is a strict + superset of latin1. + + If $enable is false, then the "encode" method will not escape + Unicode characters unless required by the JSON syntax or other + flags. + + The main use for this flag is efficiently encoding binary data as + JSON text, as most octets will not be escaped, resulting in a + smaller encoded size. The disadvantage is that the resulting JSON + text is encoded in latin1 (and must correctly be treated as such + when storing and transfering), a rare encoding for JSON. It is + therefore most useful when you want to store data structures known + to contain binary data efficiently in files or databases, not when + talking to other JSON encoders/decoders. + + JSON::XS->new->latin1->encode (["\x{89}\x{abc}"] + => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not) + $json = $json->utf8 ([$enable]) If $enable is true (or missing), then the "encode" method will encode the JSON result into UTF-8, as required by many protocols, @@ -267,10 +299,10 @@ saving space. $json = $json->max_depth ([$maximum_nesting_depth]) - Sets the maximum nesting level (default 4096) accepted while - encoding or decoding. If the JSON text or Perl data structure has an - equal or higher nesting level then this limit, then the encoder and - decoder will stop and croak at that point. + Sets the maximum nesting level (default 512) accepted while encoding + or decoding. If the JSON text or Perl data structure has an equal or + higher nesting level then this limit, then the encoder and decoder + will stop and croak at that point. Nesting level is defined by number of hash- or arrayrefs that the encoder needs to traverse to reach a given point or the number of @@ -303,6 +335,19 @@ become Perl arrayrefs and JSON objects become Perl hashrefs. "true" becomes 1, "false" becomes 0 and "null" becomes "undef". + ($perl_scalar, $characters) = $json->decode_prefix ($json_text) + This works like the "decode" method, but instead of raising an + exception when there is trailing garbage after the first JSON + object, it will silently stop parsing there and return the number of + characters consumed so far. + + This is useful if your JSON texts are not delimited by an outer + protocol (which is not the brightest thing to do in the first place) + and you need to know where the JSON text ends. + + JSON::XS->new->decode_prefix ("[1] the tail") + => ([], 3) + MAPPING This section describes how JSON::XS maps Perl values to JSON values and vice versa. These mappings are designed to "do the right thing" in most @@ -413,9 +458,6 @@ You can not currently output JSON booleans or force the type in other, less obscure, ways. Tell me if you need this capability. - circular data structures - Those will be encoded until memory or stackspace runs out. - COMPARISON As already mentioned, this module was created because none of the existing JSON modules could be made to work correctly. First I will @@ -495,49 +537,80 @@ Does not check input for validity. + JSON and YAML + You often hear that JSON is a subset (or a close subset) of YAML. This + is, however, a mass hysteria and very far from the truth. In general, + there is no way to configure JSON::XS to output a data structure as + valid YAML. + + If you really must use JSON::XS to generate YAML, you should use this + algorithm (subject to change in future versions): + + my $to_yaml = JSON::XS->new->utf8->space_after (1); + my $yaml = $to_yaml->encode ($ref) . "\n"; + + This will usually generate JSON texts that also parse as valid YAML. + Please note that YAML has hardcoded limits on (simple) object key + lengths that JSON doesn't have, so you should make sure that your hash + keys are noticably shorter than the 1024 characters YAML allows. + + There might be other incompatibilities that I am not aware of. In + general you should not try to generate YAML with a JSON generator or + vice versa, or try to parse JSON with a YAML parser or vice versa: + chances are high that you will run into severe interoperability + problems. + SPEED It seems that JSON::XS is surprisingly fast, as shown in the following tables. They have been generated with the help of the "eg/bench" program in the JSON::XS distribution, to make it easy to compare on your own system. - First comes a comparison between various modules using a very short JSON - string: + First comes a comparison between various modules using a very short + single-line JSON string: - {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null} + {"method": "handleMessage", "params": ["user1", "we were just talking"], \ + "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} It shows the number of encodes/decodes per second (JSON::XS uses the functional interface, while JSON::XS/2 uses the OO interface with - pretty-printing and hashkey sorting enabled). Higher is better: + pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink). + Higher is better: module | encode | decode | -----------|------------|------------| - JSON | 11488.516 | 7823.035 | - JSON::DWIW | 94708.054 | 129094.260 | - JSON::PC | 63884.157 | 128528.212 | - JSON::Syck | 34898.677 | 42096.911 | - JSON::XS | 654027.064 | 396423.669 | - JSON::XS/2 | 371564.190 | 371725.613 | + JSON | 7645.468 | 4208.613 | + JSON::DWIW | 40721.398 | 77101.176 | + JSON::PC | 65948.176 | 78251.940 | + JSON::Syck | 22844.793 | 26479.192 | + JSON::XS | 388361.481 | 199728.762 | + JSON::XS/2 | 218453.333 | 192399.266 | + JSON::XS/3 | 338250.323 | 192399.266 | + Storable | 15779.925 | 14169.946 | -----------+------------+------------+ - That is, JSON::XS is more than six times faster than JSON::DWIW on - encoding, more than three times faster on decoding, and about thirty - times faster than JSON, even with pretty-printing and key sorting. + That is, JSON::XS is about five times faster than JSON::DWIW on + encoding, about three times faster on decoding, and over fourty times + faster than JSON, even with pretty-printing and key sorting. It also + compares favourably to Storable for small amounts of data. Using a longer test string (roughly 18KB, generated from Yahoo! Locals search API (http://nanoref.com/yahooapis/mgPdGg): module | encode | decode | -----------|------------|------------| - JSON | 273.023 | 44.674 | - JSON::DWIW | 1089.383 | 1145.704 | - JSON::PC | 3097.419 | 2393.921 | - JSON::Syck | 514.060 | 843.053 | - JSON::XS | 6479.668 | 3636.364 | - JSON::XS/2 | 3774.221 | 3599.124 | + JSON | 254.685 | 37.665 | + JSON::DWIW | 843.343 | 1049.731 | + JSON::PC | 3602.116 | 2307.352 | + JSON::Syck | 505.107 | 787.899 | + JSON::XS | 5747.196 | 3690.220 | + JSON::XS/2 | 3968.121 | 3676.634 | + JSON::XS/3 | 6105.246 | 3662.508 | + Storable | 4417.337 | 5285.161 | -----------+------------+------------+ - Again, JSON::XS leads by far. + Again, JSON::XS leads by far (except for Storable which non-surprisingly + decodes faster). On large strings containing lots of high unicode characters, some modules (such as JSON::PC) seem to decode faster than JSON::XS, but the @@ -563,13 +636,14 @@ Third, JSON::XS recurses using the C stack when decoding objects and arrays. The C stack is a limited resource: for instance, on my amd64 machine with 8MB of stack size I can decode around 180k nested arrays - but only 14k nested JSON objects. If that is exceeded, the program - crashes. Thats why the default nesting limit is set to 4096. If your + but only 14k nested JSON objects (due to perl itself recursing deeply on + croak to free the temporary). If that is exceeded, the program crashes. + to be conservative, the default nesting limit is set to 512. If your process has a smaller stack, you should adjust this setting accordingly with the "max_depth" method. And last but least, something else could bomb you that I forgot to think - of. In that case, you get to keep the pieces. I am alway sopen for + of. In that case, you get to keep the pieces. I am always open for hints, though... BUGS