--- JSON-XS/README 2007/03/25 22:11:06 1.8
+++ JSON-XS/README 2007/05/09 16:35:21 1.11
@@ -114,15 +114,47 @@
         generate characters outside the code range 0..127 (which is ASCII).
         Any unicode characters outside that range will be escaped using
         either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLLL
-        escape sequence, as per RFC4627.
+        escape sequence, as per RFC4627. The resulting encoded JSON text can
+        be treated as a native unicode string, an ascii-encoded,
+        latin1-encoded or UTF-8 encoded string, or any other superset of
+        ASCII.
 
         If $enable is false, then the "encode" method will not escape
-        Unicode characters unless required by the JSON syntax. This results
-        in a faster and more compact format.
+        Unicode characters unless required by the JSON syntax or other
+        flags. This results in a faster and more compact format.
+
+        The main use for this flag is to produce JSON texts that can be
+        transmitted over a 7-bit channel, as the encoded JSON texts will not
+        contain any 8 bit characters.
 
           JSON::XS->new->ascii (1)->encode ([chr 0x10401])
           => ["\ud801\udc01"]
 
+    $json = $json->latin1 ([$enable])
+        If $enable is true (or missing), then the "encode" method will
+        encode the resulting JSON text as latin1 (or iso-8859-1), escaping
+        any characters outside the code range 0..255. The resulting string
+        can be treated as a latin1-encoded JSON text or a native unicode
+        string. The "decode" method will not be affected in any way by this
+        flag, as "decode" by default expects unicode, which is a strict
+        superset of latin1.
+
+        If $enable is false, then the "encode" method will not escape
+        Unicode characters unless required by the JSON syntax or other
+        flags.
+
+        The main use for this flag is efficiently encoding binary data as
+        JSON text, as most octets will not be escaped, resulting in a
+        smaller encoded size. The disadvantage is that the resulting JSON
+        text is encoded in latin1 (and must correctly be treated as such
+        when storing and transfering), a rare encoding for JSON. It is
+        therefore most useful when you want to store data structures known
+        to contain binary data efficiently in files or databases, not when
+        talking to other JSON encoders/decoders.
+
+          JSON::XS->new->latin1->encode (["\x{89}\x{abc}"]
+          => ["\x{89}\\u0abc"]    # (perl syntax, U+abc escaped, U+89 not)
+
     $json = $json->utf8 ([$enable])
         If $enable is true (or missing), then the "encode" method will
         encode the JSON result into UTF-8, as required by many protocols,
@@ -247,7 +279,12 @@
         many short strings. It will also try to downgrade any strings to
         octet-form if possible: perl stores strings internally either in an
         encoding called UTF-X or in octet-form. The latter cannot store
-        everything but uses less space in general.
+        everything but uses less space in general (and some buggy Perl or C
+        code might even rely on that internal representation being used).
+
+        The actual definition of what shrink does might change in future
+        versions, but it will always try to save space at the expense of
+        time.
 
         If $enable is true (or missing), the string returned by "encode"
         will be shrunk-to-fit, while all strings generated by "decode" will
@@ -262,10 +299,10 @@
         saving space.
 
     $json = $json->max_depth ([$maximum_nesting_depth])
-        Sets the maximum nesting level (default 8192) accepted while
-        encoding or decoding. If the JSON text or Perl data structure has an
-        equal or higher nesting level then this limit, then the encoder and
-        decoder will stop and croak at that point.
+        Sets the maximum nesting level (default 512) accepted while encoding
+        or decoding. If the JSON text or Perl data structure has an equal or
+        higher nesting level then this limit, then the encoder and decoder
+        will stop and croak at that point.
 
         Nesting level is defined by number of hash- or arrayrefs that the
         encoder needs to traverse to reach a given point or the number of
@@ -298,6 +335,19 @@
         become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
         becomes 1, "false" becomes 0 and "null" becomes "undef".
 
+    ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
+        This works like the "decode" method, but instead of raising an
+        exception when there is trailing garbage after the first JSON
+        object, it will silently stop parsing there and return the number of
+        characters consumed so far.
+
+        This is useful if your JSON texts are not delimited by an outer
+        protocol (which is not the brightest thing to do in the first place)
+        and you need to know where the JSON text ends.
+
+          JSON::XS->new->decode_prefix ("[1] the tail")
+          => ([], 3)
+
 MAPPING
     This section describes how JSON::XS maps Perl values to JSON values and
     vice versa. These mappings are designed to "do the right thing" in most
@@ -346,17 +396,28 @@
 
     hash references
         Perl hash references become JSON objects. As there is no inherent
-        ordering in hash keys, they will usually be encoded in a
-        pseudo-random order that can change between runs of the same program
-        but stays generally the same within a single run of a program.
-        JSON::XS can optionally sort the hash keys (determined by the
-        *canonical* flag), so the same datastructure will serialise to the
-        same JSON text (given same settings and version of JSON::XS), but
-        this incurs a runtime overhead.
+        ordering in hash keys (or JSON objects), they will usually be
+        encoded in a pseudo-random order that can change between runs of the
+        same program but stays generally the same within a single run of a
+        program. JSON::XS can optionally sort the hash keys (determined by
+        the *canonical* flag), so the same datastructure will serialise to
+        the same JSON text (given same settings and version of JSON::XS),
+        but this incurs a runtime overhead and is only rarely useful, e.g.
+        when you want to compare some JSON text against another for
+        equality.
 
     array references
         Perl array references become JSON arrays.
 
+    other references
+        Other unblessed references are generally not allowed and will cause
+        an exception to be thrown, except for references to the integers 0
+        and 1, which get turned into "false" and "true" atoms in JSON. You
+        can also use "JSON::XS::false" and "JSON::XS::true" to improve
+        readability.
+
+          to_json [\0,JSON::XS::true]      # yields [false,true]
+
     blessed objects
         Blessed objects are not allowed. JSON::XS currently tries to encode
         their underlying representation (hash- or arrayref), but this
@@ -397,9 +458,6 @@
         You can not currently output JSON booleans or force the type in
         other, less obscure, ways. Tell me if you need this capability.
 
-    circular data structures
-        Those will be encoded until memory or stackspace runs out.
-
 COMPARISON
     As already mentioned, this module was created because none of the
     existing JSON modules could be made to work correctly. First I will
@@ -547,13 +605,14 @@
     Third, JSON::XS recurses using the C stack when decoding objects and
     arrays. The C stack is a limited resource: for instance, on my amd64
     machine with 8MB of stack size I can decode around 180k nested arrays
-    but only 14k nested JSON objects. If that is exceeded, the program
-    crashes. Thats why the default nesting limit is set to 8192. If your
+    but only 14k nested JSON objects (due to perl itself recursing deeply on
+    croak to free the temporary). If that is exceeded, the program crashes.
+    to be conservative, the default nesting limit is set to 512. If your
     process has a smaller stack, you should adjust this setting accordingly
     with the "max_depth" method.
 
     And last but least, something else could bomb you that I forgot to think
-    of. In that case, you get to keep the pieces. I am alway sopen for
+    of. In that case, you get to keep the pieces. I am always open for
     hints, though...
 
 BUGS