/cvs/JSON-XS/README
Revision: 1.28
Committed: Mon Sep 29 03:09:27 2008 UTC (15 years, 7 months ago) by root
Branch: MAIN
CVS Tags: rel-2_231, rel-2_23
Changes since 1.27: +0 -2 lines
Log Message:
2.23

File Contents

# User Rev Content
1 root 1.1 NAME
2 root 1.2 JSON::XS - JSON serialising/deserialising, done correctly and fast
3 root 1.1
4 root 1.23 JSON::XS - a correct and fast JSON serialiser/deserialiser
5 root 1.19 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
6    
7 root 1.1 SYNOPSIS
8 root 1.2 use JSON::XS;
9 root 1.1
10 root 1.8 # exported functions, they croak on error
11     # and expect/generate UTF-8
12 root 1.4
13 root 1.22 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
14     $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
15 root 1.4
16 root 1.8 # OO-interface
17 root 1.4
18     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
19     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
20     $perl_scalar = $coder->decode ($unicode_json_text);
21    
22 root 1.21 # Note that JSON version 2.0 and above will automatically use JSON::XS
23     # if available, at virtually no speed overhead either, so you should
24     # be able to just:
25 root 1.23
26     use JSON;
27 root 1.21
28     # and do the same things, except that you have a pure-perl fallback now.
29    
30 root 1.1 DESCRIPTION
31 root 1.2 This module converts Perl data structures to JSON and vice versa. Its
32     primary goal is to be *correct* and its secondary goal is to be *fast*.
33     To reach the latter goal it was written in C.
34    
35 root 1.21 Beginning with version 2.0 of the JSON module, when both JSON and
36     JSON::XS are installed, then JSON will fall back on JSON::XS (this can
37 root 1.26 be overridden) with no overhead due to emulation (by inheriting
38 root 1.21 constructor and methods). If JSON::XS is not available, it will fall
39     back to the compatible JSON::PP module as backend, so using JSON instead
40     of JSON::XS gives you a portable JSON API that can be fast when you need
41     and doesn't require a C compiler when that is a problem.
42    
43 root 1.2 As this is the n-th-something JSON module on CPAN, what was the reason
44     to write yet another JSON module? While it seems there are many JSON
45     modules, none of them correctly handle all corner cases, and in most
46     cases their maintainers are unresponsive, gone missing, or not listening
47     to bug reports for other reasons.
48    
49 root 1.4 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
50     vice versa.
51    
52 root 1.2 FEATURES
53 root 1.23 * correct Unicode handling
54    
55     This module knows how to handle Unicode, documents how and when it
56     does so, and even documents what "correct" means.
57    
58     * round-trip integrity
59 root 1.2
60 root 1.26 When you serialise a perl data structure using only data types
61 root 1.2 supported by JSON, the deserialised data structure is identical on
62 root 1.8 the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"
63 root 1.23 just because it looks like a number). There *are* minor exceptions
64     to this, read the MAPPING section below to learn about those.
65    
66     * strict checking of JSON correctness
67 root 1.2
68 root 1.6 There is no guessing, no generating of illegal JSON texts by
69 root 1.4 default, and only JSON is accepted as input by default (the latter
70     is a security feature).
71 root 1.2
72 root 1.23 * fast
73    
74     Compared to other JSON modules and other serialisers such as
75     Storable, this module usually compares favourably in terms of speed,
76     too.
77    
78     * simple to use
79 root 1.2
80 root 1.23 This module has both a simple functional interface as well as an
81 root 1.26 object oriented interface.
82 root 1.23
83     * reasonably versatile output formats
84    
85     You can choose between the most compact guaranteed-single-line
86 root 1.26 format possible (nice for simple line-based protocols), a pure-ASCII
87 root 1.8 format (for when your transport is not 8-bit clean, still supports
88 root 1.20 the whole Unicode range), or a pretty-printed format (for when you
89 root 1.8 want to read that stuff). Or you can combine those features in
90     whatever way you like.
91 root 1.2
92     FUNCTIONAL INTERFACE
93 root 1.20 The following convenience methods are provided by this module. They are
94 root 1.2 exported by default:
95    
96 root 1.22 $json_text = encode_json $perl_scalar
97 root 1.19 Converts the given Perl data structure to a UTF-8 encoded, binary
98     string (that is, the string contains octets only). Croaks on error.
99 root 1.2
100 root 1.6 This function call is functionally identical to:
101 root 1.2
102 root 1.6 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
103    
104 root 1.26 Except being faster.
105 root 1.6
106 root 1.22 $perl_scalar = decode_json $json_text
107     The opposite of "encode_json": expects a UTF-8 (binary) string and
108 root 1.6 tries to parse that as a UTF-8 encoded JSON text, returning the
109 root 1.19 resulting reference. Croaks on error.
110 root 1.2
111 root 1.6 This function call is functionally identical to:
112    
113     $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
114    
115 root 1.26 Except being faster.
116 root 1.2
117 root 1.14 $is_boolean = JSON::XS::is_bool $scalar
118     Returns true if the passed scalar represents either JSON::XS::true
119     or JSON::XS::false, two constants that act like 1 and 0,
120     respectively and are used to represent JSON "true" and "false"
121     values in Perl.
122    
123     See MAPPING, below, for more information on how JSON values are
124     mapped to Perl.
125    
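The functional interface described above can be sketched as follows. This is a minimal illustration, assuming JSON::XS is installed; the sample data and variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical sample data; the string "2.0" should survive as a string.
my $data = { name => "caf\x{e9}", version => "2.0", tags => ["a", "b"] };

# encode_json returns UTF-8 encoded octets, so U+00E9 becomes two bytes.
my $octets = encode_json $data;

# decode_json expects UTF-8 octets and returns a Perl data structure.
my $copy = decode_json $octets;

# JSON true/false decode to special values that is_bool recognises;
# a plain number is not a boolean.
my $flags   = decode_json '[true, false, 1]';
my @is_bool = map { JSON::XS::is_bool($_) ? 1 : 0 } @$flags;

print "version: $copy->{version}\n";
print "is_bool: @is_bool\n";
```

Both functions croak on error, so wrap the calls in "eval" when the input is not trusted.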
126 root 1.19 A FEW NOTES ON UNICODE AND PERL
127     Since this often leads to confusion, here are a few very clear words on
128     how Unicode works in Perl, modulo bugs.
129    
130     1. Perl strings can store characters with ordinal values > 255.
131 root 1.20 This enables you to store Unicode characters as single characters in
132 root 1.19 a Perl string - very natural.
133    
134     2. Perl does *not* associate an encoding with your strings.
135 root 1.23 ... until you force it to, e.g. when matching it against a regex, or
136 root 1.19 printing the scalar to a file, in which case Perl either interprets
137     your string as locale-encoded text, octets/binary, or as Unicode,
138     depending on various settings. In no case is an encoding stored
139     together with your data, it is *use* that decides encoding, not any
140 root 1.23 magical meta data.
141 root 1.19
142     3. The internal utf-8 flag has no meaning with regards to the encoding
143     of your string.
144     Just ignore that flag unless you debug a Perl bug, a module written
145     in XS or want to dive into the internals of perl. Otherwise it will
146     only confuse you, as, despite the name, it says nothing about how
147 root 1.20 your string is encoded. You can have Unicode strings with that flag
148 root 1.19 set, with that flag clear, and you can have binary data with that
149     flag set and that flag clear. Other possibilities exist, too.
150    
151     If you didn't know about that flag, all the better: pretend it
152     doesn't exist.
153    
154     4. A "Unicode String" is simply a string where each character can be
155 root 1.26 validly interpreted as a Unicode code point.
156 root 1.19 If you have UTF-8 encoded data, it is no longer a Unicode string,
157     but a Unicode string encoded in UTF-8, giving you a binary string.
158    
159     5. A string containing "high" (> 255) character values is *not* a UTF-8
160     string.
161 root 1.20 It's a fact. Learn to live with it.
162 root 1.19
163     I hope this helps :)
164    
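The difference between a Unicode string and its UTF-8 encoded (binary) form, as described in points 4 and 5, can be made visible with the core Encode module. This is only a small sketch; the sample string is arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A Unicode string: three characters, one of them above 255.
my $unicode = "a\x{e9}\x{20ac}";

# Encoding it as UTF-8 yields a binary string made of octets.
my $octets = encode("UTF-8", $unicode);

printf "characters: %d\n", length $unicode;   # 3
printf "octets:     %d\n", length $octets;    # 6 (1 + 2 + 3)

# Decoding the octets gives back the original Unicode string.
my $round_trip = decode("UTF-8", $octets);
print "round trip ok\n" if $round_trip eq $unicode;
```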
165 root 1.2 OBJECT-ORIENTED INTERFACE
166     The object oriented interface lets you configure your own encoding or
167     decoding style, within the limits of supported formats.
168    
169     $json = new JSON::XS
170     Creates a new JSON::XS object that can be used to de/encode JSON
171     strings. All boolean flags described below are by default
172     *disabled*.
173    
174     The mutators for flags all return the JSON object again and thus
175     calls can be chained:
176    
177 root 1.6 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
178 root 1.2 => {"a": [1, 2]}
179    
180 root 1.4 $json = $json->ascii ([$enable])
181 root 1.21 $enabled = $json->get_ascii
182 root 1.4 If $enable is true (or missing), then the "encode" method will not
183 root 1.6 generate characters outside the code range 0..127 (which is ASCII).
184 root 1.20 Any Unicode characters outside that range will be escaped using
185 root 1.6 either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLL
186 root 1.11 escape sequence, as per RFC4627. The resulting encoded JSON text can
187 root 1.20 be treated as a native Unicode string, an ascii-encoded,
188 root 1.11 latin1-encoded or UTF-8 encoded string, or any other superset of
189     ASCII.
190 root 1.2
191     If $enable is false, then the "encode" method will not escape
192 root 1.11 Unicode characters unless required by the JSON syntax or other
193     flags. This results in a faster and more compact format.
194    
195 root 1.23 See also the section *ENCODING/CODESET FLAG NOTES* later in this
196     document.
197    
198 root 1.11 The main use for this flag is to produce JSON texts that can be
199     transmitted over a 7-bit channel, as the encoded JSON texts will not
200     contain any 8 bit characters.
201 root 1.2
202 root 1.6 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
203     => ["\ud801\udc01"]
204 root 1.2
205 root 1.11 $json = $json->latin1 ([$enable])
206 root 1.21 $enabled = $json->get_latin1
207 root 1.11 If $enable is true (or missing), then the "encode" method will
208     encode the resulting JSON text as latin1 (or iso-8859-1), escaping
209     any characters outside the code range 0..255. The resulting string
210 root 1.20 can be treated as a latin1-encoded JSON text or a native Unicode
211 root 1.11 string. The "decode" method will not be affected in any way by this
212 root 1.20 flag, as "decode" by default expects Unicode, which is a strict
213 root 1.11 superset of latin1.
214    
215     If $enable is false, then the "encode" method will not escape
216     Unicode characters unless required by the JSON syntax or other
217     flags.
218    
219 root 1.23 See also the section *ENCODING/CODESET FLAG NOTES* later in this
220     document.
221    
222 root 1.11 The main use for this flag is efficiently encoding binary data as
223     JSON text, as most octets will not be escaped, resulting in a
224     smaller encoded size. The disadvantage is that the resulting JSON
225     text is encoded in latin1 (and must correctly be treated as such
226 root 1.20 when storing and transferring), a rare encoding for JSON. It is
227 root 1.11 therefore most useful when you want to store data structures known
228     to contain binary data efficiently in files or databases, not when
229     talking to other JSON encoders/decoders.
230    
231     JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
232     => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
233    
234 root 1.4 $json = $json->utf8 ([$enable])
235 root 1.21 $enabled = $json->get_utf8
236 root 1.4 If $enable is true (or missing), then the "encode" method will
237 root 1.6 encode the JSON result into UTF-8, as required by many protocols,
238 root 1.4 while the "decode" method expects to be handed a UTF-8-encoded
239     string. Please note that UTF-8-encoded strings do not contain any
240     characters outside the range 0..255, they are thus useful for
241 root 1.6 bytewise/binary I/O. In future versions, enabling this option might
242     enable autodetection of the UTF-16 and UTF-32 encoding families, as
243     described in RFC4627.
244 root 1.2
245     If $enable is false, then the "encode" method will return the JSON
246 root 1.20 string as a (non-encoded) Unicode string, while "decode" thus
247     expects a Unicode string. Any decoding or encoding (e.g. to UTF-8 or
248 root 1.2 UTF-16) needs to be done yourself, e.g. using the Encode module.
249    
250 root 1.23 See also the section *ENCODING/CODESET FLAG NOTES* later in this
251     document.
252    
253 root 1.6 Example, output UTF-16BE-encoded JSON:
254    
255     use Encode;
256     $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
257    
258     Example, decode UTF-32LE-encoded JSON:
259    
260     use Encode;
261     $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
262 root 1.4
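To compare the "ascii", "latin1" and "utf8" flags side by side, the following sketch encodes the same two-character string with each of them. It assumes JSON::XS is installed; the sample data is arbitrary.

```perl
use strict;
use warnings;
use JSON::XS;

# U+00E9 fits into latin1, U+20AC does not.
my $data = ["\x{e9}\x{20ac}"];

my $plain  = JSON::XS->new->encode($data);          # Unicode string, nothing escaped
my $ascii  = JSON::XS->new->ascii->encode($data);   # both characters escaped
my $latin1 = JSON::XS->new->latin1->encode($data);  # only U+20AC escaped
my $utf8   = JSON::XS->new->utf8->encode($data);    # UTF-8 encoded octets

printf "%-6s length %2d\n", @$_
    for [plain  => length $plain],  [ascii => length $ascii],
        [latin1 => length $latin1], [utf8  => length $utf8];
```

The lengths differ because "ascii" spends six characters per escaped character, "latin1" keeps U+00E9 as a single octet, and "utf8" needs two and three octets for the two characters.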
263     $json = $json->pretty ([$enable])
264 root 1.2 This enables (or disables) all of the "indent", "space_before" and
265     "space_after" (and in the future possibly more) flags in one call to
266     generate the most readable (or most compact) form possible.
267    
268 root 1.4 Example, pretty-print some simple structure:
269    
270 root 1.2 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
271     =>
272     {
273     "a" : [
274     1,
275     2
276     ]
277     }
278    
279 root 1.4 $json = $json->indent ([$enable])
280 root 1.21 $enabled = $json->get_indent
281 root 1.4 If $enable is true (or missing), then the "encode" method will use a
282     multiline format as output, putting every array member or
283 root 1.20 object/hash key-value pair into its own line, indenting them
284 root 1.4 properly.
285 root 1.2
286     If $enable is false, no newlines or indenting will be produced, and
287 root 1.20 the resulting JSON text is guaranteed not to contain any "newlines".
288 root 1.2
289 root 1.6 This setting has no effect when decoding JSON texts.
290 root 1.2
291 root 1.4 $json = $json->space_before ([$enable])
292 root 1.21 $enabled = $json->get_space_before
293 root 1.4 If $enable is true (or missing), then the "encode" method will add
294     an extra optional space before the ":" separating keys from values
295     in JSON objects.
296 root 1.2
297     If $enable is false, then the "encode" method will not add any extra
298     space at those places.
299    
300 root 1.6 This setting has no effect when decoding JSON texts. You will also
301 root 1.2 most likely combine this setting with "space_after".
302    
303 root 1.4 Example, space_before enabled, space_after and indent disabled:
304    
305     {"key" :"value"}
306    
307     $json = $json->space_after ([$enable])
308 root 1.21 $enabled = $json->get_space_after
309 root 1.4 If $enable is true (or missing), then the "encode" method will add
310     an extra optional space after the ":" separating keys from values in
311     JSON objects and extra whitespace after the "," separating key-value
312 root 1.2 pairs and array members.
313    
314     If $enable is false, then the "encode" method will not add any extra
315     space at those places.
316    
317 root 1.6 This setting has no effect when decoding JSON texts.
318 root 1.2
319 root 1.4 Example, space_before and indent disabled, space_after enabled:
320    
321     {"key": "value"}
322    
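The whitespace flags can be combined freely. A short sketch, assuming JSON::XS is installed, that prints the same one-key hash under several settings:

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { key => "value" };

my $compact = JSON::XS->new->encode($data);
my $before  = JSON::XS->new->space_before->encode($data);
my $after   = JSON::XS->new->space_after->encode($data);
my $both    = JSON::XS->new->space_before->space_after->encode($data);
my $pretty  = JSON::XS->new->pretty->encode($data);

# The first four forms are single-line; the pretty form spans several lines.
print "$_\n" for $compact, $before, $after, $both;
print $pretty;
```

None of these flags changes what "decode" accepts, so every variant decodes to the same data structure.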
323 root 1.17 $json = $json->relaxed ([$enable])
324 root 1.21 $enabled = $json->get_relaxed
325 root 1.17 If $enable is true (or missing), then "decode" will accept some
326     extensions to normal JSON syntax (see below). "encode" will not be
327     affected in any way. *Be aware that this option makes you accept
328     invalid JSON texts as if they were valid!* I suggest using this
329     option only to parse application-specific files written by humans
330     (configuration files, resource files etc.).
331    
332     If $enable is false (the default), then "decode" will only accept
333     valid JSON texts.
334    
335     Currently accepted extensions are:
336    
337 root 1.23 * list items can have an end-comma
338    
339 root 1.17 JSON *separates* array elements and key-value pairs with commas.
340     This can be annoying if you write JSON texts manually and want
341     to be able to quickly append elements, so this extension accepts
342     a comma at the end of such items, not just between them:
343    
344     [
345     1,
346     2, <- this comma not normally allowed
347     ]
348     {
349     "k1": "v1",
350     "k2": "v2", <- this comma not normally allowed
351     }
352    
353 root 1.23 * shell-style '#'-comments
354    
355 root 1.18 Whenever JSON allows whitespace, shell-style comments are
356     additionally allowed. They are terminated by the first
357     carriage-return or line-feed character, after which more
358     white-space and comments are allowed.
359    
360     [
361     1, # this comment not allowed in JSON
362     # neither this one...
363     ]
364    
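Both extensions can be tried with a small hand-written configuration text. This is a sketch only, assuming JSON::XS is installed; the configuration keys are hypothetical.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical hand-written configuration text.
my $config_text = <<'END';
{
   # listen address
   "host": "localhost",
   "ports": [8080, 8081,],   # trailing comma inside the array
}
END

# With relaxed enabled, the comments and trailing commas are accepted.
my $config = JSON::XS->new->relaxed->decode($config_text);
print "host: $config->{host}, ports: @{ $config->{ports} }\n";

# A strict decoder rejects the same text.
my $strict_ok = eval { JSON::XS->new->decode($config_text); 1 };
print "strict decode failed as expected\n" unless $strict_ok;
```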
365 root 1.4 $json = $json->canonical ([$enable])
366 root 1.21 $enabled = $json->get_canonical
367 root 1.4 If $enable is true (or missing), then the "encode" method will
368     output JSON objects by sorting their keys. This adds a
369     comparatively high overhead.
370 root 1.2
371     If $enable is false, then the "encode" method will output key-value
372     pairs in the order Perl stores them (which will likely change
373     between runs of the same script).
374    
375     This option is useful if you want the same data structure to be
376 root 1.6 encoded as the same JSON text (given the same overall settings). If
377 root 1.20 it is disabled, the same hash might be encoded differently even if
378 root 1.2 it contains the same data, as key-value pairs have no inherent ordering
379     in Perl.
380    
381 root 1.6 This setting has no effect when decoding JSON texts.
382 root 1.2
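A minimal sketch of the effect, assuming JSON::XS is installed (the hash contents are arbitrary):

```perl
use strict;
use warnings;
use JSON::XS;

my %stock = (zebra => 1, apple => 2, mango => 3);

# With canonical enabled, the keys are emitted in sorted order.
my $canonical = JSON::XS->new->canonical->encode(\%stock);
print "$canonical\n";

# Encoding an equal hash built in a different order gives the same text.
my %same = (mango => 3, zebra => 1, apple => 2);
my $again = JSON::XS->new->canonical->encode(\%same);
print "identical\n" if $canonical eq $again;
```

This makes the output suitable for comparing or hashing encoded texts, at the cost of the sorting overhead mentioned above.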
383 root 1.4 $json = $json->allow_nonref ([$enable])
384 root 1.21 $enabled = $json->get_allow_nonref
385 root 1.4 If $enable is true (or missing), then the "encode" method can
386     convert a non-reference into its corresponding string, number or
387     null JSON value, which is an extension to RFC4627. Likewise,
388     "decode" will accept those JSON values instead of croaking.
389 root 1.2
390     If $enable is false, then the "encode" method will croak if it isn't
391 root 1.6 passed an arrayref or hashref, as JSON texts must either be an
392 root 1.2 object or array. Likewise, "decode" will croak if given something
393     that is not a JSON object or array.
394    
395 root 1.4 Example, encode a Perl scalar as JSON value with enabled
396     "allow_nonref", resulting in an invalid JSON text:
397    
398     JSON::XS->new->allow_nonref->encode ("Hello, World!")
399     => "Hello, World!"
400    
401 root 1.25 $json = $json->allow_unknown ([$enable])
402     $enabled = $json->get_allow_unknown
403     If $enable is true (or missing), then "encode" will *not* throw an
404     exception when it encounters values it cannot represent in JSON (for
405     example, filehandles) but instead will encode a JSON "null" value.
406     Note that blessed objects are not included here and are handled
407     separately by "allow_blessed".
408    
409     If $enable is false (the default), then "encode" will throw an
410     exception when it encounters anything it cannot encode as JSON.
411    
412     This option does not affect "decode" in any way, and it is
413     recommended to leave it off unless you know your communications
414     partner.
415    
416 root 1.15 $json = $json->allow_blessed ([$enable])
417 root 1.21 $enabled = $json->get_allow_blessed
418 root 1.15 If $enable is true (or missing), then the "encode" method will not
419     barf when it encounters a blessed reference. Instead, the value of
420 root 1.20 the convert_blessed option will decide whether "null"
421 root 1.21 ("convert_blessed" disabled or no "TO_JSON" method found) or a
422 root 1.15 representation of the object ("convert_blessed" enabled and
423 root 1.21 "TO_JSON" method found) is being encoded. Has no effect on "decode".
424 root 1.15
425     If $enable is false (the default), then "encode" will throw an
426     exception when it encounters a blessed object.
427    
428     $json = $json->convert_blessed ([$enable])
429 root 1.21 $enabled = $json->get_convert_blessed
430 root 1.15 If $enable is true (or missing), then "encode", upon encountering a
431     blessed object, will check for the availability of the "TO_JSON"
432     method on the object's class. If found, it will be called in scalar
433     context and the resulting scalar will be encoded instead of the
434     object. If no "TO_JSON" method is found, the value of
435     "allow_blessed" will decide what to do.
436    
437     The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
438     returns other blessed objects, those will be handled in the same
439     way. "TO_JSON" must take care of not causing an endless recursion
440     cycle (== crash) in this case. The name of "TO_JSON" was chosen
441     because other methods called by the Perl core (== not by the user of
442     the object) are usually in upper case letters and to avoid
443 root 1.22 collisions with any "to_json" function or method.
444 root 1.15
445     This setting does not yet influence "decode" in any way, but in the
446     future, global hooks might get installed that influence "decode" and
447     are enabled by this setting.
448    
449     If $enable is false, then the "allow_blessed" setting will decide
450     what to do when a blessed object is found.
451    
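How "allow_unknown", "allow_blessed" and "convert_blessed" interact can be sketched as follows. The "Point" class is hypothetical and exists only for this illustration; JSON::XS is assumed to be installed.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical class used only for this illustration.
sub Point::new     { my ($class, %args) = @_; return bless { %args }, $class }
sub Point::TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

my $data = [ Point->new(x => 1, y => 2) ];

# Default settings: a blessed reference makes encode croak.
my $default_ok = eval { JSON::XS->new->encode($data); 1 };

# allow_blessed alone: the object is encoded as null.
my $as_null = JSON::XS->new->allow_blessed->encode($data);

# convert_blessed: TO_JSON is called and its result is encoded instead.
my $converted = JSON::XS->new->canonical->convert_blessed->encode($data);

# allow_unknown: unrepresentable values such as code references become null.
my $unknown = JSON::XS->new->allow_unknown->encode([ sub { 1 } ]);

print "default croaked\n" unless $default_ok;
print "$as_null $converted $unknown\n";
```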
452     $json = $json->filter_json_object ([$coderef->($hashref)])
453     When $coderef is specified, it will be called from "decode" each
454     time it decodes a JSON object. The only argument is a reference to
455     the newly-created hash. If the code reference returns a single
456     scalar (which need not be a reference), this value (i.e. a copy of
457     that scalar to avoid aliasing) is inserted into the deserialised
458     data structure. If it returns an empty list (NOTE: *not* "undef",
459     which is a valid scalar), the original deserialised hash will be
460     inserted. This setting can slow down decoding considerably.
461    
462     When $coderef is omitted or undefined, any existing callback will be
463     removed and "decode" will not change the deserialised hash in any
464     way.
465    
466     Example, convert all JSON objects into the integer 5:
467    
468     my $js = JSON::XS->new->filter_json_object (sub { 5 });
469     # returns [5]
470     $js->decode ('[{}]');
471     # throws an exception because allow_nonref is not enabled
472     # so a lone 5 is not allowed.
473     $js->decode ('{"a":1, "b":2}');
474    
475     $json = $json->filter_json_single_key_object ($key [=>
476     $coderef->($value)])
477     Works remotely similar to "filter_json_object", but is only called
478     for JSON objects having a single key named $key.
479    
480     This $coderef is called before the one specified via
481     "filter_json_object", if any. It gets passed the single value in the
482     JSON object. If it returns a single value, it will be inserted into
483     the data structure. If it returns nothing (not even "undef" but the
484     empty list), the callback from "filter_json_object" will be called
485     next, as if no single-key callback were specified.
486    
487     If $coderef is omitted or undefined, the corresponding callback will
488     be disabled. There can only ever be one callback for a given key.
489    
490     As this callback gets called less often than the
491     "filter_json_object" one, decoding speed will not usually suffer as
492     much. Therefore, single-key objects make excellent targets to
493     serialise Perl objects into, especially as single-key JSON objects
494 root 1.20 are as close to the type-tagged value concept as JSON gets (it's
495 root 1.15 basically an ID/VALUE tuple). Of course, JSON does not support this
496     in any way, so you need to make sure your data never looks like a
497     serialised Perl hash.
498    
499     Typical names for the single object key are "__class_whatever__", or
500     "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or even
501     things like "__class_md5sum(classname)__", to reduce the risk of
502     clashing with real hashes.
503    
504     Example, decode JSON objects of the form "{ "__widget__" => <id> }"
505     into the corresponding $WIDGET{<id>} object:
506    
507     # return whatever is in $WIDGET{5}:
508     JSON::XS
509     ->new
510     ->filter_json_single_key_object (__widget__ => sub {
511     $WIDGET{ $_[0] }
512     })
513     ->decode ('{"__widget__": 5}')
514    
515     # this can be used with a TO_JSON method in some "widget" class
516     # for serialisation to json:
517     sub WidgetBase::TO_JSON {
518     my ($self) = @_;
519    
520     unless ($self->{id}) {
521     $self->{id} = ..get..some..id..;
522     $WIDGET{$self->{id}} = $self;
523     }
524    
525     { __widget__ => $self->{id} }
526     }
527    
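A self-contained round trip along the same lines might look like this. It is a sketch under stated assumptions: the %WIDGET registry, the $next_id counter and the "WidgetBase::new" constructor are hypothetical stand-ins for whatever id scheme an application really uses.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical registry and id counter, assumed only for this sketch.
our %WIDGET;
my $next_id = 1;

sub WidgetBase::new { my ($class, %args) = @_; return bless { %args }, $class }

sub WidgetBase::TO_JSON {
    my ($self) = @_;

    # Register the widget under a fresh id the first time it is serialised.
    unless ($self->{id}) {
        $self->{id} = $next_id++;
        $WIDGET{ $self->{id} } = $self;
    }

    return { __widget__ => $self->{id} };
}

my $coder = JSON::XS->new
    ->convert_blessed
    ->filter_json_single_key_object(__widget__ => sub { $WIDGET{ $_[0] } });

my $widget = WidgetBase->new(label => "ok button");
my $text   = $coder->encode([$widget]);   # the widget becomes a single-key object
my $back   = $coder->decode($text);       # the callback looks the widget up again

print "$text\n";
print "label: $back->[0]{label}\n";
```

The widget is wrapped in an array here so that the top-level JSON text is an array regardless of the "allow_nonref" setting.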
528 root 1.4 $json = $json->shrink ([$enable])
529 root 1.21 $enabled = $json->get_shrink
530 root 1.4 Perl usually over-allocates memory a bit when allocating space for
531     strings. This flag optionally resizes strings generated by either
532     "encode" or "decode" to their minimum size possible. This can save
533 root 1.6 memory when your JSON texts are either very very long or you have
534 root 1.4 many short strings. It will also try to downgrade any strings to
535     octet-form if possible: perl stores strings internally either in an
536     encoding called UTF-X or in octet-form. The latter cannot store
537 root 1.9 everything but uses less space in general (and some buggy Perl or C
538     code might even rely on that internal representation being used).
539    
540     The actual definition of what shrink does might change in future
541     versions, but it will always try to save space at the expense of
542     time.
543 root 1.4
544     If $enable is true (or missing), the string returned by "encode"
545     will be shrunk-to-fit, while all strings generated by "decode" will
546     also be shrunk-to-fit.
547    
548     If $enable is false, then the normal perl allocation algorithms are
549     used. If you work with your data, then this is likely to be faster.
550    
551     In the future, this setting might control other things, such as
552     converting strings that look like integers or floats into integers
553     or floats internally (there is no difference on the Perl level),
554     saving space.
555    
556 root 1.8 $json = $json->max_depth ([$maximum_nesting_depth])
557 root 1.21 $max_depth = $json->get_max_depth
558 root 1.10 Sets the maximum nesting level (default 512) accepted while encoding
559 root 1.25 or decoding. If a higher nesting level is detected in JSON text or a
560     Perl data structure, then the encoder and decoder will stop and
561     croak at that point.
562 root 1.8
563     Nesting level is defined by number of hash- or arrayrefs that the
564     encoder needs to traverse to reach a given point or the number of
565     "{" or "[" characters without their matching closing parenthesis
566     crossed to reach a given character in a string.
567    
568     Setting the maximum depth to one disallows any nesting, which
569     ensures that the object is only a single hash/object or array.
570    
571 root 1.25 If no argument is given, the highest possible setting will be used,
572     which is rarely useful.
573    
574     Note that nesting is implemented by recursion in C. The default
575     value has been chosen to be as large as typical operating systems
576     allow without crashing.
577 root 1.15
578     See SECURITY CONSIDERATIONS, below, for more info on why this is
579     useful.
580    
581     $json = $json->max_size ([$maximum_string_size])
582 root 1.21 $max_size = $json->get_max_size
583 root 1.15 Set the maximum length a JSON text may have (in bytes) where
584     decoding is being attempted. The default is 0, meaning no limit.
585 root 1.25 When "decode" is called on a string that is longer than this many
586     bytes, it will not attempt to decode the string but throw an
587 root 1.15 exception. This setting has no effect on "encode" (yet).
588    
589 root 1.25 If no argument is given, the limit check will be deactivated (same
590     as when 0 is specified).
591 root 1.8
592     See SECURITY CONSIDERATIONS, below, for more info on why this is
593     useful.
594    
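Both limits can be exercised with a short sketch. The limit values below are arbitrary and chosen only to keep the example small; JSON::XS is assumed to be installed.

```perl
use strict;
use warnings;
use JSON::XS;

# Arbitrary small limits, for illustration only.
my $limited = JSON::XS->new->max_depth(2)->max_size(32);

# Two levels of nesting and fewer than 32 bytes: accepted.
my $ok = eval { $limited->decode('[[1,2],[3]]') };

# Three levels of nesting exceed max_depth, so decode croaks.
my $too_deep    = eval { $limited->decode('[[[1]]]'); 1 };
my $depth_error = $@;

# A text longer than max_size is rejected as well.
my $too_long   = eval { $limited->decode('[' . join(',', 1 .. 50) . ']'); 1 };
my $size_error = $@;

print "accepted\n" if ref $ok eq 'ARRAY';
print "depth limit: $depth_error" unless $too_deep;
print "size limit: $size_error"   unless $too_long;
```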
595 root 1.6 $json_text = $json->encode ($perl_scalar)
596 root 1.2 Converts the given Perl data structure (a simple scalar or a
597     reference to a hash or array) to its JSON representation. Simple
598     scalars will be converted into JSON string or number sequences,
599     while references to arrays become JSON arrays and references to
600     hashes become JSON objects. Undefined Perl values (e.g. "undef")
601     become JSON "null" values. Neither "true" nor "false" values will be
602     generated.
603    
604 root 1.6 $perl_scalar = $json->decode ($json_text)
605     The opposite of "encode": expects a JSON text and tries to parse it,
606     returning the resulting simple scalar or reference. Croaks on error.
607 root 1.2
608     JSON numbers and strings become simple Perl scalars. JSON arrays
609     become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
610     becomes 1, "false" becomes 0 and "null" becomes "undef".
611    
612 root 1.11 ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
613     This works like the "decode" method, but instead of raising an
614     exception when there is trailing garbage after the first JSON
615     object, it will silently stop parsing there and return the number of
616     characters consumed so far.
617    
618     This is useful if your JSON texts are not delimited by an outer
619     protocol (which is not the brightest thing to do in the first place)
620     and you need to know where the JSON text ends.
621    
622     JSON::XS->new->decode_prefix ("[1] the tail")
623     => ([1], 3)
624    
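One way to use the returned character count is to strip each decoded text off the front of a buffer. A minimal sketch, assuming JSON::XS is installed; the buffer contents are made up:

```perl
use strict;
use warnings;
use JSON::XS;

# Two JSON texts back-to-back, followed by non-JSON text.
my $buffer = '{"id":1}[2,3] and then some non-JSON text';
my $coder  = JSON::XS->new;

my ($first, $used_first) = $coder->decode_prefix($buffer);
substr $buffer, 0, $used_first, "";    # drop the consumed characters

my ($second, $used_second) = $coder->decode_prefix($buffer);
substr $buffer, 0, $used_second, "";

print "first id: $first->{id}, second: @$second\n";
print "rest: '$buffer'\n";
```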
625 root 1.24 INCREMENTAL PARSING
626     In some cases, there is the need for incremental parsing of JSON texts.
627     While this module always has to keep both JSON text and resulting Perl
628     data structure in memory at one time, it does allow you to parse a JSON
629     stream incrementally. It does so by accumulating text until it has a
630     full JSON object, which it then can decode. This process is similar to
631     using "decode_prefix" to see if a full JSON object is available, but is
632 root 1.27 much more efficient (and can be implemented with a minimum of method
633     calls).
634 root 1.24
635 root 1.27 JSON::XS will only attempt to parse the JSON text once it is sure it has
636     enough text to get a decisive result, using a very simple but truly
637     incremental parser. This means that it sometimes won't stop as early as
638     the full parser, for example, it doesn't detect parenthesis mismatches.
639     The only thing it guarantees is that it starts decoding as soon as a
640     syntactically valid JSON text has been seen. This means you need to set
641     resource limits (e.g. "max_size") to ensure the parser will stop parsing
642     in the presence of syntax errors.
643    
644     The following methods implement this incremental parser.
645 root 1.24
646     [void, scalar or list context] = $json->incr_parse ([$string])
647     This is the central parsing function. It can both append new text
648     and extract objects from the stream accumulated so far (both of
649     these functions are optional).
650    
651     If $string is given, then this string is appended to the already
652     existing JSON fragment stored in the $json object.
653    
654     After that, if the function is called in void context, it will
655     simply return without doing anything further. This can be used to
656     add more text in as many chunks as you want.
657    
658     If the method is called in scalar context, then it will try to
659     extract exactly *one* JSON object. If that is successful, it will
660     return this object, otherwise it will return "undef". If there is a
661     parse error, this method will croak just as "decode" would do (one
662     can then use "incr_skip" to skip the erroneous part). This is the
663     most common way of using the method.
664    
665     And finally, in list context, it will try to extract as many objects
666     from the stream as it can find and return them, or the empty list
667     otherwise. For this to work, there must be no separators between the
668     JSON objects or arrays, instead they must be concatenated
669     back-to-back. If an error occurs, an exception will be raised as in
670     the scalar context case. Note that in this case, any
671     previously-parsed JSON texts will be lost.
672    
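The three calling contexts can be sketched as follows. This is only an illustration with made-up input, not a recommended protocol:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# void context: only append text, nothing is parsed yet
$json->incr_parse ('[1,2');
$json->incr_parse (',3]{"a":');

# scalar context: extract exactly one JSON text (the array)
my $first = $json->incr_parse;

# the following object is still incomplete, so this yields undef
my $second = $json->incr_parse;

# append the rest of the object plus another array, back-to-back
$json->incr_parse ('1}[4]');

# list context: extract everything that is complete by now
my @rest = $json->incr_parse;

print scalar @rest, " more texts extracted\n";
```
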
673     $lvalue_string = $json->incr_text
674     This method returns the currently stored JSON fragment as an lvalue,
675     that is, you can manipulate it. This *only* works when a preceding
676     call to "incr_parse" in *scalar context* successfully returned an
677     object. Under all other circumstances you must not call this
678     function (I mean it: although in simple tests it might actually
679     work, it *will* fail under real world conditions). As a special
680     exception, you can also call this method before having parsed
681     anything.
682    
683     This function is useful in two cases: a) finding the trailing text
684     after a JSON object or b) parsing multiple JSON objects separated by
685     non-JSON text (such as commas).
686    
687     $json->incr_skip
688     This will reset the state of the incremental parser and will remove
689     the parsed text from the input buffer. This is useful after
690     "incr_parse" died, in which case the input buffer and incremental
691     parser state are left unchanged, to skip the text parsed so far and
692     to reset the parse state.
693    
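A sketch of how error recovery with "incr_skip" might look. The input is made up, and the $errors counter only exists for illustration:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# the first text is malformed ("}" closes a "["), the second is fine
$json->incr_parse ('[1,2}[3]');

my @good;
my $errors = 0;

for (;;) {
    # scalar context, so at most one JSON text is extracted per call
    my $obj = eval { $json->incr_parse };

    if ($@) {
        # throw away the text that failed to parse and try again
        $errors++;
        $json->incr_skip;
        next;
    }

    # undef means there is no complete JSON text left in the buffer
    last unless defined $obj;

    push @good, $obj;
}

print scalar @good, " good text(s), $errors error(s)\n";
```
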
694 root 1.26 $json->incr_reset
695     This completely resets the incremental parser, that is, after this
696     call, it will be as if the parser had never parsed anything.
697    
698     This is useful if you want to repeatedly parse JSON objects and want
699     to ignore any trailing data, which means you have to reset the
700     parser after each successful decode.
701    
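For illustration, here is one way the reset-after-each-decode pattern could look. The message strings are invented, and each one is assumed to start with a complete JSON object or array:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# each message starts with a JSON text, followed by data we ignore
my @messages = ('{"id":1} trailing junk', '[2,3] more junk');

my @decoded;
for my $msg (@messages) {
    # scalar context: extract the one JSON text at the start
    my $obj = $json->incr_parse ($msg);
    push @decoded, $obj;

    # forget the trailing data before the next message is fed
    $json->incr_reset;
}

print scalar @decoded, " messages decoded\n";
```
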
702 root 1.24 LIMITATIONS
703     All options that affect decoding are supported, except "allow_nonref".
704     The reason for this is that it cannot be made to work sensibly: JSON
705     objects and arrays are self-delimited, i.e. you can concatenate them
706     back to back and still decode them perfectly. This does not hold true
707     for JSON numbers, however.
708    
709     For example, is the string 1 a single JSON number, or is it simply the
710     start of 12? Or is 12 a single JSON number, or the concatenation of 1
711     and 2? In neither case can you tell, and this is why JSON::XS takes the
712     conservative route and disallows this case.
713    
714     EXAMPLES
715     Some examples will make all this clearer. First, a simple example that
716     works similarly to "decode_prefix": We want to decode the JSON object at
717     the start of a string and identify the portion after the JSON object:
718    
719     my $text = "[1,2,3] hello";
720    
721     my $json = new JSON::XS;
722    
723     my $obj = $json->incr_parse ($text)
724     or die "expected JSON object or array at beginning of string";
725    
726     my $tail = $json->incr_text;
727     # $tail now contains " hello"
728    
729     Easy, isn't it?
730    
731     Now for a more complicated example: Imagine a hypothetical protocol
732     where you read some requests from a TCP stream, and each request is a
733     JSON array, without any separation between them (in fact, it is often
734     useful to use newlines as "separators", as these get interpreted as
735     whitespace at the start of the JSON text, which makes it possible to
736     test said protocol with "telnet"...).
737    
738     Here is how you'd do it (it is trivial to write this in an event-based
739     manner):
740    
741     my $json = new JSON::XS;
742    
743     # read some data from the socket
744     while (sysread $socket, my $buf, 4096) {
745    
746     # split and decode as many requests as possible
747     for my $request ($json->incr_parse ($buf)) {
748     # act on the $request
749     }
750     }
751    
752     Another complicated example: Assume you have a string with JSON objects
753     or arrays, all separated by (optional) comma characters (e.g. "[1],[2],
754     [3]"). To parse them, we have to skip the commas between the JSON texts,
755     and here is where the lvalue-ness of "incr_text" comes in useful:
756    
757     my $text = "[1],[2], [3]";
758     my $json = new JSON::XS;
759    
760     # void context, so no parsing done
761     $json->incr_parse ($text);
762    
763     # now extract as many objects as possible. note the
764     # use of scalar context so incr_text can be called.
765     while (my $obj = $json->incr_parse) {
766     # do something with $obj
767    
768     # now skip the optional comma
769     $json->incr_text =~ s/^ \s* , //x;
770     }
771    
772     Now let's go for a very complex example: Assume that you have a gigantic
773     JSON array-of-objects, many gigabytes in size, and you want to parse it,
774     but you cannot load it into memory fully (this has actually happened in
775     the real world :).
776    
777     Well, you lost, you have to implement your own JSON parser. But JSON::XS
778     can still help you: You implement a (very simple) array parser and let
779     JSON decode the array elements, which are all full JSON objects on their
780     own (this wouldn't work if the array elements could be JSON numbers, for
781     example):
782    
783     my $json = new JSON::XS;
784    
785     # open the monster
786     open my $fh, "<bigfile.json"
787     or die "bigfile: $!";
788    
789     # first parse the initial "["
790     for (;;) {
791     sysread $fh, my $buf, 65536
792     or die "read error: $!";
793     $json->incr_parse ($buf); # void context, so no parsing
794    
795     # Exit the loop once we found and removed(!) the initial "[".
796     # In essence, we are (ab-)using the $json object as a simple scalar
797     # we append data to.
798     last if $json->incr_text =~ s/^ \s* \[ //x;
799     }
800    
801     # now we have skipped the initial "[", so continue
802     # parsing all the elements.
803     for (;;) {
804     # in this loop we read data until we got a single JSON object
805     for (;;) {
806     if (my $obj = $json->incr_parse) {
807     # do something with $obj
808     last;
809     }
810    
811     # add more data
812     sysread $fh, my $buf, 65536
813     or die "read error: $!";
814     $json->incr_parse ($buf); # void context, so no parsing
815     }
816    
817     # in this loop we read data until we either found and parsed the
818     # separating "," between elements, or the final "]"
819     for (;;) {
820     # first skip whitespace
821     $json->incr_text =~ s/^\s*//;
822    
823     # if we find "]", we are done
824     if ($json->incr_text =~ s/^\]//) {
825     print "finished.\n";
826     exit;
827     }
828    
829     # if we find ",", we can continue with the next element
830     if ($json->incr_text =~ s/^,//) {
831     last;
832     }
833    
834     # if we find anything else, we have a parse error!
835     if (length $json->incr_text) {
836     die "parse error near ", $json->incr_text;
837     }
838    
839     # else add more data
840     sysread $fh, my $buf, 65536
841     or die "read error: $!";
842     $json->incr_parse ($buf); # void context, so no parsing
843     }
        } # closes the outer for (;;) loop over all elements
844    
845     This is a complex example, but most of the complexity comes from the
846     fact that we are trying to be correct (bear with me if I am wrong, I
847     never ran the above example :).
848    
849 root 1.4 MAPPING
850     This section describes how JSON::XS maps Perl values to JSON values and
851     vice versa. These mappings are designed to "do the right thing" in most
852     circumstances automatically, preserving round-tripping characteristics
853     (what you put in comes out as something equivalent).
854    
855     For the more enlightened: note that in the following descriptions,
856 root 1.20 lowercase *perl* refers to the Perl interpreter, while uppercase *Perl*
857 root 1.4 refers to the abstract Perl language itself.
858    
859     JSON -> PERL
860     object
861     A JSON object becomes a reference to a hash in Perl. No ordering of
862 root 1.20 object keys is preserved (JSON does not preserve object key ordering
863     itself).
864 root 1.4
865     array
866     A JSON array becomes a reference to an array in Perl.
867    
868     string
869     A JSON string becomes a string scalar in Perl - Unicode codepoints
870     in JSON are represented by the same codepoints in the Perl string,
871     so no manual decoding is necessary.
872    
873     number
874 root 1.16 A JSON number becomes either an integer, numeric (floating point) or
875     string scalar in perl, depending on its range and any fractional
876     parts. On the Perl level, there is no difference between those as
877     Perl handles all the conversion details, but an integer may take
878     slightly less memory and might represent more values exactly than
879 root 1.23 floating point numbers.
880 root 1.16
881     If the number consists of digits only, JSON::XS will try to
882     represent it as an integer value. If that fails, it will try to
883     represent it as a numeric (floating point) value if that is possible
884     without loss of precision. Otherwise it will preserve the number as
885 root 1.23 a string value (in which case you lose roundtripping ability, as the
886     JSON number will be re-encoded to a JSON string).
887 root 1.16
888     Numbers containing a fractional or exponential part will always be
889     represented as numeric (floating point) values, possibly at a loss
890 root 1.23 of precision (in which case you might lose perfect roundtripping
891     ability, but the JSON number will still be re-encoded as a JSON
892     number).
893 root 1.4
894     true, false
895 root 1.14 These JSON atoms become "JSON::XS::true" and "JSON::XS::false",
896     respectively. They are overloaded to act almost exactly like the
897 root 1.20 numbers 1 and 0. You can check whether a scalar is a JSON boolean by
898 root 1.14 using the "JSON::XS::is_bool" function.
899 root 1.4
900     null
901     A JSON null atom becomes "undef" in Perl.
902    
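The JSON to Perl mappings above can be sketched with a small made-up document:

```perl
use strict;
use warnings;
use JSON::XS;

my $data = decode_json '{"list":[1,2.5,"three"],"ok":true,"none":null}';

print ref ($data), "\n";           # object: a hash reference
print ref ($data->{list}), "\n";   # array: an array reference

# numbers come back as ordinary numeric scalars
my $sum = $data->{list}[0] + $data->{list}[1];

# true/false are special values that act like 1 and 0
my $is_bool = JSON::XS::is_bool ($data->{ok});
my $answer  = $data->{ok} ? "yes" : "no";

# null becomes undef
my $has_value = defined $data->{none};
```
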
903     PERL -> JSON
904     The mapping from Perl to JSON is slightly more difficult, as Perl is a
905     truly typeless language, so we can only guess which JSON type is meant
906     by a Perl value.
907    
908     hash references
909     Perl hash references become JSON objects. As there is no inherent
910 root 1.9 ordering in hash keys (or JSON objects), they will usually be
911     encoded in a pseudo-random order that can change between runs of the
912     same program but stays generally the same within a single run of a
913     program. JSON::XS can optionally sort the hash keys (determined by
914     the *canonical* flag), so the same datastructure will serialise to
915     the same JSON text (given same settings and version of JSON::XS),
916     but this incurs a runtime overhead and is only rarely useful, e.g.
917     when you want to compare some JSON text against another for
918     equality.
919 root 1.4
920     array references
921     Perl array references become JSON arrays.
922    
923 root 1.9 other references
924     Other unblessed references are generally not allowed and will cause
925     an exception to be thrown, except for references to the integers 0
926     and 1, which get turned into "false" and "true" atoms in JSON. You
927     can also use "JSON::XS::false" and "JSON::XS::true" to improve
928     readability.
929    
930 root 1.26 encode_json [\0, JSON::XS::true] # yields [false,true]
931 root 1.9
932 root 1.14 JSON::XS::true, JSON::XS::false
933     These special values become JSON true and JSON false values,
934 root 1.19 respectively. You can also use "\1" and "\0" directly if you want.
935 root 1.14
936 root 1.4 blessed objects
937 root 1.23 Blessed objects are not directly representable in JSON. See the
938     "allow_blessed" and "convert_blessed" methods for various options on
939     how to deal with this: basically, you can choose between throwing an
940     exception, encoding the reference as if it weren't blessed, or
941     providing your own serialiser method.
942 root 1.4
943     simple scalars
944     Simple Perl scalars (any scalar that is not a reference) are the
945     most difficult objects to encode: JSON::XS will encode undefined
946 root 1.23 scalars as JSON "null" values, scalars that have last been used in a
947     string context before encoding as JSON strings, and anything else as
948 root 1.4 a number value:
949    
950     # dump as number
951 root 1.22 encode_json [2] # yields [2]
952     encode_json [-3.0e17] # yields [-3e+17]
953     my $value = 5; encode_json [$value] # yields [5]
954 root 1.4
955     # used as string, so dump as string
956     print $value;
957 root 1.22 encode_json [$value] # yields ["5"]
958 root 1.4
959     # undef becomes null
960 root 1.22 encode_json [undef] # yields [null]
961 root 1.4
962 root 1.20 You can force the type to be a JSON string by stringifying it:
963 root 1.4
964     my $x = 3.1; # some variable containing a number
965     "$x"; # stringified
966     $x .= ""; # another, more awkward way to stringify
967     print $x; # perl does it for you, too, quite often
968    
969 root 1.20 You can force the type to be a JSON number by numifying it:
970 root 1.4
971     my $x = "3"; # some variable containing a string
972     $x += 0; # numify it, ensuring it will be dumped as a number
973 root 1.20 $x *= 1; # same thing, the choice is yours.
974 root 1.4
975 root 1.20 You cannot currently force the type in other, less obscure, ways.
976 root 1.23 Tell me if you need this capability (but don't forget to explain why
977 root 1.24 it's needed :).
978 root 1.23
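Putting the Perl to JSON rules together in one sketch. The hash keys and values are invented for the example, and "canonical" is only enabled so that the output order is predictable:

```perl
use strict;
use warnings;
use JSON::XS;

# canonical sorts the hash keys, so the output is reproducible
my $coder = JSON::XS->new->canonical;

my $id    = "42";   # arrived as a string, e.g. from a form field
my $count = 7;      # a number

my $json_text = $coder->encode ({
    id      => $id + 0,     # numified: dumped as a JSON number
    label   => "$count",    # stringified: dumped as a JSON string
    active  => \1,          # JSON true
    deleted => \0,          # JSON false
    missing => undef,       # JSON null
});

print "$json_text\n";
# prints {"active":true,"deleted":false,"id":42,"label":"7","missing":null}
```
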
979     ENCODING/CODESET FLAG NOTES
980     The interested reader might have seen a number of flags that signify
981     encodings or codesets - "utf8", "latin1" and "ascii". There seems to be
982     some confusion on what these do, so here is a short comparison:
983    
984 root 1.24 "utf8" controls whether the JSON text created by "encode" (and expected
985 root 1.23 by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
986 root 1.24 control whether "encode" escapes character values outside their
987 root 1.23 respective codeset range. None of these flags conflict with each
988     other, although some combinations make less sense than others.
989    
990     Care has been taken to make all flags symmetrical with respect to
991     "encode" and "decode", that is, texts encoded with any combination of
992     these flag values will be correctly decoded when the same flags are used
993     - in general, if you use different flag settings while encoding vs. when
994     decoding you likely have a bug somewhere.
995    
996     Below comes a verbose discussion of these flags. Note that a "codeset"
997     is simply an abstract set of character-codepoint pairs, while an
998     encoding takes those codepoint numbers and *encodes* them, in our case
999     into octets. Unicode is (among other things) a codeset, UTF-8 is an
1000     encoding, and ISO-8859-1 (= latin 1) and ASCII are both codesets *and*
1001     encodings at the same time, which can be confusing.
1002    
1003     "utf8" flag disabled
1004     When "utf8" is disabled (the default), then "encode"/"decode"
1005     generate and expect Unicode strings, that is, characters with high
1006     ordinal Unicode values (> 255) will be encoded as such characters,
1007     and likewise such characters are decoded as-is, no changes to them
1008     will be done, except "(re-)interpreting" them as Unicode codepoints
1009     or Unicode characters, respectively (to Perl, these are the same
1010     thing in strings unless you do funny/weird/dumb stuff).
1011    
1012     This is useful when you want to do the encoding yourself (e.g. when
1013     you want to have UTF-16 encoded JSON texts) or when some other layer
1014     does the encoding for you (for example, when printing to a terminal
1015     using a filehandle that transparently encodes to UTF-8 you certainly
1016     do NOT want to UTF-8 encode your data first and have Perl encode it
1017     another time).
1018    
1019     "utf8" flag enabled
1020     If the "utf8"-flag is enabled, "encode"/"decode" will encode all
1021     characters using the corresponding UTF-8 multi-byte sequence, and
1022     will expect your input strings to be encoded as UTF-8, that is, no
1023     "character" of the input string must have any value > 255, as UTF-8
1024     does not allow that.
1025    
1026     The "utf8" flag therefore switches between two modes: disabled means
1027     you will get a Unicode string in Perl, enabled means you get a
1028     UTF-8 encoded octet/binary string in Perl.
1029    
1030     "latin1" or "ascii" flags enabled
1031     With "latin1" (or "ascii") enabled, "encode" will escape characters
1032     with ordinal values > 255 (> 127 with "ascii") and encode the
1033     remaining characters as specified by the "utf8" flag.
1034    
1035     If "utf8" is disabled, then the result is also correctly encoded in
1036     those character sets (as both are proper subsets of Unicode, meaning
1037     that a Unicode string with all character values < 256 is the same
1038     thing as an ISO-8859-1 string, and a Unicode string with all
1039     character values < 128 is the same thing as an ASCII string in
1040     Perl).
1041    
1042     If "utf8" is enabled, you still get a correct UTF-8-encoded string,
1043     regardless of these flags, just some more characters will be escaped
1044     using "\uXXXX" than before.
1045    
1046     Note that ISO-8859-1-*encoded* strings are not compatible with UTF-8
1047     encoding, while ASCII-encoded strings are. That is because the
1048     ISO-8859-1 encoding is NOT a subset of UTF-8 (despite the ISO-8859-1
1049     *codeset* being a subset of Unicode), while ASCII is.
1050    
1051     Surprisingly, "decode" will ignore these flags and so treat all
1052     input values as governed by the "utf8" flag. If it is disabled, this
1053     allows you to decode ISO-8859-1- and ASCII-encoded strings, as both
1054     are strict subsets of Unicode. If it is enabled, you can correctly
1055     decode UTF-8 encoded strings.
1056    
1057     So neither "latin1" nor "ascii" is incompatible with the "utf8"
1058     flag - they only govern when the JSON output engine escapes a
1059     character or not.
1060    
1061     The main use for "latin1" is to relatively efficiently store binary
1062     data as JSON, at the expense of breaking compatibility with most
1063     JSON decoders.
1064    
1065     The main use for "ascii" is to force the output to not contain
1066     characters with values > 127, which means you can interpret the
1067     resulting string as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about
1068     any character set and 8-bit-encoding, and still get the same data
1069     structure back. This is useful when your channel for JSON transfer
1070     is not 8-bit clean or the encoding might be mangled in between (e.g.
1071     in mail), and works because ASCII is a proper subset of most 8-bit
1072     and multibyte encodings in use in the world.
1073 root 1.4
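To make the flag discussion above more concrete, here is a sketch that encodes the same two-character string under each flag and compares the lengths of the results. The test string is an arbitrary choice:

```perl
use strict;
use warnings;
use JSON::XS;

# U+00E9 (e with acute) fits into latin1, U+20AC (euro sign) does not
my $data = ["\x{e9}\x{20ac}"];

# utf8 disabled: a Perl Unicode string, one character per codepoint
my $unicode = JSON::XS->new->encode ($data);

# utf8 enabled: an octet string with multi-byte UTF-8 sequences
my $octets  = JSON::XS->new->utf8->encode ($data);

# ascii: everything above 127 is escaped as \uXXXX
my $ascii   = JSON::XS->new->ascii->encode ($data);

# latin1: only characters above 255 are escaped
my $latin1  = JSON::XS->new->latin1->encode ($data);

printf "%-8s %2d\n", $_->[0], length ($_->[1])
    for [unicode => $unicode], [utf8   => $octets],
        [ascii   => $ascii],   [latin1 => $latin1];

# using the same flags on both sides round-trips the data
my $back = JSON::XS->new->utf8->decode ($octets);
```

Only lengths are printed here, so no wide characters are sent to the terminal.
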
1074 root 1.13 JSON and YAML
1075 root 1.23 You often hear that JSON is a subset of YAML. This is, however, a mass
1076     hysteria(*) and very far from the truth (as of the time of this
1077     writing), so let me state it clearly: *in general, there is no way to
1078     configure JSON::XS to output a data structure as valid YAML* that works
1079     in all cases.
1080 root 1.13
1081     If you really must use JSON::XS to generate YAML, you should use this
1082     algorithm (subject to change in future versions):
1083    
1084     my $to_yaml = JSON::XS->new->utf8->space_after (1);
1085     my $yaml = $to_yaml->encode ($ref) . "\n";
1086    
1087 root 1.23 This will *usually* generate JSON texts that also parse as valid YAML.
1088 root 1.13 Please note that YAML has hardcoded limits on (simple) object key
1089 root 1.23 lengths that JSON doesn't have and also has different and incompatible
1090     unicode handling, so you should make sure that your hash keys are
1091     noticeably shorter than the 1024 "stream characters" YAML allows and
1092     that you do not have characters with codepoint values outside the
1093     Unicode BMP (Basic Multilingual Plane). YAML also does not allow "\/"
1094     sequences in strings (which JSON::XS does not *currently* generate, but
1095     other JSON generators might).
1096    
1097     There might be other incompatibilities that I am not aware of (or the
1098     YAML specification has been changed yet again - this happens quite often).
1099     In general you should not try to generate YAML with a JSON generator or
1100     vice versa, or try to parse JSON with a YAML parser or vice versa:
1101     chances are high that you will run into severe interoperability problems
1102     when you least expect it.
1103 root 1.13
1104 root 1.23 (*) I have been pressured multiple times by Brian Ingerson (one of the
1105     authors of the YAML specification) to remove this paragraph, despite
1106     him acknowledging that the actual incompatibilities exist. As I was
1107     personally bitten by this "JSON is YAML" lie, I refused and said I
1108     will continue to educate people about these issues, so others do not
1109     run into the same problem again and again. After this, Brian called
1110     me a (quote)*complete and worthless idiot*(unquote).
1111    
1112     In my opinion, instead of pressuring and insulting people who
1113     actually clarify issues with YAML and the wrong statements of some
1114     of its proponents, I would kindly suggest reading the JSON spec
1115     (which is not that difficult or long) and finally making YAML
1116     compatible with it, and educating users about the changes, instead of
1117     spreading lies about the real compatibility for many *years* and
1118     trying to silence people who point out that it isn't true.
1119 root 1.13
1120 root 1.2 SPEED
1121     It seems that JSON::XS is surprisingly fast, as shown in the following
1122     tables. They have been generated with the help of the "eg/bench" program
1123     in the JSON::XS distribution, to make it easy to compare on your own
1124     system.
1125    
1126 root 1.12 First comes a comparison between various modules using a very short
1127 root 1.23 single-line JSON string (also available at
1128     <http://dist.schmorp.de/misc/json/short.json>).
1129 root 1.7
1130 root 1.25 {"method": "handleMessage", "params": ["user1",
1131     "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1132     true, false]}
1133 root 1.7
1134     It shows the number of encodes/decodes per second (JSON::XS uses the
1135     functional interface, while JSON::XS/2 uses the OO interface with
1136 root 1.13 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink).
1137     Higher is better:
1138 root 1.2
1139     module | encode | decode |
1140     -----------|------------|------------|
1141 root 1.21 JSON 1.x | 4990.842 | 4088.813 |
1142 root 1.15 JSON::DWIW | 51653.990 | 71575.154 |
1143     JSON::PC | 65948.176 | 74631.744 |
1144     JSON::PP | 8931.652 | 3817.168 |
1145     JSON::Syck | 24877.248 | 27776.848 |
1146     JSON::XS | 388361.481 | 227951.304 |
1147     JSON::XS/2 | 227951.304 | 218453.333 |
1148     JSON::XS/3 | 338250.323 | 218453.333 |
1149     Storable | 16500.016 | 135300.129 |
1150 root 1.2 -----------+------------+------------+
1151    
1152 root 1.12 That is, JSON::XS is about five times faster than JSON::DWIW on
1153 root 1.20 encoding, about three times faster on decoding, and over forty times
1154 root 1.12 faster than JSON, even with pretty-printing and key sorting. It also
1155     compares favourably to Storable for small amounts of data.
1156 root 1.2
1157 root 1.5 Using a longer test string (roughly 18KB, generated from Yahoo! Local's
1158 root 1.23 search API, <http://dist.schmorp.de/misc/json/long.json>):
1159 root 1.2
1160     module | encode | decode |
1161     -----------|------------|------------|
1162 root 1.21 JSON 1.x | 55.260 | 34.971 |
1163 root 1.15 JSON::DWIW | 825.228 | 1082.513 |
1164     JSON::PC | 3571.444 | 2394.829 |
1165     JSON::PP | 210.987 | 32.574 |
1166     JSON::Syck | 552.551 | 787.544 |
1167     JSON::XS | 5780.463 | 4854.519 |
1168     JSON::XS/2 | 3869.998 | 4798.975 |
1169     JSON::XS/3 | 5862.880 | 4798.975 |
1170     Storable | 4445.002 | 5235.027 |
1171 root 1.2 -----------+------------+------------+
1172    
1173 root 1.13 Again, JSON::XS leads by far (except for Storable which unsurprisingly
1174     decodes faster).
1175 root 1.2
1176 root 1.20 On large strings containing lots of high Unicode characters, some
1177 root 1.7 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
1178 root 1.20 result will be broken due to missing (or wrong) Unicode handling. Others
1179 root 1.7 refuse to decode or encode properly, so it was impossible to prepare a
1180     fair comparison table for that case.
1181 root 1.5
1182 root 1.8 SECURITY CONSIDERATIONS
1183     When you are using JSON in a protocol, talking to untrusted potentially
1184     hostile creatures requires relatively few measures.
1185    
1186     First of all, your JSON decoder should be secure, that is, should not
1187     have any buffer overflows. Obviously, this module should ensure that and
1188     I am trying hard to make that true, but you never know.
1189    
1190     Second, you need to avoid resource-starving attacks. That means you
1191     should limit the size of JSON texts you accept, or make sure that when
1192 root 1.20 your resources run out, that's just fine (e.g. by using a separate
1193 root 1.8 process that can crash safely). The size of a JSON text in octets or
1194     characters is usually a good indication of the size of the resources
1195 root 1.15 required to decode it into a Perl structure. While JSON::XS can check
1196     the size of the JSON text, it might be too late when you already have it
1197     in memory, so you might want to check the size before you accept the
1198     string.
1199 root 1.8
1200     Third, JSON::XS recurses using the C stack when decoding objects and
1201     arrays. The C stack is a limited resource: for instance, on my amd64
1202     machine with 8MB of stack size I can decode around 180k nested arrays
1203 root 1.10 but only 14k nested JSON objects (due to perl itself recursing deeply on
1204     croak to free the temporary). If that is exceeded, the program crashes.
1205 root 1.23 To be conservative, the default nesting limit is set to 512. If your
1206 root 1.8 process has a smaller stack, you should adjust this setting accordingly
1207     with the "max_depth" method.
1208    
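A minimal sketch of these measures in code. The limits and the "decode_untrusted" helper are made up for this example and should be tuned for your own application:

```perl
use strict;
use warnings;
use JSON::XS;

# hypothetical limits, not recommendations
my $MAX_BYTES = 64 * 1024;
my $MAX_DEPTH = 32;

my $coder = JSON::XS->new->utf8
                    ->max_size ($MAX_BYTES)
                    ->max_depth ($MAX_DEPTH);

# decode_untrusted is a helper invented for this example
sub decode_untrusted {
    my ($octets) = @_;

    # cheap size check before any parsing happens
    return undef if length ($octets) > $MAX_BYTES;

    # decode croaks on syntax errors and on exceeded limits; the
    # error message stays here, as it may contain parts of the input
    my $data = eval { $coder->decode ($octets) };

    return $data;
}
```
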
1209 root 1.23 Something else could bomb you, too, that I forgot to think of. In that
1210     case, you get to keep the pieces. I am always open to hints, though...
1211    
1212     Also keep in mind that JSON::XS might leak contents of your Perl data
1213     structures in its error messages, so when you serialise sensitive
1214     information you might want to make sure that exceptions thrown by
1215     JSON::XS will not end up in front of untrusted eyes.
1216 root 1.2
1217 root 1.20 If you are using JSON::XS to return packets for consumption by JavaScript
1218 root 1.14 scripts in a browser you should have a look at
1219 root 1.20 <http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
1220 root 1.14 you are vulnerable to some common attack vectors (which really are
1221     browser design bugs, but it is still you who will have to deal with it,
1222 root 1.23 as major browser developers care only for features, not about getting
1223 root 1.14 security right).
1224    
1225 root 1.19 THREADS
1226 root 1.20 This module is *not* guaranteed to be thread safe and there are no plans
1227 root 1.19 to change this until Perl gets thread support (as opposed to the
1228     horribly slow so-called "threads" which are simply slow and bloated
1229 root 1.24 process simulations - use fork, it's *much* faster, cheaper, better).
1230 root 1.19
1231 root 1.20 (It might actually work, but you have been warned).
1232 root 1.19
1233 root 1.2 BUGS
1234     While the goal of this module is to be correct, that unfortunately does
1235 root 1.26 not mean it's bug-free, only that I think its design is bug-free. If you
1236     keep reporting bugs they will be fixed swiftly, though.
1237 root 1.1
1238 root 1.19 Please refrain from using rt.cpan.org or any other bug reporting
1239     service. I put the contact address into my modules for a reason.
1240    
1241 root 1.24 SEE ALSO
1242     The json_xs command line utility for quick experiments.
1243    
1244 root 1.1 AUTHOR
1245     Marc Lehmann <schmorp@schmorp.de>
1246     http://home.schmorp.de/
1247