/cvs/JSON-XS/XS.pm
Revision: 1.73
Committed: Sun Nov 25 19:36:54 2007 UTC (16 years, 5 months ago) by root
Branch: MAIN
Changes since 1.72: +1 -1 lines
Log Message:
prelim docs

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     JSON::XS - JSON serialising/deserialising, done correctly and fast
4    
5 root 1.62 JSON::XS - a correct and fast JSON serialiser/deserialiser (Japanese)
6     (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
7    
8 root 1.1 =head1 SYNOPSIS
9    
10     use JSON::XS;
11    
12 root 1.22 # exported functions, they croak on error
13     # and expect/generate UTF-8
14 root 1.12
15     $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
16     $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
17    
18 root 1.22 # OO-interface
19 root 1.12
20     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
21     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22     $perl_scalar = $coder->decode ($unicode_json_text);
23    
24 root 1.1 =head1 DESCRIPTION
25    
26 root 1.2 This module converts Perl data structures to JSON and vice versa. Its
27     primary goal is to be I<correct> and its secondary goal is to be
28     I<fast>. To reach the latter goal it was written in C.
29    
30     As this is the n-th-something JSON module on CPAN, what was the reason
31     to write yet another JSON module? While it seems there are many JSON
32     modules, none of them correctly handle all corner cases, and in most cases
33     their maintainers are unresponsive, gone missing, or not listening to bug
34     reports for other reasons.
35    
36     See COMPARISON, below, for a comparison to some other JSON modules.
37    
38 root 1.10 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
39     vice versa.
40    
41 root 1.2 =head2 FEATURES
42    
43 root 1.1 =over 4
44    
45 root 1.68 =item * correct Unicode handling
46 root 1.2
47 root 1.10 This module knows how to handle Unicode, and even documents how and when
48     it does so.
49 root 1.2
50     =item * round-trip integrity
51    
52     When you serialise a perl data structure using only datatypes supported
53     by JSON, the deserialised data structure is identical on the Perl level.
54 root 1.21 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
55     like a number).
56 root 1.2
57     =item * strict checking of JSON correctness
58    
59 root 1.16 There is no guessing, no generating of illegal JSON texts by default,
60 root 1.10 and only JSON is accepted as input by default (the latter is a security
61     feature).
62 root 1.2
63     =item * fast
64    
65 root 1.10 Compared to other JSON modules, this module compares favourably in terms
66     of speed, too.
67 root 1.2
68     =item * simple to use
69    
70     This module has both a simple functional interface as well as an OO
71     interface.
72    
73     =item * reasonably versatile output formats
74    
75 root 1.68 You can choose between the most compact guaranteed single-line format
76 root 1.21 possible (nice for simple line-based protocols), a pure-ascii format
77     (for when your transport is not 8-bit clean, still supports the whole
78 root 1.68 Unicode range), or a pretty-printed format (for when you want to read that
79 root 1.21 stuff). Or you can combine those features in whatever way you like.
80 root 1.2
81     =back
82    
83 root 1.1 =cut
84    
85     package JSON::XS;
86    
87 root 1.20 use strict;
88    
89 root 1.73 our $VERSION = '1.6';
90 root 1.43 our @ISA = qw(Exporter);
91 root 1.1
92 root 1.49 our @EXPORT = qw(to_json from_json);
93 root 1.1
94 root 1.43 use Exporter;
95     use XSLoader;
96 root 1.1
97 root 1.2 =head1 FUNCTIONAL INTERFACE
98    
99 root 1.68 The following convenience methods are provided by this module. They are
100 root 1.2 exported by default:
101    
102     =over 4
103    
104 root 1.16 =item $json_text = to_json $perl_scalar
105 root 1.2
106 root 1.63 Converts the given Perl data structure to a UTF-8 encoded, binary string
107     (that is, the string contains octets only). Croaks on error.
108 root 1.2
109 root 1.16 This function call is functionally identical to:
110 root 1.2
111 root 1.16 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
112    
113     except being faster.
114    
115     =item $perl_scalar = from_json $json_text
116 root 1.2
117 root 1.63 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries
118     to parse that as a UTF-8 encoded JSON text, returning the resulting
119     reference. Croaks on error.
120 root 1.2
121 root 1.16 This function call is functionally identical to:
122    
123     $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
124    
125     except being faster.
126 root 1.2
127 root 1.43 =item $is_boolean = JSON::XS::is_bool $scalar
128    
129     Returns true if the passed scalar represents either JSON::XS::true or
130     JSON::XS::false, two constants that act like C<1> and C<0>, respectively
131     and are used to represent JSON C<true> and C<false> values in Perl.
132    
133     See MAPPING, below, for more information on how JSON values are mapped to
134     Perl.
135    
136 root 1.2 =back
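
As a rough illustration of the equivalence described for C<to_json> and C<from_json>, the following sketch round-trips a small structure through the OO spelling (C<< JSON::XS->new->utf8 >>). The variable names and the sample data are invented for this example.

```perl
use strict;
use warnings;
use JSON::XS;

# Sample data, invented for this sketch.
my $data = { name => "example", list => [1, 2, 3] };

# The OO spelling that to_json/from_json are documented to match.
my $coder = JSON::XS->new->utf8;

my $json_text = $coder->encode ($data);       # UTF-8 encoded octets
my $copy      = $coder->decode ($json_text);  # back to a Perl structure

print "name:  $copy->{name}\n";
print "items: ", scalar @{ $copy->{list} }, "\n";
```

Both directions croak on error, so production code would normally wrap the calls in C<eval>.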
137    
138 root 1.23
139 root 1.63 =head1 A FEW NOTES ON UNICODE AND PERL
140    
141     Since this often leads to confusion, here are a few very clear words on
142     how Unicode works in Perl, modulo bugs.
143    
144     =over 4
145    
146     =item 1. Perl strings can store characters with ordinal values > 255.
147    
148 root 1.68 This enables you to store Unicode characters as single characters in a
149 root 1.63 Perl string - very natural.
150    
151     =item 2. Perl does I<not> associate an encoding with your strings.
152    
153     Unless you force it to, e.g. when matching it against a regex, or printing
154     the scalar to a file, in which case Perl either interprets your string as
155     locale-encoded text, octets/binary, or as Unicode, depending on various
156     settings. In no case is an encoding stored together with your data, it is
157     I<use> that decides encoding, not any magical metadata.
158    
159     =item 3. The internal utf-8 flag has no meaning with regards to the
160     encoding of your string.
161    
162     Just ignore that flag unless you debug a Perl bug, a module written in
163     XS or want to dive into the internals of perl. Otherwise it will only
164     confuse you, as, despite the name, it says nothing about how your string
165 root 1.68 is encoded. You can have Unicode strings with that flag set, with that
166 root 1.63 flag clear, and you can have binary data with that flag set and that flag
167     clear. Other possibilities exist, too.
168    
169     If you didn't know about that flag, all the better: pretend it doesn't
170     exist.
171    
172     =item 4. A "Unicode String" is simply a string where each character can be
173     validly interpreted as a Unicode codepoint.
174    
175     If you have UTF-8 encoded data, it is no longer a Unicode string, but a
176     Unicode string encoded in UTF-8, giving you a binary string.
177    
178     =item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.
179    
180 root 1.68 It's a fact. Learn to live with it.
181 root 1.63
182     =back
183    
184     I hope this helps :)
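
The difference between a Unicode string and its UTF-8 encoded form (points 4 and 5 above) can be sketched with the core Encode module. The sample character is an arbitrary choice for this example.

```perl
use strict;
use warnings;
use Encode ();

# One character with an ordinal value above 255 (U+263A).
my $unicode = "\x{263a}";

# Encoding produces a binary string: three octets for this codepoint.
my $octets = Encode::encode ("UTF-8", $unicode);

printf "characters: %d\n", length $unicode;   # 1
printf "octets:     %d\n", length $octets;    # 3

# Decoding the octets gives back the original Unicode string.
my $again = Encode::decode ("UTF-8", $octets);
print "round trip ok\n" if $again eq $unicode;
```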
185    
186    
187 root 1.2 =head1 OBJECT-ORIENTED INTERFACE
188    
189     The object oriented interface lets you configure your own encoding or
190     decoding style, within the limits of supported formats.
191    
192     =over 4
193    
194     =item $json = new JSON::XS
195    
196     Creates a new JSON::XS object that can be used to de/encode JSON
197     strings. All boolean flags described below are by default I<disabled>.
198 root 1.1
199 root 1.2 The mutators for flags all return the JSON object again and thus calls can
200     be chained:
201    
202 root 1.16 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
203 root 1.3 => {"a": [1, 2]}
204 root 1.2
205 root 1.7 =item $json = $json->ascii ([$enable])
206 root 1.2
207 root 1.72 =item $enabled = $json->get_ascii
208    
209 root 1.16 If C<$enable> is true (or missing), then the C<encode> method will not
210     generate characters outside the code range C<0..127> (which is ASCII). Any
211 root 1.68 Unicode characters outside that range will be escaped using either a
212 root 1.16 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
213 root 1.32 as per RFC4627. The resulting encoded JSON text can be treated as a native
214 root 1.68 Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
215 root 1.32 or any other superset of ASCII.
216 root 1.2
217     If C<$enable> is false, then the C<encode> method will not escape Unicode
218 root 1.33 characters unless required by the JSON syntax or other flags. This results
219     in a faster and more compact format.
220    
221     The main use for this flag is to produce JSON texts that can be
222     transmitted over a 7-bit channel, as the encoded JSON texts will not
223     contain any 8 bit characters.
224 root 1.2
225 root 1.16 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
226     => ["\ud801\udc01"]
227 root 1.3
228 root 1.33 =item $json = $json->latin1 ([$enable])
229    
230 root 1.72 =item $enabled = $json->get_latin1
231    
232 root 1.33 If C<$enable> is true (or missing), then the C<encode> method will encode
233     the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
234     outside the code range C<0..255>. The resulting string can be treated as a
235 root 1.68 latin1-encoded JSON text or a native Unicode string. The C<decode> method
236 root 1.33 will not be affected in any way by this flag, as C<decode> by default
237 root 1.68 expects Unicode, which is a strict superset of latin1.
238 root 1.33
239     If C<$enable> is false, then the C<encode> method will not escape Unicode
240     characters unless required by the JSON syntax or other flags.
241    
242     The main use for this flag is efficiently encoding binary data as JSON
243     text, as most octets will not be escaped, resulting in a smaller encoded
244     size. The disadvantage is that the resulting JSON text is encoded
245     in latin1 (and must correctly be treated as such when storing and
246 root 1.68 transferring), a rare encoding for JSON. It is therefore most useful when
247 root 1.33 you want to store data structures known to contain binary data efficiently
248     in files or databases, not when talking to other JSON encoders/decoders.
249    
250     JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
251     => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
252    
253 root 1.7 =item $json = $json->utf8 ([$enable])
254 root 1.2
255 root 1.72 =item $enabled = $json->get_utf8
256    
257 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will encode
258 root 1.16 the JSON result into UTF-8, as required by many protocols, while the
259 root 1.7 C<decode> method expects to be handed a UTF-8-encoded string. Please
260     note that UTF-8-encoded strings do not contain any characters outside the
261 root 1.16 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
262     versions, enabling this option might enable autodetection of the UTF-16
263     and UTF-32 encoding families, as described in RFC4627.
264 root 1.2
265     If C<$enable> is false, then the C<encode> method will return the JSON
266 root 1.68 string as a (non-encoded) Unicode string, while C<decode> thus expects a
267     Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
268 root 1.2 to be done yourself, e.g. using the Encode module.
269    
270 root 1.16 Example, output UTF-16BE-encoded JSON:
271    
272     use Encode;
273     $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
274    
275     Example, decode UTF-32LE-encoded JSON:
276    
277     use Encode;
278     $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
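
A small sketch of what the C<utf8> flag changes: the same data is encoded once as a Unicode string and once as UTF-8 octets, and only the lengths are compared. The sample data is an assumption made for this example.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = ["\x{e9}"];   # one character, U+00E9

my $chars  = JSON::XS->new->encode ($data);        # Unicode string
my $octets = JSON::XS->new->utf8->encode ($data);  # UTF-8 octets

printf "without utf8: %d characters\n", length $chars;
printf "with utf8:    %d octets\n",     length $octets;
```

The octet form is one longer, because U+00E9 takes two octets in UTF-8.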
279 root 1.12
280 root 1.7 =item $json = $json->pretty ([$enable])
281 root 1.2
282 root 1.72 =item $enabled = $json->get_pretty
283    
284 root 1.2 This enables (or disables) all of the C<indent>, C<space_before> and
285 root 1.3 C<space_after> (and in the future possibly more) flags in one call to
286 root 1.2 generate the most readable (or most compact) form possible.
287    
288 root 1.12 Example, pretty-print some simple structure:
289    
290 root 1.3 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
291     =>
292     {
293        "a" : [
294           1,
295           2
296        ]
297     }
298    
299 root 1.7 =item $json = $json->indent ([$enable])
300 root 1.2
301 root 1.72 =item $enabled = $json->get_indent
302    
303 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
304 root 1.2 format as output, putting every array member or object/hash key-value pair
305 root 1.68 into its own line, indenting them properly.
306 root 1.2
307     If C<$enable> is false, no newlines or indenting will be produced, and the
308 root 1.68 resulting JSON text is guaranteed not to contain any C<newlines>.
309 root 1.2
310 root 1.16 This setting has no effect when decoding JSON texts.
311 root 1.2
312 root 1.7 =item $json = $json->space_before ([$enable])
313 root 1.2
314 root 1.72 =item $enabled = $json->get_space_before
315    
316 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
317 root 1.2 optional space before the C<:> separating keys from values in JSON objects.
318    
319     If C<$enable> is false, then the C<encode> method will not add any extra
320     space at those places.
321    
322 root 1.16 This setting has no effect when decoding JSON texts. You will also
323     most likely combine this setting with C<space_after>.
324 root 1.2
325 root 1.12 Example, space_before enabled, space_after and indent disabled:
326    
327     {"key" :"value"}
328    
329 root 1.7 =item $json = $json->space_after ([$enable])
330 root 1.2
331 root 1.72 =item $enabled = $json->get_space_after
332    
333 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
334 root 1.2 optional space after the C<:> separating keys from values in JSON objects
335     and extra whitespace after the C<,> separating key-value pairs and array
336     members.
337    
338     If C<$enable> is false, then the C<encode> method will not add any extra
339     space at those places.
340    
341 root 1.16 This setting has no effect when decoding JSON texts.
342 root 1.2
343 root 1.12 Example, space_before and indent disabled, space_after enabled:
344    
345     {"key": "value"}
346    
347 root 1.59 =item $json = $json->relaxed ([$enable])
348    
349 root 1.72 =item $enabled = $json->get_relaxed
350    
351 root 1.59 If C<$enable> is true (or missing), then C<decode> will accept some
352     extensions to normal JSON syntax (see below). C<encode> will not be
353     affected in any way. I<Be aware that this option makes you accept invalid
354     JSON texts as if they were valid!>. I suggest using this option only to
355     parse application-specific files written by humans (configuration files,
356     resource files etc.)
357    
358     If C<$enable> is false (the default), then C<decode> will only accept
359     valid JSON texts.
360    
361     Currently accepted extensions are:
362    
363     =over 4
364    
365     =item * list items can have an end-comma
366    
367     JSON I<separates> array elements and key-value pairs with commas. This
368     can be annoying if you write JSON texts manually and want to be able to
369     quickly append elements, so this extension accepts a comma at the end of
370     such items not just between them:
371    
372     [
373     1,
374     2, <- this comma not normally allowed
375     ]
376     {
377     "k1": "v1",
378     "k2": "v2", <- this comma not normally allowed
379     }
380    
381 root 1.60 =item * shell-style '#'-comments
382    
383     Whenever JSON allows whitespace, shell-style comments are additionally
384     allowed. They are terminated by the first carriage-return or line-feed
385     character, after which more white-space and comments are allowed.
386    
387     [
388     1, # this comment not allowed in JSON
389     # neither this one...
390     ]
391    
392 root 1.59 =back
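
The two extensions listed above could be exercised as follows. This is only a sketch: the configuration text and its keys are made up, and the strict decode is wrapped in C<eval> to show that the same text is rejected without C<relaxed>.

```perl
use strict;
use warnings;
use JSON::XS;

# A hand-written file of the kind relaxed is meant for (invented content).
my $config_text = <<'EOT';
{
   "name": "demo",   # a shell-style comment
   "ports": [
      80,
      443,           # trailing comma after the last element
   ],
}
EOT

my $config = JSON::XS->new->relaxed->decode ($config_text);

print "name:  $config->{name}\n";
print "ports: @{ $config->{ports} }\n";

# The same text is rejected when relaxed is not enabled.
my $ok = eval { JSON::XS->new->decode ($config_text); 1 };
print "strict decode failed, as expected\n" unless $ok;
```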
393    
394 root 1.7 =item $json = $json->canonical ([$enable])
395 root 1.2
396 root 1.72 =item $enabled = $json->get_canonical
397    
398 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
399 root 1.2 by sorting their keys. This adds a comparatively high overhead.
400    
401     If C<$enable> is false, then the C<encode> method will output key-value
402     pairs in the order Perl stores them (which will likely change between runs
403     of the same script).
404    
405     This option is useful if you want the same data structure to be encoded as
406 root 1.16 the same JSON text (given the same overall settings). If it is disabled,
407 root 1.68 the same hash might be encoded differently even if it contains the same data,
408 root 1.2 as key-value pairs have no inherent ordering in Perl.
409    
410 root 1.16 This setting has no effect when decoding JSON texts.
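
A minimal sketch of the effect, using made-up data: with C<canonical> enabled, two hashes holding the same data encode to the same text regardless of the order in which they were built.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { b => 2, a => 1, c => 3 };

my $sorted = JSON::XS->new->canonical->encode ($data);
print "$sorted\n";   # {"a":1,"b":2,"c":3}

# The same data, written in a different order, gives identical text.
my $same = JSON::XS->new->canonical->encode ({ c => 3, a => 1, b => 2 });
print "identical\n" if $sorted eq $same;
```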
411 root 1.2
412 root 1.7 =item $json = $json->allow_nonref ([$enable])
413 root 1.3
414 root 1.72 =item $enabled = $json->get_allow_nonref
415    
416 root 1.7 If C<$enable> is true (or missing), then the C<encode> method can convert a
417 root 1.3 non-reference into its corresponding string, number or null JSON value,
418     which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
419     values instead of croaking.
420    
421     If C<$enable> is false, then the C<encode> method will croak if it isn't
422 root 1.16 passed an arrayref or hashref, as JSON texts must either be an object
423 root 1.3 or array. Likewise, C<decode> will croak if given something that is not a
424     JSON object or array.
425    
426 root 1.12 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
427     resulting in an invalid JSON text:
428    
429     JSON::XS->new->allow_nonref->encode ("Hello, World!")
430     => "Hello, World!"
431    
432 root 1.44 =item $json = $json->allow_blessed ([$enable])
433    
434 root 1.72 =item $enabled = $json->get_allow_blessed
435    
436 root 1.44 If C<$enable> is true (or missing), then the C<encode> method will not
437     barf when it encounters a blessed reference. Instead, the value of the
438 root 1.68 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
439 root 1.44 disabled or no C<TO_JSON> method found) or a representation of the
440     object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
441     encoded. Has no effect on C<decode>.
442    
443     If C<$enable> is false (the default), then C<encode> will throw an
444     exception when it encounters a blessed object.
445    
446     =item $json = $json->convert_blessed ([$enable])
447    
448 root 1.72 =item $enabled = $json->get_convert_blessed
449    
450 root 1.44 If C<$enable> is true (or missing), then C<encode>, upon encountering a
451     blessed object, will check for the availability of the C<TO_JSON> method
452     on the object's class. If found, it will be called in scalar context
453     and the resulting scalar will be encoded instead of the object. If no
454     C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
455     to do.
456    
457     The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
458     returns other blessed objects, those will be handled in the same
459     way. C<TO_JSON> must take care of not causing an endless recursion cycle
460     (== crash) in this case. The name of C<TO_JSON> was chosen because other
461 root 1.46 methods called by the Perl core (== not by the user of the object) are
462 root 1.44 usually in upper case letters and to avoid collisions with the C<to_json>
463     function.
464    
465 root 1.45 This setting does not yet influence C<decode> in any way, but in the
466     future, global hooks might get installed that influence C<decode> and are
467     enabled by this setting.
468    
469 root 1.44 If C<$enable> is false, then the C<allow_blessed> setting will decide what
470     to do when a blessed object is found.
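
How C<allow_blessed> and C<convert_blessed> interact might be sketched like this. The class C<My::Point> is hypothetical and exists only for this example.

```perl
use strict;
use warnings;
use JSON::XS;

# A hypothetical class for this sketch.
package My::Point;

sub new {
   my ($class, $x, $y) = @_;
   return bless { x => $x, y => $y }, $class;
}

# Called by encode when convert_blessed is enabled.
sub TO_JSON {
   my ($self) = @_;
   return { x => $self->{x}, y => $self->{y} };
}

package main;

my $point = My::Point->new (1, 2);

# convert_blessed: the result of TO_JSON is encoded in place of the object.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);
print "$converted\n";   # [{"x":1,"y":2}]

# allow_blessed alone: the object is encoded as null.
my $nulled = JSON::XS->new->allow_blessed->encode ([$point]);
print "$nulled\n";      # [null]

# Neither flag: encode throws an exception.
my $ok = eval { JSON::XS->new->encode ([$point]); 1 };
print "encode croaked without either flag\n" unless $ok;
```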
471    
472 root 1.52 =item $json = $json->filter_json_object ([$coderef->($hashref)])
473 root 1.51
474     When C<$coderef> is specified, it will be called from C<decode> each
475     time it decodes a JSON object. The only argument is a reference to the
476     newly-created hash. If the code reference returns a single scalar (which
477     need not be a reference), this value (i.e. a copy of that scalar to avoid
478     aliasing) is inserted into the deserialised data structure. If it returns
479     an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
480     original deserialised hash will be inserted. This setting can slow down
481     decoding considerably.
482    
483 root 1.52 When C<$coderef> is omitted or undefined, any existing callback will
484     be removed and C<decode> will not change the deserialised hash in any
485     way.
486 root 1.51
487     Example, convert all JSON objects into the integer 5:
488    
489     my $js = JSON::XS->new->filter_json_object (sub { 5 });
490     # returns [5]
491     $js->decode ('[{}]');
492 root 1.52 # throws an exception because allow_nonref is not enabled
493     # so a lone 5 is not allowed.
494 root 1.51 $js->decode ('{"a":1, "b":2}');
495    
496 root 1.52 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
497 root 1.51
498 root 1.52 Works remotely similar to C<filter_json_object>, but is only called for
499     JSON objects having a single key named C<$key>.
500 root 1.51
501     This C<$coderef> is called before the one specified via
502 root 1.52 C<filter_json_object>, if any. It gets passed the single value in the JSON
503     object. If it returns a single value, it will be inserted into the data
504     structure. If it returns nothing (not even C<undef> but the empty list),
505     the callback from C<filter_json_object> will be called next, as if no
506     single-key callback were specified.
507    
508     If C<$coderef> is omitted or undefined, the corresponding callback will be
509     disabled. There can only ever be one callback for a given key.
510 root 1.51
511     As this callback gets called less often than the C<filter_json_object>
512     one, decoding speed will not usually suffer as much. Therefore, single-key
513     objects make excellent targets to serialise Perl objects into, especially
514     as single-key JSON objects are as close to the type-tagged value concept
515 root 1.68 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
516 root 1.51 support this in any way, so you need to make sure your data never looks
517     like a serialised Perl hash.
518    
519     Typical names for the single object key are C<__class_whatever__>, or
520     C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
521     things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
522     with real hashes.
523    
524     Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
525     into the corresponding C<< $WIDGET{<id>} >> object:
526    
527     # return whatever is in $WIDGET{5}:
528     JSON::XS
529        ->new
530 root 1.52    ->filter_json_single_key_object (__widget__ => sub {
531              $WIDGET{ $_[0] }
532 root 1.51       })
533        ->decode ('{"__widget__": 5}')
534    
535     # this can be used with a TO_JSON method in some "widget" class
536     # for serialisation to json:
537     sub WidgetBase::TO_JSON {
538        my ($self) = @_;
539    
540        unless ($self->{id}) {
541           $self->{id} = ..get..some..id..;
542           $WIDGET{$self->{id}} = $self;
543        }
544    
545        { __widget__ => $self->{id} }
546     }
547    
548 root 1.7 =item $json = $json->shrink ([$enable])
549    
550 root 1.72 =item $enabled = $json->get_shrink
551    
552 root 1.7 Perl usually over-allocates memory a bit when allocating space for
553 root 1.24 strings. This flag optionally resizes strings generated by either
554 root 1.7 C<encode> or C<decode> to their minimum size possible. This can save
555 root 1.16 memory when your JSON texts are either very very long or you have many
556 root 1.8 short strings. It will also try to downgrade any strings to octet-form
557     if possible: perl stores strings internally either in an encoding called
558     UTF-X or in octet-form. The latter cannot store everything but uses less
559 root 1.24 space in general (and some buggy Perl or C code might even rely on that
560     internal representation being used).
561 root 1.7
562 root 1.24 The actual definition of what shrink does might change in future versions,
563     but it will always try to save space at the expense of time.
564    
565     If C<$enable> is true (or missing), the string returned by C<encode> will
566     be shrunk-to-fit, while all strings generated by C<decode> will also be
567     shrunk-to-fit.
568 root 1.7
569     If C<$enable> is false, then the normal perl allocation algorithms are used.
570     If you work with your data, then this is likely to be faster.
571    
572     In the future, this setting might control other things, such as converting
573     strings that look like integers or floats into integers or floats
574     internally (there is no difference on the Perl level), saving space.
575    
576 root 1.23 =item $json = $json->max_depth ([$maximum_nesting_depth])
577    
578 root 1.72 =item $max_depth = $json->get_max_depth
579    
580 root 1.28 Sets the maximum nesting level (default C<512>) accepted while encoding
581 root 1.23 or decoding. If the JSON text or Perl data structure has an equal or
582     higher nesting level than this limit, then the encoder and decoder will
583     stop and croak at that point.
584    
585     Nesting level is defined by number of hash- or arrayrefs that the encoder
586     needs to traverse to reach a given point or the number of C<{> or C<[>
587     characters without their matching closing parenthesis crossed to reach a
588     given character in a string.
589    
590     Setting the maximum depth to one disallows any nesting, so that ensures
591     that the object is only a single hash/object or array.
592    
593 root 1.47 The argument to C<max_depth> will be rounded up to the next highest power
594     of two. If no argument is given, the highest possible setting will be
595     used, which is rarely useful.
596    
597     See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
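
A sketch of the limit in action, with an arbitrarily chosen limit of 4 and test inputs that lie well below and well above it, so the result does not depend on the exact boundary:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_depth (4);

my $shallow = "[1]";                              # one nesting level
my $deep    = ("[" x 10) . "1" . ("]" x 10);      # ten nesting levels

# decode croaks when the limit is hit, so wrap each call in eval.
my $ok_shallow = eval { $json->decode ($shallow); 1 };
my $ok_deep    = eval { $json->decode ($deep);    1 };

print "shallow: ", ($ok_shallow ? "accepted" : "rejected"), "\n";
print "deep:    ", ($ok_deep    ? "accepted" : "rejected"), "\n";
```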
598    
599     =item $json = $json->max_size ([$maximum_string_size])
600    
601 root 1.72 =item $max_size = $json->get_max_size
602    
603 root 1.47 Set the maximum length a JSON text may have (in bytes) where decoding is
604     being attempted. The default is C<0>, meaning no limit. When C<decode>
605     is called on a string longer than this number of characters it will not
606     attempt to decode the string but throw an exception. This setting has no
607     effect on C<encode> (yet).
608    
609     The argument to C<max_size> will be rounded up to the next B<highest>
610     power of two (so may be more than requested). If no argument is given, the
611     limit check will be deactivated (same as when C<0> is specified).
612 root 1.23
613     See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
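
The size check could be exercised along these lines. The limit of 64 and both inputs are assumptions chosen so that one text is far smaller and the other far larger than the limit.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_size (64);

my $small = "[1,2,3]";
my $large = "[" . join (",", (1) x 200) . "]";   # several hundred bytes

# decode throws an exception for over-long input, so wrap it in eval.
my $ok_small = eval { $json->decode ($small); 1 };
my $ok_large = eval { $json->decode ($large); 1 };

print "small: ", ($ok_small ? "accepted" : "rejected"), "\n";
print "large: ", ($ok_large ? "accepted" : "rejected"), "\n";
```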
614    
615 root 1.16 =item $json_text = $json->encode ($perl_scalar)
616 root 1.2
617     Converts the given Perl data structure (a simple scalar or a reference
618     to a hash or array) to its JSON representation. Simple scalars will be
619     converted into JSON string or number sequences, while references to arrays
620     become JSON arrays and references to hashes become JSON objects. Undefined
621     Perl values (e.g. C<undef>) become JSON C<null> values. JSON C<true> and C<false>
622     are only generated for the special values described under MAPPING, below.
623 root 1.1
624 root 1.16 =item $perl_scalar = $json->decode ($json_text)
625 root 1.1
626 root 1.16 The opposite of C<encode>: expects a JSON text and tries to parse it,
627 root 1.2 returning the resulting simple scalar or reference. Croaks on error.
628 root 1.1
629 root 1.2 JSON numbers and strings become simple Perl scalars. JSON arrays become
630     Perl arrayrefs and JSON objects become Perl hashrefs. C<true> and C<false> become
631     C<JSON::XS::true> and C<JSON::XS::false> (see MAPPING, below) and C<null> becomes C<undef>.
632 root 1.1
633 root 1.34 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
634    
635     This works like the C<decode> method, but instead of raising an exception
636     when there is trailing garbage after the first JSON object, it will
637     silently stop parsing there and return the number of characters consumed
638     so far.
639    
640     This is useful if your JSON texts are not delimited by an outer protocol
641     (which is not the brightest thing to do in the first place) and you need
642     to know where the JSON text ends.
643    
644     JSON::XS->new->decode_prefix ("[1] the tail")
645     => ([1], 3)
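
One possible use of C<decode_prefix> is to pull several JSON texts out of a single buffer, one after another. The following is a sketch under that assumption; the buffer contents are made up and contain no whitespace between the texts.

```perl
use strict;
use warnings;
use JSON::XS;

my $json   = JSON::XS->new;
my $buffer = '[1][2,3]{"a":4}';   # three JSON texts back to back
my @values;

while (length $buffer) {
   # Returns the decoded value and the number of characters consumed.
   my ($value, $consumed) = $json->decode_prefix ($buffer);
   push @values, $value;
   substr $buffer, 0, $consumed, "";   # drop the part just parsed
}

print scalar (@values), " texts decoded\n";
```

A malformed text makes C<decode_prefix> croak, which also ends the loop.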
646    
647 root 1.1 =back
648    
649 root 1.23
650 root 1.10 =head1 MAPPING
651    
652     This section describes how JSON::XS maps Perl values to JSON values and
653     vice versa. These mappings are designed to "do the right thing" in most
654     circumstances automatically, preserving round-tripping characteristics
655     (what you put in comes out as something equivalent).
656    
657     For the more enlightened: note that in the following descriptions,
658 root 1.68 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
659 root 1.10 refers to the abstract Perl language itself.
660    
661 root 1.39
662 root 1.10 =head2 JSON -> PERL
663    
664     =over 4
665    
666     =item object
667    
668     A JSON object becomes a reference to a hash in Perl. No ordering of object
669 root 1.68 keys is preserved (JSON does not preserve object key ordering itself).
670 root 1.10
671     =item array
672    
673     A JSON array becomes a reference to an array in Perl.
674    
675     =item string
676    
677     A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
678     are represented by the same codepoints in the Perl string, so no manual
679     decoding is necessary.
680    
681     =item number
682    
683 root 1.56 A JSON number becomes either an integer, numeric (floating point) or
684     string scalar in perl, depending on its range and any fractional parts. On
685     the Perl level, there is no difference between those as Perl handles all
686     the conversion details, but an integer may take slightly less memory and
687     might represent more values exactly than (floating point) numbers.
688    
689     If the number consists of digits only, JSON::XS will try to represent
690     it as an integer value. If that fails, it will try to represent it as
691     a numeric (floating point) value if that is possible without loss of
692     precision. Otherwise it will preserve the number as a string value.
693    
694     Numbers containing a fractional or exponential part will always be
695     represented as numeric (floating point) values, possibly at a loss of
696     precision.
697    
698     This might create round-tripping problems as numbers might become strings,
699     but as Perl is typeless there is no other way to do it.
700 root 1.10
701     =item true, false
702    
703 root 1.43 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
704     respectively. They are overloaded to act almost exactly like the numbers
705 root 1.68 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
706 root 1.43 the C<JSON::XS::is_bool> function.
707 root 1.10
708     =item null
709    
710     A JSON null atom becomes C<undef> in Perl.
711    
712     =back
713    
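As a rough illustration of the mapping above, the following minimal sketch
decodes a small JSON text through the OO interface and inspects what comes
back. The sample text and the variable names are made up for this example
and are not part of the module.

```perl
use strict;
use warnings;
use JSON::XS;

# OO interface; utf8 makes decode expect UTF-8 encoded octets
my $coder = JSON::XS->new->utf8;

# sample text invented for this example
my $data = $coder->decode ('{"list":[1,2.5,"three"],"ok":true,"none":null}');

print ref ($data), "\n";           # HASH  (JSON object -> hash reference)
print ref ($data->{list}), "\n";   # ARRAY (JSON array -> array reference)
print $data->{list}[0] + 1, "\n";  # 2
print $data->{list}[2], "\n";      # three

# true and false decode to special values recognised by is_bool
print "ok is a JSON boolean\n" if JSON::XS::is_bool ($data->{ok});
print "ok is true\n"           if $data->{ok};

# null becomes undef, but the key still exists in the hash
print "none is undef\n" if exists $data->{none} && !defined $data->{none};
```

Note that C<exists> is needed to tell a JSON null apart from a missing key,
as both yield C<undef> when the hash entry is read.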
714 root 1.39
715 root 1.10 =head2 PERL -> JSON
716    
717     The mapping from Perl to JSON is slightly more difficult, as Perl is a
718     truly typeless language, so we can only guess which JSON type is meant by
719     a Perl value.
720    
721     =over 4
722    
723     =item hash references
724    
725     Perl hash references become JSON objects. As there is no inherent ordering
726 root 1.25 in hash keys (or JSON objects), they will usually be encoded in a
727     pseudo-random order that can change between runs of the same program but
728     stays generally the same within a single run of a program. JSON::XS can
729     optionally sort the hash keys (determined by the I<canonical> flag), so
730     the same data structure will serialise to the same JSON text (given the same
731     settings and version of JSON::XS), but this incurs a runtime overhead
732     and is only rarely useful, e.g. when you want to compare some JSON text
733     against another for equality.
734 root 1.10
735     =item array references
736    
737     Perl array references become JSON arrays.
738    
739 root 1.25 =item other references
740    
741     Other unblessed references are generally not allowed and will cause an
742     exception to be thrown, except for references to the integers C<0> and
743     C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
744     also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
745    
746     to_json [\0,JSON::XS::true] # yields [false,true]
747    
748 root 1.43 =item JSON::XS::true, JSON::XS::false
749    
750     These special values become JSON true and JSON false values,
751 root 1.61 respectively. You can also use C<\1> and C<\0> directly if you want.
752 root 1.43
753 root 1.10 =item blessed objects
754    
755     Blessed objects are not allowed. JSON::XS currently tries to encode their
756     underlying representation (hash- or arrayref), but this behaviour might
757     change in future versions.
758    
759     =item simple scalars
760    
761     Simple Perl scalars (any scalar that is not a reference) are the most
762     difficult objects to encode: JSON::XS will encode undefined scalars as
763     JSON null values, scalars that were last used in a string context
764     before encoding as JSON strings, and anything else as JSON number values:
765    
766     # dump as number
767     to_json [2] # yields [2]
768     to_json [-3.0e17] # yields [-3e+17]
769     my $value = 5; to_json [$value] # yields [5]
770    
771     # used as string, so dump as string
772     print $value;
773     to_json [$value] # yields ["5"]
774    
775     # undef becomes null
776     to_json [undef] # yields [null]
777    
778 root 1.68 You can force the type to be a JSON string by stringifying it:
779 root 1.10
780     my $x = 3.1; # some variable containing a number
781     "$x"; # stringified
782     $x .= ""; # another, more awkward way to stringify
783     print $x; # perl does it for you, too, quite often
784    
785 root 1.68 You can force the type to be a JSON number by numifying it:
786 root 1.10
787     my $x = "3"; # some variable containing a string
788     $x += 0; # numify it, ensuring it will be dumped as a number
789 root 1.68 $x *= 1; # same thing, the choice is yours.
790 root 1.10
791 root 1.68 You cannot currently force the type in other, less obscure, ways. Tell me
792     if you need this capability.
793 root 1.10
794     =back
795    
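The rules in this section can be sketched with the OO interface as follows.
This is only an illustration under stated assumptions: the data and the
variable names are invented, and C<canonical> is enabled solely so that the
key order (and therefore the output shown in the comments) is predictable.

```perl
use strict;
use warnings;
use JSON::XS;

# canonical sorts hash keys, so the output below is reproducible
my $coder = JSON::XS->new->utf8->canonical;

# made-up record, for illustration only
my $record = {
   id      => 7,
   name    => "x",
   tags    => ["a", "b"],
   active  => \1,      # reference to 1 becomes true
   deleted => \0,      # reference to 0 becomes false
   note    => undef,   # undef becomes null
};
my $json = $coder->encode ($record);
print "$json\n";
# {"active":true,"deleted":false,"id":7,"name":"x","note":null,"tags":["a","b"]}

my $number = "42";   # starts out as a string
my $string = 42;     # starts out as a number
$number += 0;        # numify: will be dumped as a number
$string .= "";       # stringify: will be dumped as a string
my $pair = $coder->encode ([$number, $string]);
print "$pair\n";
# [42,"42"]
```

Without C<canonical> the first output would contain the same pairs, but
possibly in a different order on each run.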
796 root 1.23
797 root 1.3 =head1 COMPARISON
798    
799     As already mentioned, this module was created because none of the existing
800     JSON modules could be made to work correctly. First I will describe the
801     problems (or pleasures) I encountered with various existing JSON modules,
802 root 1.4 followed by some benchmark values. JSON::XS was designed not to suffer
803     from any of these problems or limitations.
804 root 1.3
805     =over 4
806    
807 root 1.5 =item JSON 1.07
808 root 1.3
809     Slow (but very portable, as it is written in pure Perl).
810    
811 root 1.68 Undocumented/buggy Unicode handling (how JSON handles Unicode values is
812     undocumented. One can get far by feeding it Unicode strings and doing
813     en-/decoding oneself, but Unicode escapes are not working properly).
814 root 1.3
815 root 1.69 No round-tripping (strings get clobbered if they look like numbers, e.g.
816 root 1.3 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
817     decode into the number 2).
818    
819 root 1.5 =item JSON::PC 0.01
820 root 1.3
821     Very fast.
822    
823     Undocumented/buggy Unicode handling.
824    
825 root 1.69 No round-tripping.
826 root 1.3
827 root 1.4 Has problems handling many Perl values (e.g. regex results and other magic
828     values will make it croak).
829 root 1.3
830     Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
831 root 1.16 which is not a valid JSON text).
832 root 1.3
833     Unmaintained (maintainer unresponsive for many months, bugs are not
834     getting fixed).
835    
836 root 1.5 =item JSON::Syck 0.21
837 root 1.3
838     Very buggy (often crashes).
839    
840 root 1.4 Very inflexible (no human-readable format supported, format pretty much
841     undocumented. I need at least a format for easy reading by humans and a
842     single-line compact format for use in a protocol, and preferably a way to
843 root 1.16 generate ASCII-only JSON texts).
844 root 1.3
845 root 1.68 Completely broken (and confusingly documented) Unicode handling (Unicode
846 root 1.3 escapes are not working properly, you need to set ImplicitUnicode to
847     I<different> values on en- and decoding to get symmetric behaviour).
848    
849 root 1.69 No round-tripping (simple cases work, but this depends on whether the scalar
850 root 1.3 value was used in a numeric context or not).
851    
852     Dumping hashes may skip hash values depending on iterator state.
853    
854     Unmaintained (maintainer unresponsive for many months, bugs are not
855     getting fixed).
856    
857     Does not check input for validity (i.e. will accept non-JSON input and
858     return "something" instead of raising an exception. This is a security
859 root 1.68 issue: imagine two banks transferring money between each other using
860 root 1.3 JSON. One bank might parse a given non-JSON request and deduct money,
861     while the other might reject the transaction with a syntax error. While a
862     good protocol will at least recover, that is extra unnecessary work and
863     the transaction will still not succeed).
864    
865 root 1.5 =item JSON::DWIW 0.04
866 root 1.3
867     Very fast. Very natural. Very nice.
868    
869 root 1.68 Undocumented Unicode handling (but the best of the pack. Unicode escapes
870 root 1.3 still don't get parsed properly).
871    
872     Very inflexible.
873    
874 root 1.69 No round-tripping.
875 root 1.3
876 root 1.16 Does not generate valid JSON texts (key strings are often unquoted, empty keys
877 root 1.4 result in nothing being output).
878    
879 root 1.3 Does not check input for validity.
880    
881     =back
882    
883 root 1.39
884     =head2 JSON and YAML
885    
886     You often hear that JSON is a subset (or a close subset) of YAML. This is,
887     however, mass hysteria and very far from the truth. In general, there is
888     no way to configure JSON::XS to output a data structure as valid YAML.
889    
890 root 1.41 If you really must use JSON::XS to generate YAML, you should use this
891 root 1.39 algorithm (subject to change in future versions):
892    
893     my $to_yaml = JSON::XS->new->utf8->space_after (1);
894     my $yaml = $to_yaml->encode ($ref) . "\n";
895    
896     This will usually generate JSON texts that also parse as valid
897 root 1.41 YAML. Please note that YAML has hardcoded limits on (simple) object key
898     lengths that JSON doesn't have, so you should make sure that your hash
899 root 1.68 keys are noticeably shorter than the 1024 characters YAML allows.
900 root 1.39
901     There might be other incompatibilities that I am not aware of. In general
902     you should not try to generate YAML with a JSON generator or vice versa,
903 root 1.41 or try to parse JSON with a YAML parser or vice versa: chances are high
904     that you will run into severe interoperability problems.
905 root 1.39
906    
907 root 1.3 =head2 SPEED
908    
909 root 1.4 It seems that JSON::XS is surprisingly fast, as shown in the following
910     tables. They have been generated with the help of the C<eg/bench> program
911     in the JSON::XS distribution, to make it easy to compare on your own
912     system.
913    
914 root 1.37 First comes a comparison between various modules using a very short
915     single-line JSON string:
916 root 1.18
917 root 1.37 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
918 root 1.38 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
919 root 1.18
920 root 1.39 It shows the number of encodes/decodes per second (JSON::XS uses
921     the functional interface, while JSON::XS/2 uses the OO interface
922     with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
923     shrink). Higher is better:
924 root 1.4
925     module | encode | decode |
926     -----------|------------|------------|
927 root 1.72 JSON 1.x | 4990.842 | 4088.813 |
928 root 1.48 JSON::DWIW | 51653.990 | 71575.154 |
929     JSON::PC | 65948.176 | 74631.744 |
930     JSON::PP | 8931.652 | 3817.168 |
931     JSON::Syck | 24877.248 | 27776.848 |
932     JSON::XS | 388361.481 | 227951.304 |
933     JSON::XS/2 | 227951.304 | 218453.333 |
934     JSON::XS/3 | 338250.323 | 218453.333 |
935     Storable | 16500.016 | 135300.129 |
936 root 1.4 -----------+------------+------------+
937    
938 root 1.37 That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
939 root 1.68 about three times faster on decoding, and over forty times faster
940 root 1.37 than JSON, even with pretty-printing and key sorting. It also compares
941     favourably to Storable for small amounts of data.
942 root 1.4
943 root 1.13 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
944 root 1.4 search API, http://nanoref.com/yahooapis/mgPdGg):
945    
946     module | encode | decode |
947     -----------|------------|------------|
948 root 1.72 JSON 1.x | 55.260 | 34.971 |
949 root 1.48 JSON::DWIW | 825.228 | 1082.513 |
950     JSON::PC | 3571.444 | 2394.829 |
951     JSON::PP | 210.987 | 32.574 |
952     JSON::Syck | 552.551 | 787.544 |
953     JSON::XS | 5780.463 | 4854.519 |
954     JSON::XS/2 | 3869.998 | 4798.975 |
955     JSON::XS/3 | 5862.880 | 4798.975 |
956     Storable | 4445.002 | 5235.027 |
957 root 1.4 -----------+------------+------------+
958    
959 root 1.40 Again, JSON::XS leads by far (except for Storable which unsurprisingly
960     decodes faster).
961 root 1.4
962 root 1.68 On large strings containing lots of high Unicode characters, some modules
963 root 1.18 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
964 root 1.68 will be broken due to missing (or wrong) Unicode handling. Others refuse
965 root 1.18 to decode or encode properly, so it was impossible to prepare a fair
966     comparison table for that case.
967 root 1.13
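The tables above were produced by C<eg/bench>. The following is not that
program, only a minimal sketch of how a similar encode/decode measurement
might be taken with the core C<Benchmark> module. The test data is made up
(loosely modelled on the short string above), and the rates it prints will
differ from the tables and from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# made-up test data, for illustration only
my $perl = {
   method => "handleMessage",
   params => ["user1", "we were just talking"],
   id     => undef,
   array  => [1, 11, 234, -5, 1e5, 1e7],
};
my $json = $coder->encode ($perl);

# run each sub for at least one CPU second and print a rate comparison
cmpthese (-1, {
   encode => sub { $coder->encode ($perl) },
   decode => sub { $coder->decode ($json) },
});
```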
968 root 1.11
969 root 1.23 =head1 SECURITY CONSIDERATIONS
970    
971     When you are using JSON in a protocol, talking to untrusted potentially
972     hostile creatures requires relatively few measures.
973    
974     First of all, your JSON decoder should be secure, that is, should not have
975     any buffer overflows. Obviously, this module should ensure that and I am
976     trying hard to make that true, but you never know.
977    
978     Second, you need to avoid resource-starving attacks. That means you should
979     limit the size of JSON texts you accept, or make sure that when your
980 root 1.68 resources run out, that's just fine (e.g. by using a separate process that
981 root 1.23 can crash safely). The size of a JSON text in octets or characters is
982     usually a good indication of the size of the resources required to decode
983 root 1.47 it into a Perl structure. While JSON::XS can check the size of the JSON
984     text, it might be too late when you already have it in memory, so you
985     might want to check the size before you accept the string.
986 root 1.23
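One way to check the size before decoding could look like the sketch below.
The limit of 64KB and the helper name C<decode_untrusted> are assumptions
made for this example, not part of JSON::XS. In a real protocol you would
preferably apply the limit while reading from the network, before the whole
text is in memory.

```perl
use strict;
use warnings;
use JSON::XS;

# hypothetical limit, chosen for illustration only
my $MAX_JSON_OCTETS = 64 * 1024;

my $coder = JSON::XS->new->utf8;

# hypothetical helper: returns the decoded structure, or undef when the
# text is too large or is not valid JSON
sub decode_untrusted {
   my ($octets) = @_;

   # reject oversized input before spending any resources on it
   return undef if length ($octets) > $MAX_JSON_OCTETS;

   # decode croaks on invalid input, so eval yields undef in that case
   my $data = eval { $coder->decode ($octets) };
   return $data;
}

my $ok = decode_untrusted ('[1,2,3]');
print defined $ok ? "accepted\n" : "rejected\n";
```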
987     Third, JSON::XS recurses using the C stack when decoding objects and
988     arrays. The C stack is a limited resource: for instance, on my amd64
989 root 1.28 machine with 8MB of stack size I can decode around 180k nested arrays but
990     only 14k nested JSON objects (due to perl itself recursing deeply on croak
991     to free the temporary). If that is exceeded, the program crashes. To be
992     conservative, the default nesting limit is set to 512. If your process
993     has a smaller stack, you should adjust this setting accordingly with the
994     C<max_depth> method.
995 root 1.23
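A minimal sketch of lowering the nesting limit with C<max_depth> follows.
The value 32 and the two test strings are arbitrary choices for this
example. A text that nests deeper than the limit makes C<decode> croak,
which is caught with C<eval> here.

```perl
use strict;
use warnings;
use JSON::XS;

# 32 is an arbitrary limit picked for this example
my $coder = JSON::XS->new->utf8->max_depth (32);

my $shallow = "[[[1]]]";                        # 3 levels
my $deep    = ("[" x 100) . ("]" x 100);        # 100 levels

for my $text ($shallow, $deep) {
   my $data = eval { $coder->decode ($text) };
   print defined $data ? "accepted\n" : "rejected: $@";
}
```

The first text is accepted, the second is rejected with an exception
instead of recursing further on the C stack.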
996     And last but least, something else could bomb you that I forgot to think
997 root 1.30 of. In that case, you get to keep the pieces. I am always open for hints,
998 root 1.23 though...
999    
1000 root 1.42 If you are using JSON::XS to return packets for consumption
1001 root 1.68 by JavaScript scripts in a browser you should have a look at
1002     L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
1003 root 1.42 you are vulnerable to some common attack vectors (which really are browser
1004     design bugs, but it is still you who will have to deal with it, as major
1005     browser developers care only for features, not about doing security
1006     right).
1007    
1008 root 1.11
1009 root 1.64 =head1 THREADS
1010    
1011 root 1.68 This module is I<not> guaranteed to be thread safe and there are no
1012 root 1.64 plans to change this until Perl gets thread support (as opposed to the
1013     horribly slow so-called "threads" which are simply slow and bloated
1014     process simulations - use fork, it's I<much> faster, cheaper, better).
1015    
1016 root 1.68 (It might actually work, but you have been warned).
1017 root 1.64
1018    
1019 root 1.4 =head1 BUGS
1020    
1021     While the goal of this module is to be correct, that unfortunately does
1022     not mean it's bug-free, only that I think its design is bug-free. It is
1023 root 1.23 still relatively early in its development. If you keep reporting bugs they
1024     will be fixed swiftly, though.
1025 root 1.4
1026 root 1.64 Please refrain from using rt.cpan.org or any other bug reporting
1027     service. I put the contact address into my modules for a reason.
1028    
1029 root 1.2 =cut
1030    
1031 root 1.53 our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
1032     our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
1033 root 1.43
1034     sub true() { $true }
1035     sub false() { $false }
1036    
1037     sub is_bool($) {
1038     UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
1039 root 1.44 # or UNIVERSAL::isa $_[0], "JSON::Literal"
1040 root 1.43 }
1041    
1042     XSLoader::load "JSON::XS", $VERSION;
1043    
1044     package JSON::XS::Boolean;
1045    
1046     use overload
1047     "0+" => sub { ${$_[0]} },
1048     "++" => sub { $_[0] = ${$_[0]} + 1 },
1049     "--" => sub { $_[0] = ${$_[0]} - 1 },
1050     fallback => 1;
1051 root 1.25
1052 root 1.2 1;
1053    
1054 root 1.1 =head1 AUTHOR
1055    
1056     Marc Lehmann <schmorp@schmorp.de>
1057     http://home.schmorp.de/
1058    
1059     =cut
1060