/cvs/JSON-XS/XS.pm
Revision: 1.78
Committed: Wed Dec 5 10:59:28 2007 UTC (16 years, 5 months ago) by root
Branch: MAIN
CVS Tags: rel-2_01
Changes since 1.77: +25 -15 lines
Log Message:
*** empty log message ***

File Contents

# Content
1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 JSON::XS - correct and fast JSON serialiser/deserialiser
6 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
7
8 =head1 SYNOPSIS
9
10 use JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 # Note that JSON version 2.0 and above will automatically use JSON::XS
25 # if available, at virtually no speed overhead either, so you should
26 # be able to just:
27
28 use JSON;
29
30 # and do the same things, except that you have a pure-perl fallback now.
31
32 =head1 DESCRIPTION
33
34 This module converts Perl data structures to JSON and vice versa. Its
35 primary goal is to be I<correct> and its secondary goal is to be
36 I<fast>. To reach the latter goal it was written in C.
37
38 Beginning with version 2.0 of the JSON module, when both JSON and
39 JSON::XS are installed, then JSON will fall back on JSON::XS (this can be
40 overridden) with no overhead due to emulation (by inheriting constructor
41 and methods). If JSON::XS is not available, it will fall back to the
42 compatible JSON::PP module as backend, so using JSON instead of JSON::XS
43 gives you a portable JSON API that can be fast when you need and doesn't
44 require a C compiler when that is a problem.
45
46 As this is the n-th-something JSON module on CPAN, what was the reason
47 to write yet another JSON module? While it seems there are many JSON
48 modules, none of them correctly handle all corner cases, and in most cases
49 their maintainers are unresponsive, gone missing, or not listening to bug
50 reports for other reasons.
51
52 See COMPARISON, below, for a comparison to some other JSON modules.
53
54 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
55 vice versa.
56
57 =head2 FEATURES
58
59 =over 4
60
61 =item * correct Unicode handling
62
63 This module knows how to handle Unicode, and even documents how and when
64 it does so.
65
66 =item * round-trip integrity
67
68 When you serialise a perl data structure using only datatypes supported
69 by JSON, the deserialised data structure is identical on the Perl level.
70 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
71 like a number).
72
73 =item * strict checking of JSON correctness
74
75 There is no guessing, no generating of illegal JSON texts by default,
76 and only JSON is accepted as input by default (the latter is a security
77 feature).
78
79 =item * fast
80
81 Compared to other JSON modules, this module compares favourably in terms
82 of speed, too.
83
84 =item * simple to use
85
86 This module has both a simple functional interface as well as an OO
87 interface.
88
89 =item * reasonably versatile output formats
90
91 You can choose between the most compact guaranteed single-line format
92 possible (nice for simple line-based protocols), a pure-ascii format
93 (for when your transport is not 8-bit clean, still supports the whole
94 Unicode range), or a pretty-printed format (for when you want to read that
95 stuff). Or you can combine those features in whatever way you like.
96
97 =back
98
99 =cut
100
101 package JSON::XS;
102
103 use strict;
104
105 our $VERSION = '2.01';
106 our @ISA = qw(Exporter);
107
108 our @EXPORT = qw(encode_json decode_json to_json from_json);
109
110 sub to_json($) {
111 require Carp;
112 Carp::croak ("JSON::XS::to_json has been renamed to encode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
113 }
114
115 sub from_json($) {
116 require Carp;
117 Carp::croak ("JSON::XS::from_json has been renamed to decode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
118 }
119
120 use Exporter;
121 use XSLoader;
122
123 =head1 FUNCTIONAL INTERFACE
124
125 The following convenience methods are provided by this module. They are
126 exported by default:
127
128 =over 4
129
130 =item $json_text = encode_json $perl_scalar
131
132 Converts the given Perl data structure to a UTF-8 encoded, binary string
133 (that is, the string contains octets only). Croaks on error.
134
135 This function call is functionally identical to:
136
137 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
138
139 except being faster.
140
141 =item $perl_scalar = decode_json $json_text
142
143 The opposite of C<encode_json>: expects a UTF-8 (binary) string and tries
144 to parse that as a UTF-8 encoded JSON text, returning the resulting
145 reference. Croaks on error.
146
147 This function call is functionally identical to:
148
149 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
150
151 except being faster.
152
153 =item $is_boolean = JSON::XS::is_bool $scalar
154
155 Returns true if the passed scalar represents either JSON::XS::true or
156 JSON::XS::false, two constants that act like C<1> and C<0>, respectively
157 and are used to represent JSON C<true> and C<false> values in Perl.
158
159 See MAPPING, below, for more information on how JSON values are mapped to
160 Perl.
161
162 =back
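
As a rough sketch of how the functions above fit together, the following round-trips a small structure through C<encode_json> and C<decode_json> and checks a boolean with C<JSON::XS::is_bool>. The sample data and variable names are invented for this example.

```perl
use strict;
use warnings;
use JSON::XS;

# Sample data (invented for this example); "\x{e9}" is U+00E9.
my $data = {
    name => "caf\x{e9}",
    tags => ["a", "b"],
    ok   => JSON::XS::true (),
};

my $octets = encode_json $data;    # UTF-8 encoded, octets only
my $copy   = decode_json $octets;  # back to a Perl data structure

print "name survived the round trip\n" if $copy->{name} eq $data->{name};
print "ok is a JSON boolean\n"         if JSON::XS::is_bool ($copy->{ok});
print "two tags\n"                     if @{ $copy->{tags} } == 2;
```

Note that C<$octets> is meant for bytewise I/O; it is not a Unicode string any more.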
163
164
165 =head1 A FEW NOTES ON UNICODE AND PERL
166
167 Since this often leads to confusion, here are a few very clear words on
168 how Unicode works in Perl, modulo bugs.
169
170 =over 4
171
172 =item 1. Perl strings can store characters with ordinal values > 255.
173
174 This enables you to store Unicode characters as single characters in a
175 Perl string - very natural.
176
177 =item 2. Perl does I<not> associate an encoding with your strings.
178
179 Unless you force it to, e.g. when matching it against a regex, or printing
180 the scalar to a file, in which case Perl either interprets your string as
181 locale-encoded text, octets/binary, or as Unicode, depending on various
182 settings. In no case is an encoding stored together with your data, it is
183 I<use> that decides encoding, not any magical metadata.
184
185 =item 3. The internal utf-8 flag has no meaning with regards to the
186 encoding of your string.
187
188 Just ignore that flag unless you debug a Perl bug, a module written in
189 XS or want to dive into the internals of perl. Otherwise it will only
190 confuse you, as, despite the name, it says nothing about how your string
191 is encoded. You can have Unicode strings with that flag set, with that
192 flag clear, and you can have binary data with that flag set and that flag
193 clear. Other possibilities exist, too.
194
195 If you didn't know about that flag, all the better: pretend it doesn't
196 exist.
197
198 =item 4. A "Unicode String" is simply a string where each character can be
199 validly interpreted as a Unicode codepoint.
200
201 If you have UTF-8 encoded data, it is no longer a Unicode string, but a
202 Unicode string encoded in UTF-8, giving you a binary string.
203
204 =item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.
205
206 It's a fact. Learn to live with it.
207
208 =back
209
210 I hope this helps :)
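
To make points 4 and 5 a little more concrete, here is a small sketch (the sample string is invented) that encodes the same one-character string once as a Unicode string and once as UTF-8 octets, then compares the lengths of the two results.

```perl
use strict;
use warnings;
use JSON::XS;

my $text = "\x{263a}";    # a single character with an ordinal value > 255

my $chars  = JSON::XS->new->encode ([$text]);        # Unicode string
my $octets = JSON::XS->new->utf8->encode ([$text]);  # UTF-8 encoded octets

# The Unicode string holds one character for U+263A, the UTF-8 encoded
# string holds three octets for it.
printf "%d characters, %d octets\n", length $chars, length $octets;
# prints: 5 characters, 7 octets
```

Both results describe the same JSON text; only the second one is suitable for writing to a binary file handle as-is.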
211
212
213 =head1 OBJECT-ORIENTED INTERFACE
214
215 The object oriented interface lets you configure your own encoding or
216 decoding style, within the limits of supported formats.
217
218 =over 4
219
220 =item $json = new JSON::XS
221
222 Creates a new JSON::XS object that can be used to de/encode JSON
223 strings. All boolean flags described below are by default I<disabled>.
224
225 The mutators for flags all return the JSON object again and thus calls can
226 be chained:
227
228 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
229 => {"a": [1, 2]}
230
231 =item $json = $json->ascii ([$enable])
232
233 =item $enabled = $json->get_ascii
234
235 If C<$enable> is true (or missing), then the C<encode> method will not
236 generate characters outside the code range C<0..127> (which is ASCII). Any
237 Unicode characters outside that range will be escaped using either a
238 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
239 as per RFC4627. The resulting encoded JSON text can be treated as a native
240 Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
241 or any other superset of ASCII.
242
243 If C<$enable> is false, then the C<encode> method will not escape Unicode
244 characters unless required by the JSON syntax or other flags. This results
245 in a faster and more compact format.
246
247 The main use for this flag is to produce JSON texts that can be
248 transmitted over a 7-bit channel, as the encoded JSON texts will not
249 contain any 8 bit characters.
250
251 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
252 => ["\ud801\udc01"]
253
254 =item $json = $json->latin1 ([$enable])
255
256 =item $enabled = $json->get_latin1
257
258 If C<$enable> is true (or missing), then the C<encode> method will encode
259 the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
260 outside the code range C<0..255>. The resulting string can be treated as a
261 latin1-encoded JSON text or a native Unicode string. The C<decode> method
262 will not be affected in any way by this flag, as C<decode> by default
263 expects Unicode, which is a strict superset of latin1.
264
265 If C<$enable> is false, then the C<encode> method will not escape Unicode
266 characters unless required by the JSON syntax or other flags.
267
268 The main use for this flag is efficiently encoding binary data as JSON
269 text, as most octets will not be escaped, resulting in a smaller encoded
270 size. The disadvantage is that the resulting JSON text is encoded
271 in latin1 (and must correctly be treated as such when storing and
272 transferring), a rare encoding for JSON. It is therefore most useful when
273 you want to store data structures known to contain binary data efficiently
274 in files or databases, not when talking to other JSON encoders/decoders.
275
276 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
277 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
278
279 =item $json = $json->utf8 ([$enable])
280
281 =item $enabled = $json->get_utf8
282
283 If C<$enable> is true (or missing), then the C<encode> method will encode
284 the JSON result into UTF-8, as required by many protocols, while the
285 C<decode> method expects to be handed a UTF-8-encoded string. Please
286 note that UTF-8-encoded strings do not contain any characters outside the
287 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
288 versions, enabling this option might enable autodetection of the UTF-16
289 and UTF-32 encoding families, as described in RFC4627.
290
291 If C<$enable> is false, then the C<encode> method will return the JSON
292 string as a (non-encoded) Unicode string, while C<decode> thus expects a
293 Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
294 to be done yourself, e.g. using the Encode module.
295
296 Example, output UTF-16BE-encoded JSON:
297
298 use Encode;
299 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
300
301 Example, decode UTF-32LE-encoded JSON:
302
303 use Encode;
304 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
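
The difference between C<ascii>, C<latin1> and C<utf8> is easiest to see side by side. The following sketch encodes the same (invented) sample string with each flag in turn.

```perl
use strict;
use warnings;
use JSON::XS;

# U+00E9 fits into latin1, U+20AC (the euro sign) does not.
my $data = ["\x{e9}\x{20ac}"];

my $ascii  = JSON::XS->new->ascii->encode ($data);   # both characters escaped
my $latin1 = JSON::XS->new->latin1->encode ($data);  # only U+20AC escaped
my $utf8   = JSON::XS->new->utf8->encode ($data);    # nothing escaped, UTF-8 octets

print "ascii:  $ascii\n";                            # ["\u00e9\u20ac"]
printf "latin1: %d characters\n", length $latin1;    # 11
printf "utf8:   %d octets\n",     length $utf8;      # 9
```

All three results decode to the same Perl string, provided the decoder is told which encoding it is given.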
305
306 =item $json = $json->pretty ([$enable])
307
308 This enables (or disables) all of the C<indent>, C<space_before> and
309 C<space_after> (and in the future possibly more) flags in one call to
310 generate the most readable (or most compact) form possible.
311
312 Example, pretty-print some simple structure:
313
314 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
315 =>
316 {
317    "a" : [
318       1,
319       2
320    ]
321 }
322
323 =item $json = $json->indent ([$enable])
324
325 =item $enabled = $json->get_indent
326
327 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
328 format as output, putting every array member or object/hash key-value pair
329 into its own line, indenting them properly.
330
331 If C<$enable> is false, no newlines or indenting will be produced, and the
332 resulting JSON text is guaranteed not to contain any C<newlines>.
333
334 This setting has no effect when decoding JSON texts.
335
336 =item $json = $json->space_before ([$enable])
337
338 =item $enabled = $json->get_space_before
339
340 If C<$enable> is true (or missing), then the C<encode> method will add an extra
341 optional space before the C<:> separating keys from values in JSON objects.
342
343 If C<$enable> is false, then the C<encode> method will not add any extra
344 space at those places.
345
346 This setting has no effect when decoding JSON texts. You will also
347 most likely combine this setting with C<space_after>.
348
349 Example, space_before enabled, space_after and indent disabled:
350
351 {"key" :"value"}
352
353 =item $json = $json->space_after ([$enable])
354
355 =item $enabled = $json->get_space_after
356
357 If C<$enable> is true (or missing), then the C<encode> method will add an extra
358 optional space after the C<:> separating keys from values in JSON objects
359 and extra whitespace after the C<,> separating key-value pairs and array
360 members.
361
362 If C<$enable> is false, then the C<encode> method will not add any extra
363 space at those places.
364
365 This setting has no effect when decoding JSON texts.
366
367 Example, space_before and indent disabled, space_after enabled:
368
369 {"key": "value"}
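
A short sketch comparing the whitespace flags on the same (invented) one-key hash may help; it also shows one of the C<get_*> accessors.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { key => "value" };

my $compact = JSON::XS->new->encode ($data);                # {"key":"value"}
my $before  = JSON::XS->new->space_before->encode ($data);  # {"key" :"value"}
my $after   = JSON::XS->new->space_after->encode ($data);   # {"key": "value"}
my $pretty  = JSON::XS->new->pretty->encode ($data);        # multiline

print "$_\n" for $compact, $before, $after;
print $pretty;

# The get_* accessors report the current state of a flag.
my $json = JSON::XS->new->space_after;
print "space_after is on\n" if $json->get_space_after;
```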
370
371 =item $json = $json->relaxed ([$enable])
372
373 =item $enabled = $json->get_relaxed
374
375 If C<$enable> is true (or missing), then C<decode> will accept some
376 extensions to normal JSON syntax (see below). C<encode> will not be
377 affected in any way. I<Be aware that this option makes you accept invalid
378 JSON texts as if they were valid!> I suggest using this option only to
379 parse application-specific files written by humans (configuration files,
380 resource files etc.).
381
382 If C<$enable> is false (the default), then C<decode> will only accept
383 valid JSON texts.
384
385 Currently accepted extensions are:
386
387 =over 4
388
389 =item * list items can have an end-comma
390
391 JSON I<separates> array elements and key-value pairs with commas. This
392 can be annoying if you write JSON texts manually and want to be able to
393 quickly append elements, so this extension accepts a comma at the end of
394 such items, not just between them:
395
396 [
397 1,
398 2, <- this comma not normally allowed
399 ]
400 {
401 "k1": "v1",
402 "k2": "v2", <- this comma not normally allowed
403 }
404
405 =item * shell-style '#'-comments
406
407 Whenever JSON allows whitespace, shell-style comments are additionally
408 allowed. They are terminated by the first carriage-return or line-feed
409 character, after which more white-space and comments are allowed.
410
411 [
412 1, # this comment not allowed in JSON
413 # neither this one...
414 ]
415
416 =back
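
Both extensions can be combined. The following is a minimal sketch that parses a hand-written configuration text (invented for this example) with C<relaxed> enabled and shows that the same text is rejected without it.

```perl
use strict;
use warnings;
use JSON::XS;

# A hand-written "configuration file" (invented for this example) with
# comments and trailing commas, neither of which is valid JSON.
my $config_text = <<'EOT';
{
   "hosts": ["alpha", "beta",],   # trailing comma in the array
   "port": 8080,                  # trailing comma in the object
}
EOT

my $config = JSON::XS->new->relaxed->decode ($config_text);
print "port $config->{port}, ", scalar @{ $config->{hosts} }, " hosts\n";

# Without relaxed, the same text is rejected.
my $strict = eval { JSON::XS->new->decode ($config_text) };
print "strict decode failed\n" unless defined $strict;
```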
417
418 =item $json = $json->canonical ([$enable])
419
420 =item $enabled = $json->get_canonical
421
422 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
423 by sorting their keys. This adds a comparatively high overhead.
424
425 If C<$enable> is false, then the C<encode> method will output key-value
426 pairs in the order Perl stores them (which will likely change between runs
427 of the same script).
428
429 This option is useful if you want the same data structure to be encoded as
430 the same JSON text (given the same overall settings). If it is disabled,
431 the same hash might be encoded differently even if it contains the same data,
432 as key-value pairs have no inherent ordering in Perl.
433
434 This setting has no effect when decoding JSON texts.
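
For illustration, a small sketch (the hash contents are invented) showing that two hashes with the same data encode to the same text once C<canonical> is enabled:

```perl
use strict;
use warnings;
use JSON::XS;

my %h = (b => 2, a => 1, c => 3);

my $json   = JSON::XS->new->canonical;
my $first  = $json->encode (\%h);
my $second = $json->encode ({ c => 3, a => 1, b => 2 });

print "$first\n";                            # {"a":1,"b":2,"c":3}
print "identical\n" if $first eq $second;
```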
435
436 =item $json = $json->allow_nonref ([$enable])
437
438 =item $enabled = $json->get_allow_nonref
439
440 If C<$enable> is true (or missing), then the C<encode> method can convert a
441 non-reference into its corresponding string, number or null JSON value,
442 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
443 values instead of croaking.
444
445 If C<$enable> is false, then the C<encode> method will croak if it isn't
446 passed an arrayref or hashref, as JSON texts must either be an object
447 or array. Likewise, C<decode> will croak if given something that is not a
448 JSON object or array.
449
450 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
451 resulting in an invalid JSON text:
452
453 JSON::XS->new->allow_nonref->encode ("Hello, World!")
454 => "Hello, World!"
455
456 =item $json = $json->allow_blessed ([$enable])
457
458 =item $enabled = $json->get_allow_blessed
459
460 If C<$enable> is true (or missing), then the C<encode> method will not
461 barf when it encounters a blessed reference. Instead, the value of the
462 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
463 disabled or no C<TO_JSON> method found) or a representation of the
464 object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
465 encoded. Has no effect on C<decode>.
466
467 If C<$enable> is false (the default), then C<encode> will throw an
468 exception when it encounters a blessed object.
469
470 =item $json = $json->convert_blessed ([$enable])
471
472 =item $enabled = $json->get_convert_blessed
473
474 If C<$enable> is true (or missing), then C<encode>, upon encountering a
475 blessed object, will check for the availability of the C<TO_JSON> method
476 on the object's class. If found, it will be called in scalar context
477 and the resulting scalar will be encoded instead of the object. If no
478 C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
479 to do.
480
481 The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
482 returns other blessed objects, those will be handled in the same
483 way. C<TO_JSON> must take care of not causing an endless recursion cycle
484 (== crash) in this case. The name of C<TO_JSON> was chosen because other
485 methods called by the Perl core (== not by the user of the object) are
486 usually in upper case letters and to avoid collisions with any C<to_json>
487 function or method.
488
489 This setting does not yet influence C<decode> in any way, but in the
490 future, global hooks might get installed that influence C<decode> and are
491 enabled by this setting.
492
493 If C<$enable> is false, then the C<allow_blessed> setting will decide what
494 to do when a blessed object is found.
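
The interplay between C<allow_blessed> and C<convert_blessed> can be sketched as follows. The classes C<Point> and C<Opaque> are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use JSON::XS;

{
    package Point;    # hypothetical class with a TO_JSON method
    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }
    sub TO_JSON {
        my ($self) = @_;
        return { x => $self->{x}, y => $self->{y} };
    }
}

{
    package Opaque;   # hypothetical class without TO_JSON
    sub new { return bless {}, shift }
}

my $point = Point->new (x => 1, y => 2);

# convert_blessed: TO_JSON is called and its result is encoded instead.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);
print "$converted\n";    # [{"x":1,"y":2}]

# allow_blessed alone: the object is encoded as null.
my $nulled = JSON::XS->new->allow_blessed->encode ([Opaque->new]);
print "$nulled\n";       # [null]

# Neither flag enabled: encode croaks.
my $failed = !eval { JSON::XS->new->encode ([Opaque->new]); 1 };
print "encode croaked\n" if $failed;
```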
495
496 =item $json = $json->filter_json_object ([$coderef->($hashref)])
497
498 When C<$coderef> is specified, it will be called from C<decode> each
499 time it decodes a JSON object. The only argument is a reference to the
500 newly-created hash. If the code reference returns a single scalar (which
501 need not be a reference), this value (i.e. a copy of that scalar to avoid
502 aliasing) is inserted into the deserialised data structure. If it returns
503 an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
504 original deserialised hash will be inserted. This setting can slow down
505 decoding considerably.
506
507 When C<$coderef> is omitted or undefined, any existing callback will
508 be removed and C<decode> will not change the deserialised hash in any
509 way.
510
511 Example, convert all JSON objects into the integer 5:
512
513 my $js = JSON::XS->new->filter_json_object (sub { 5 });
514 # returns [5]
515 $js->decode ('[{}]');
516 # throws an exception because allow_nonref is not enabled
517 # so a lone 5 is not allowed.
518 $js->decode ('{"a":1, "b":2}');
519
520 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
521
522 Works remotely similar to C<filter_json_object>, but is only called for
523 JSON objects having a single key named C<$key>.
524
525 This C<$coderef> is called before the one specified via
526 C<filter_json_object>, if any. It gets passed the single value in the JSON
527 object. If it returns a single value, it will be inserted into the data
528 structure. If it returns nothing (not even C<undef> but the empty list),
529 the callback from C<filter_json_object> will be called next, as if no
530 single-key callback were specified.
531
532 If C<$coderef> is omitted or undefined, the corresponding callback will be
533 disabled. There can only ever be one callback for a given key.
534
535 As this callback gets called less often than the C<filter_json_object>
536 one, decoding speed will not usually suffer as much. Therefore, single-key
537 objects make excellent targets to serialise Perl objects into, especially
538 as single-key JSON objects are as close to the type-tagged value concept
539 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
540 support this in any way, so you need to make sure your data never looks
541 like a serialised Perl hash.
542
543 Typical names for the single object key are C<__class_whatever__>, or
544 C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
545 things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
546 with real hashes.
547
548 Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
549 into the corresponding C<< $WIDGET{<id>} >> object:
550
551 # return whatever is in $WIDGET{5}:
552 JSON::XS
553 ->new
554 ->filter_json_single_key_object (__widget__ => sub {
555 $WIDGET{ $_[0] }
556 })
557 ->decode ('{"__widget__": 5}')
558
559 # this can be used with a TO_JSON method in some "widget" class
560 # for serialisation to json:
561 sub WidgetBase::TO_JSON {
562 my ($self) = @_;
563
564 unless ($self->{id}) {
565 $self->{id} = ..get..some..id..;
566 $WIDGET{$self->{id}} = $self;
567 }
568
569 { __widget__ => $self->{id} }
570 }
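
Putting the two halves together, a self-contained round trip might look roughly like the sketch below. The C<Widget> class, the C<%WIDGET> registry and the counter used to hand out ids are all assumptions made for this example, not part of JSON::XS.

```perl
use strict;
use warnings;
use JSON::XS;

our %WIDGET;        # id => object registry (hypothetical)
my $next_id = 1;    # hypothetical source of ids

{
    package Widget;    # hypothetical class
    sub new { my ($class, $label) = @_; return bless { label => $label }, $class }
    sub TO_JSON {
        my ($self) = @_;
        unless ($self->{id}) {
            $self->{id} = $next_id++;
            $main::WIDGET{ $self->{id} } = $self;
        }
        return { __widget__ => $self->{id} };
    }
}

my $json = JSON::XS->new
    ->convert_blessed
    ->filter_json_single_key_object (__widget__ => sub { $WIDGET{ $_[0] } });

my $widget = Widget->new ("button");
my $text   = $json->encode ([$widget]);    # [{"__widget__":1}]
my $back   = $json->decode ($text);

print "$text\n";
print "got the same object back\n" if $back->[0] == $widget;
```

The object itself never travels through the JSON text, only its id does, so this only works while the registry still holds the object.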
571
572 =item $json = $json->shrink ([$enable])
573
574 =item $enabled = $json->get_shrink
575
576 Perl usually over-allocates memory a bit when allocating space for
577 strings. This flag optionally resizes strings generated by either
578 C<encode> or C<decode> to their minimum size possible. This can save
579 memory when your JSON texts are either very very long or you have many
580 short strings. It will also try to downgrade any strings to octet-form
581 if possible: perl stores strings internally either in an encoding called
582 UTF-X or in octet-form. The latter cannot store everything but uses less
583 space in general (and some buggy Perl or C code might even rely on that
584 internal representation being used).
585
586 The actual definition of what shrink does might change in future versions,
587 but it will always try to save space at the expense of time.
588
589 If C<$enable> is true (or missing), the string returned by C<encode> will
590 be shrunk-to-fit, while all strings generated by C<decode> will also be
591 shrunk-to-fit.
592
593 If C<$enable> is false, then the normal perl allocation algorithms are used.
594 If you work with your data, then this is likely to be faster.
595
596 In the future, this setting might control other things, such as converting
597 strings that look like integers or floats into integers or floats
598 internally (there is no difference on the Perl level), saving space.
599
600 =item $json = $json->max_depth ([$maximum_nesting_depth])
601
602 =item $max_depth = $json->get_max_depth
603
604 Sets the maximum nesting level (default C<512>) accepted while encoding
605 or decoding. If the JSON text or Perl data structure has an equal or
606 higher nesting level then this limit, then the encoder and decoder will
607 stop and croak at that point.
608
609 Nesting level is defined by number of hash- or arrayrefs that the encoder
610 needs to traverse to reach a given point or the number of C<{> or C<[>
611 characters without their matching closing parenthesis crossed to reach a
612 given character in a string.
613
614 Setting the maximum depth to one disallows any nesting, so that ensures
615 that the object is only a single hash/object or array.
616
617 The argument to C<max_depth> will be rounded up to the next highest power
618 of two. If no argument is given, the highest possible setting will be
619 used, which is rarely useful.
620
621 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
622
623 =item $json = $json->max_size ([$maximum_string_size])
624
625 =item $max_size = $json->get_max_size
626
627 Set the maximum length a JSON text may have (in bytes) where decoding is
628 being attempted. The default is C<0>, meaning no limit. When C<decode>
629 is called on a string longer then this number of characters it will not
630 attempt to decode the string but throw an exception. This setting has no
631 effect on C<encode> (yet).
632
633 The argument to C<max_size> will be rounded up to the next B<highest>
634 power of two (so may be more than requested). If no argument is given, the
635 limit check will be deactivated (same as when C<0> is specified).
636
637 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
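
As a minimal sketch of how C<max_depth> and C<max_size> behave, assuming arbitrarily chosen limits that are already powers of two (so rounding does not change them):

```perl
use strict;
use warnings;
use JSON::XS;

# Limits chosen arbitrarily for this example.
my $json = JSON::XS->new->max_depth (4)->max_size (64);

# A small, shallow text is within both limits.
my $ok = eval { $json->decode ('[[1,2],[3,4]]') };
print "shallow text decoded\n" if $ok;

# Eight levels of nesting exceed the depth limit.
my $deep = eval { $json->decode ('[[[[[[[[1]]]]]]]]') };
print "deep text rejected: $@" unless $deep;

# A text of more than 64 octets exceeds the size limit.
my $long = eval { $json->decode ('[' . join (',', (1) x 100) . ']') };
print "long text rejected: $@" unless $long;
```

Wrapping C<decode> in C<eval> is the usual way to turn the exception into something the caller can handle.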
638
639 =item $json_text = $json->encode ($perl_scalar)
640
641 Converts the given Perl data structure (a simple scalar or a reference
642 to a hash or array) to its JSON representation. Simple scalars will be
643 converted into JSON string or number sequences, while references to arrays
644 become JSON arrays and references to hashes become JSON objects. Undefined
645 Perl values (e.g. C<undef>) become JSON C<null> values. JSON C<true> and
646 C<false> values are only generated for the special values described in
646 MAPPING, below.
647
648 =item $perl_scalar = $json->decode ($json_text)
649
650 The opposite of C<encode>: expects a JSON text and tries to parse it,
651 returning the resulting simple scalar or reference. Croaks on error.
652
653 JSON numbers and strings become simple Perl scalars. JSON arrays become
654 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
655 C<JSON::XS::true>, C<false> becomes C<JSON::XS::false> (which act like
655 C<1> and C<0>) and C<null> becomes C<undef>.
656
657 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
658
659 This works like the C<decode> method, but instead of raising an exception
660 when there is trailing garbage after the first JSON object, it will
661 silently stop parsing there and return the number of characters consumed
662 so far.
663
664 This is useful if your JSON texts are not delimited by an outer protocol
665 (which is not the brightest thing to do in the first place) and you need
666 to know where the JSON text ends.
667
668 JSON::XS->new->decode_prefix ("[1] the tail")
669 => ([1], 3)
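
One way to use C<decode_prefix> is to peel JSON texts off the front of a buffer one at a time. This is only a sketch with an invented buffer; real input would need handling for incomplete texts and for whitespace between them.

```perl
use strict;
use warnings;
use JSON::XS;

my $json   = JSON::XS->new;
my $buffer = '[1,2]{"a":3}[4]';    # three JSON texts, back to back
my @values;

while (length $buffer) {
    my ($value, $consumed) = $json->decode_prefix ($buffer);
    push @values, $value;
    substr $buffer, 0, $consumed, "";    # drop what was just parsed
}

print scalar @values, " values decoded\n";    # 3 values decoded
```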
670
671 =back
672
673
674 =head1 MAPPING
675
676 This section describes how JSON::XS maps Perl values to JSON values and
677 vice versa. These mappings are designed to "do the right thing" in most
678 circumstances automatically, preserving round-tripping characteristics
679 (what you put in comes out as something equivalent).
680
681 For the more enlightened: note that in the following descriptions,
682 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
683 refers to the abstract Perl language itself.
684
685
686 =head2 JSON -> PERL
687
688 =over 4
689
690 =item object
691
692 A JSON object becomes a reference to a hash in Perl. No ordering of object
693 keys is preserved (JSON does not preserve object key ordering itself).
694
695 =item array
696
697 A JSON array becomes a reference to an array in Perl.
698
699 =item string
700
701 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
702 are represented by the same codepoints in the Perl string, so no manual
703 decoding is necessary.
704
705 =item number
706
707 A JSON number becomes either an integer, numeric (floating point) or
708 string scalar in perl, depending on its range and any fractional parts. On
709 the Perl level, there is no difference between those as Perl handles all
710 the conversion details, but an integer may take slightly less memory and
711 might represent more values exactly than (floating point) numbers.
712
713 If the number consists of digits only, JSON::XS will try to represent
714 it as an integer value. If that fails, it will try to represent it as
715 a numeric (floating point) value if that is possible without loss of
716 precision. Otherwise it will preserve the number as a string value.
717
718 Numbers containing a fractional or exponential part will always be
719 represented as numeric (floating point) values, possibly at a loss of
720 precision.
721
722 This might create round-tripping problems as numbers might become strings,
723 but as Perl is typeless there is no other way to do it.
724
725 =item true, false
726
727 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
728 respectively. They are overloaded to act almost exactly like the numbers
729 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
730 the C<JSON::XS::is_bool> function.
731
732 =item null
733
734 A JSON null atom becomes C<undef> in Perl.
735
736 =back
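
The decoding rules above can be sketched with one JSON text (invented for this example) that contains each kind of value:

```perl
use strict;
use warnings;
use JSON::XS;

my $v = decode_json
    '[42, 1.5, 123456789012345678901234567890, "text", true, null]';

print "integer\n"        if $v->[0] == 42;
print "floating point\n" if $v->[1] == 1.5;

# Too large for an integer or a lossless floating point value,
# so the digits are kept as a string.
print "digits preserved\n" if $v->[2] eq "123456789012345678901234567890";

print "string\n"               if $v->[3] eq "text";
print "boolean, acts like 1\n" if JSON::XS::is_bool ($v->[4]) && $v->[4];
print "undef\n"                unless defined $v->[5];
```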
737
738
739 =head2 PERL -> JSON
740
741 The mapping from Perl to JSON is slightly more difficult, as Perl is a
742 truly typeless language, so we can only guess which JSON type is meant by
743 a Perl value.
744
745 =over 4
746
747 =item hash references
748
749 Perl hash references become JSON objects. As there is no inherent ordering
750 in hash keys (or JSON objects), they will usually be encoded in a
751 pseudo-random order that can change between runs of the same program but
752 stays generally the same within a single run of a program. JSON::XS can
753 optionally sort the hash keys (determined by the I<canonical> flag), so
754 the same datastructure will serialise to the same JSON text (given same
755 settings and version of JSON::XS), but this incurs a runtime overhead
756 and is only rarely useful, e.g. when you want to compare some JSON text
757 against another for equality.
758
759 =item array references
760
761 Perl array references become JSON arrays.
762
763 =item other references
764
765 Other unblessed references are generally not allowed and will cause an
766 exception to be thrown, except for references to the integers C<0> and
767 C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
768 also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
769
770 encode_json [\0,JSON::XS::true] # yields [false,true]
771
772 =item JSON::XS::true, JSON::XS::false
773
774 These special values become JSON true and JSON false values,
775 respectively. You can also use C<\1> and C<\0> directly if you want.
776
777 =item blessed objects
778
779 Blessed objects are not directly representable in JSON. By default,
780 C<encode> throws an exception when it encounters one; see the
781 C<allow_blessed> and C<convert_blessed> options, above, for ways to deal
781 with this.
782
783 =item simple scalars
784
785 Simple Perl scalars (any scalar that is not a reference) are the most
786 difficult objects to encode: JSON::XS will encode undefined scalars as
787 JSON null value, scalars that have last been used in a string context
788 before encoding as JSON strings and anything else as number value:
789
790 # dump as number
791 encode_json [2] # yields [2]
792 encode_json [-3.0e17] # yields [-3e+17]
793 my $value = 5; encode_json [$value] # yields [5]
794
795 # used as string, so dump as string
796 print $value;
797 encode_json [$value] # yields ["5"]
798
799 # undef becomes null
800 encode_json [undef] # yields [null]
801
802 You can force the type to be a JSON string by stringifying it:
803
804 my $x = 3.1; # some variable containing a number
805 "$x"; # stringified
806 $x .= ""; # another, more awkward way to stringify
807 print $x; # perl does it for you, too, quite often
808
809 You can force the type to be a JSON number by numifying it:
810
811 my $x = "3"; # some variable containing a string
812 $x += 0; # numify it, ensuring it will be dumped as a number
813 $x *= 1; # same thing, the choice is yours.
814
815 You can not currently force the type in other, less obscure, ways. Tell me
816 if you need this capability.
817
818 =back
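
The scalar and boolean rules above can be tried out with a short script. This is a minimal sketch, assuming JSON::XS is installed; the variable names are illustrative only. It builds fresh copies instead of modifying the variables in place, so the originals keep whatever type they had.

```perl
use strict;
use warnings;
use JSON::XS;

my $number = 3.1;   # a variable holding a number
my $string = "3";   # a variable holding a string

# "$number" builds a fresh string and $string + 0 builds a fresh number,
# so neither original variable is touched.
my $forced = encode_json [ "$number", $string + 0 ];
print "$forced\n";    # prints ["3.1",3]

# JSON true and false decode to special objects that is_bool recognises;
# an ordinary number such as 1 is not a boolean.
my $decoded = decode_json '[true, false, 1]';
my @is_bool = map { JSON::XS::is_bool($_) ? 1 : 0 } @$decoded;
print "@is_bool\n";   # prints 1 1 0
```

Copying on the way into the encoder avoids the surprise shown earlier, where merely printing a variable changes how it is serialised later.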
819
820
821 =head1 COMPARISON
822
823 As already mentioned, this module was created because none of the existing
824 JSON modules could be made to work correctly. First I will describe the
825 problems (or pleasures) I encountered with various existing JSON modules,
826 followed by some benchmark values. JSON::XS was designed not to suffer
827 from any of these problems or limitations.
828
829 =over 4
830
831 =item JSON 1.07
832
833 Slow (but very portable, as it is written in pure Perl).
834
835 Undocumented/buggy Unicode handling (how JSON handles Unicode values is
836 undocumented. One can get far by feeding it Unicode strings and doing
837 en-/decoding oneself, but Unicode escapes are not working properly).
838
839 No round-tripping (strings get clobbered if they look like numbers, e.g.
840 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
841 decode into the number 2).
842
843 =item JSON::PC 0.01
844
845 Very fast.
846
847 Undocumented/buggy Unicode handling.
848
849 No round-tripping.
850
851 Has problems handling many Perl values (e.g. regex results and other magic
852 values will make it croak).
853
854 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
855 which is not a valid JSON text).
856
857 Unmaintained (maintainer unresponsive for many months, bugs are not
858 getting fixed).
859
860 =item JSON::Syck 0.21
861
862 Very buggy (often crashes).
863
864 Very inflexible (no human-readable format supported, format pretty much
865 undocumented. I need at least a format for easy reading by humans and a
866 single-line compact format for use in a protocol, and preferably a way to
867 generate ASCII-only JSON texts).
868
869 Completely broken (and confusingly documented) Unicode handling (Unicode
870 escapes are not working properly, you need to set ImplicitUnicode to
871 I<different> values on en- and decoding to get symmetric behaviour).
872
873 No round-tripping (simple cases work, but this depends on whether the scalar
874 value was used in a numeric context or not).
875
876 Dumping hashes may skip hash values depending on iterator state.
877
878 Unmaintained (maintainer unresponsive for many months, bugs are not
879 getting fixed).
880
881 Does not check input for validity (i.e. will accept non-JSON input and
882 return "something" instead of raising an exception. This is a security
883 issue: imagine two banks transferring money between each other using
884 JSON. One bank might parse a given non-JSON request and deduct money,
885 while the other might reject the transaction with a syntax error. While a
886 good protocol will at least recover, that is extra unnecessary work and
887 the transaction will still not succeed).
888
889 =item JSON::DWIW 0.04
890
891 Very fast. Very natural. Very nice.
892
893 Undocumented Unicode handling (but the best of the pack. Unicode escapes
894 still don't get parsed properly).
895
896 Very inflexible.
897
898 No round-tripping.
899
900 Does not generate valid JSON texts (key strings are often unquoted, empty keys
901 result in nothing being output).
902
903 Does not check input for validity.
904
905 =back
906
907
908 =head2 JSON and YAML
909
910 You often hear that JSON is a subset (or a close subset) of YAML. This is,
911 however, a mass hysteria and very far from the truth. In general, there is
912 no way to configure JSON::XS to output a data structure as valid YAML.
913
914 If you really must use JSON::XS to generate YAML, you should use this
915 algorithm (subject to change in future versions):
916
917 my $to_yaml = JSON::XS->new->utf8->space_after (1);
918 my $yaml = $to_yaml->encode ($ref) . "\n";
919
920 This will usually generate JSON texts that also parse as valid
921 YAML. Please note that YAML has hardcoded limits on (simple) object key
922 lengths that JSON doesn't have, so you should make sure that your hash
923 keys are noticeably shorter than the 1024 characters YAML allows.
924
925 There might be other incompatibilities that I am not aware of. In general
926 you should not try to generate YAML with a JSON generator or vice versa,
927 or try to parse JSON with a YAML parser or vice versa: chances are high
928 that you will run into severe interoperability problems.
929
930
931 =head2 SPEED
932
933 It seems that JSON::XS is surprisingly fast, as shown in the following
934 tables. They have been generated with the help of the C<eg/bench> program
935 in the JSON::XS distribution, to make it easy to compare on your own
936 system.
937
938 First comes a comparison between various modules using a very short
939 single-line JSON string:
940
941 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
942 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
943
944 It shows the number of encodes/decodes per second (JSON::XS uses
945 the functional interface, while JSON::XS/2 uses the OO interface
946 with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
947 shrink). Higher is better:
948
949 module | encode | decode |
950 -----------|------------|------------|
951 JSON 1.x | 4990.842 | 4088.813 |
952 JSON::DWIW | 51653.990 | 71575.154 |
953 JSON::PC | 65948.176 | 74631.744 |
954 JSON::PP | 8931.652 | 3817.168 |
955 JSON::Syck | 24877.248 | 27776.848 |
956 JSON::XS | 388361.481 | 227951.304 |
957 JSON::XS/2 | 227951.304 | 218453.333 |
958 JSON::XS/3 | 338250.323 | 218453.333 |
959 Storable | 16500.016 | 135300.129 |
960 -----------+------------+------------+
961
962 That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
963 about three times faster on decoding, and over forty times faster
964 than JSON, even with pretty-printing and key sorting. It also compares
965 favourably to Storable for small amounts of data.
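
A comparison of this kind can be approximated with the core C<Benchmark> module. The following is only a rough sketch and is not the C<eg/bench> program from the distribution; the labels in the hash are made up for illustration, and the numbers it prints will differ from the table above depending on your machine and module versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;

# the short single-line test string quoted above
my $text = '{"method": "handleMessage", "params": ["user1", "we were just talking"], '
         . '"id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}';

my $data  = decode_json $text;

# OO coder with pretty-printing and hash key sorting enabled
my $coder = JSON::XS->new->utf8->pretty->canonical;

# run each candidate for at least one CPU second, then print a comparison chart
cmpthese(-1, {
    functional_encode => sub { encode_json $data },
    functional_decode => sub { decode_json $text },
    oo_pretty_encode  => sub { $coder->encode($data) },
    oo_pretty_decode  => sub { $coder->decode($text) },
});
```

Other modules can be added as further entries in the hash, provided each closure does the same amount of work.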
966
967 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
968 search API (http://nanoref.com/yahooapis/mgPdGg)):
969
970 module | encode | decode |
971 -----------|------------|------------|
972 JSON 1.x | 55.260 | 34.971 |
973 JSON::DWIW | 825.228 | 1082.513 |
974 JSON::PC | 3571.444 | 2394.829 |
975 JSON::PP | 210.987 | 32.574 |
976 JSON::Syck | 552.551 | 787.544 |
977 JSON::XS | 5780.463 | 4854.519 |
978 JSON::XS/2 | 3869.998 | 4798.975 |
979 JSON::XS/3 | 5862.880 | 4798.975 |
980 Storable | 4445.002 | 5235.027 |
981 -----------+------------+------------+
982
983 Again, JSON::XS leads by far (except for Storable which, unsurprisingly,
984 decodes faster).
985
986 On large strings containing lots of high Unicode characters, some modules
987 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
988 will be broken due to missing (or wrong) Unicode handling. Others refuse
989 to decode or encode properly, so it was impossible to prepare a fair
990 comparison table for that case.
991
992
993 =head1 SECURITY CONSIDERATIONS
994
995 When you are using JSON in a protocol, talking to untrusted potentially
996 hostile creatures requires relatively few measures.
997
998 First of all, your JSON decoder should be secure, that is, should not have
999 any buffer overflows. Obviously, this module should ensure that and I am
1000 trying hard to make that true, but you never know.
1001
1002 Second, you need to avoid resource-starving attacks. That means you should
1003 limit the size of JSON texts you accept, or make sure that when your
1004 resources run out, that's just fine (e.g. by using a separate process that
1005 can crash safely). The size of a JSON text in octets or characters is
1006 usually a good indication of the size of the resources required to decode
1007 it into a Perl structure. While JSON::XS can check the size of the JSON
1008 text, it might be too late when you already have it in memory, so you
1009 might want to check the size before you accept the string.
1010
1011 Third, JSON::XS recurses using the C stack when decoding objects and
1012 arrays. The C stack is a limited resource: for instance, on my amd64
1013 machine with 8MB of stack size I can decode around 180k nested arrays but
1014 only 14k nested JSON objects (due to perl itself recursing deeply on croak
1015 to free the temporary). If that is exceeded, the program crashes. To be
1016 conservative, the default nesting limit is set to 512. If your process
1017 has a smaller stack, you should adjust this setting accordingly with the
1018 C<max_depth> method.
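
The second and third points can be combined in a small wrapper. This is a sketch under stated assumptions, not a recommended implementation: C<safe_decode> and C<$MAX_BYTES> are hypothetical names, and both limits are arbitrary values that you would tune for your own protocol. Ideally the size check happens even earlier, before the text is read into memory at all.

```perl
use strict;
use warnings;
use JSON::XS;

my $MAX_BYTES = 64 * 1024;   # hypothetical size limit, tune for your protocol
my $decoder   = JSON::XS->new->utf8->max_depth(32);

# Hypothetical helper: returns the decoded structure,
# or undef if the text is rejected for any reason.
sub safe_decode {
    my ($octets) = @_;

    # check the size before the decoder does any work
    return undef if length($octets) > $MAX_BYTES;

    # decode croaks on syntax errors and on excessive nesting,
    # so trap the exception and report failure as undef
    my $data = eval { $decoder->decode($octets) };
    return $data;
}

my $ok       = safe_decode('{"id": 1}');
my $too_deep = safe_decode(("[" x 1000) . ("]" x 1000));

print defined $ok       ? "accepted\n" : "rejected\n";   # prints accepted
print defined $too_deep ? "accepted\n" : "rejected\n";   # prints rejected
```

The second input is syntactically valid JSON, so it is rejected only because its nesting exceeds the configured depth.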
1019
1020 And last but least, something else could bomb you that I forgot to think
1021 of. In that case, you get to keep the pieces. I am always open for hints,
1022 though...
1023
1024 If you are using JSON::XS to return packets for consumption
1025 by JavaScript scripts in a browser you should have a look at
1026 L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
1027 you are vulnerable to some common attack vectors (which really are browser
1028 design bugs, but it is still you who will have to deal with it, as major
1029 browser developers care only for features, not about doing security
1030 right).
1031
1032
1033 =head1 THREADS
1034
1035 This module is I<not> guaranteed to be thread safe and there are no
1036 plans to change this until Perl gets thread support (as opposed to the
1037 horribly slow so-called "threads" which are simply slow and bloated
1038 process simulations - use fork, it's I<much> faster, cheaper, better).
1039
1040 (It might actually work, but you have been warned).
1041
1042
1043 =head1 BUGS
1044
1045 While the goal of this module is to be correct, that unfortunately does
1046 not mean it's bug-free, only that I think its design is bug-free. It is
1047 still relatively early in its development. If you keep reporting bugs they
1048 will be fixed swiftly, though.
1049
1050 Please refrain from using rt.cpan.org or any other bug reporting
1051 service. I put the contact address into my modules for a reason.
1052
1053 =cut
1054
1055 our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
1056 our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
1057
1058 sub true() { $true }
1059 sub false() { $false }
1060
1061 sub is_bool($) {
1062 UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
1063 # or UNIVERSAL::isa $_[0], "JSON::Literal"
1064 }
1065
1066 XSLoader::load "JSON::XS", $VERSION;
1067
1068 package JSON::XS::Boolean;
1069
1070 use overload
1071 "0+" => sub { ${$_[0]} },
1072 "++" => sub { $_[0] = ${$_[0]} + 1 },
1073 "--" => sub { $_[0] = ${$_[0]} - 1 },
1074 fallback => 1;
1075
1076 1;
1077
1078 =head1 AUTHOR
1079
1080 Marc Lehmann <schmorp@schmorp.de>
1081 http://home.schmorp.de/
1082
1083 =cut
1084