/cvs/JSON-XS/XS.pm
Revision: 1.61
Committed: Wed Sep 12 17:42:36 2007 UTC by root
Branch: MAIN
Changes since 1.60: +1 -1 lines

1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 =head1 SYNOPSIS
6
7 use JSON::XS;
8
9 # exported functions, they croak on error
10 # and expect/generate UTF-8
11
12 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
13 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
14
15 # OO-interface
16
17 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
18 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
19 $perl_scalar = $coder->decode ($unicode_json_text);
20
21 =head1 DESCRIPTION
22
23 This module converts Perl data structures to JSON and vice versa. Its
24 primary goal is to be I<correct> and its secondary goal is to be
25 I<fast>. To reach the latter goal it was written in C.
26
27 As this is the n-th-something JSON module on CPAN, what was the reason
28 to write yet another JSON module? While it seems there are many JSON
29 modules, none of them correctly handle all corner cases, and in most cases
30 their maintainers are unresponsive, gone missing, or not listening to bug
31 reports for other reasons.
32
33 See COMPARISON, below, for a comparison to some other JSON modules.
34
35 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
36 vice versa.
37
38 =head2 FEATURES
39
40 =over 4
41
42 =item * correct unicode handling
43
44 This module knows how to handle Unicode, and even documents how and when
45 it does so.
46
47 =item * round-trip integrity
48
49 When you serialise a perl data structure using only datatypes supported
50 by JSON, the deserialised data structure is identical on the Perl level.
51 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
52 like a number).
53
54 =item * strict checking of JSON correctness
55
56 There is no guessing, no generating of illegal JSON texts by default,
57 and only JSON is accepted as input by default (the latter is a security
58 feature).
59
60 =item * fast
61
62 Compared to other JSON modules, this module compares favourably in terms
63 of speed, too.
64
65 =item * simple to use
66
67 This module has both a simple functional interface as well as an OO
68 interface.
69
70 =item * reasonably versatile output formats
71
72 You can choose between the most compact guaranteed single-line format
73 possible (nice for simple line-based protocols), a pure-ascii format
74 (for when your transport is not 8-bit clean, still supports the whole
75 unicode range), or a pretty-printed format (for when you want to read that
76 stuff). Or you can combine those features in whatever way you like.
77
78 =back
79
80 =cut
81
82 package JSON::XS;
83
84 use strict;
85
86 our $VERSION = '1.5';
87 our @ISA = qw(Exporter);
88
89 our @EXPORT = qw(to_json from_json);
90
91 use Exporter;
92 use XSLoader;
93
94 =head1 FUNCTIONAL INTERFACE
95
96 The following convenience functions are provided by this module. They are
97 exported by default:
98
99 =over 4
100
101 =item $json_text = to_json $perl_scalar
102
103 Converts the given Perl data structure (a simple scalar or a reference to
104 a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
105 octets only). Croaks on error.
106
107 This function call is functionally identical to:
108
109 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
110
111 except being faster.
112
113 =item $perl_scalar = from_json $json_text
114
115 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
116 parse that as a UTF-8 encoded JSON text, returning the resulting simple
117 scalar or reference. Croaks on error.
118
119 This function call is functionally identical to:
120
121 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
122
123 except being faster.
124
125 =item $is_boolean = JSON::XS::is_bool $scalar
126
127 Returns true if the passed scalar represents either JSON::XS::true or
128 JSON::XS::false, two constants that act like C<1> and C<0>, respectively
129 and are used to represent JSON C<true> and C<false> values in Perl.
130
131 See MAPPING, below, for more information on how JSON values are mapped to
132 Perl.
133
134 =back
135
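As a small illustration of C<is_bool>, the following sketch decodes a JSON array and checks which elements came from JSON booleans. It assumes an installed JSON::XS that provides C<JSON::XS::is_bool> as described above; the variable names are made up for this example.

```perl
use strict;
use warnings;
use JSON::XS;

# A JSON array holding two booleans, a number and a string.
my $list = JSON::XS->new->decode ('[true, false, 1, "true"]');

# 1 for values that came from JSON true/false, 0 for everything else.
my @flags = map { JSON::XS::is_bool ($_) ? 1 : 0 } @$list;

print "@flags\n";   # 1 1 0 0
```

Note that the number C<1> and the string C<"true"> are not reported as booleans, even though they may look like one.
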
136
137 =head1 OBJECT-ORIENTED INTERFACE
138
139 The object oriented interface lets you configure your own encoding or
140 decoding style, within the limits of supported formats.
141
142 =over 4
143
144 =item $json = new JSON::XS
145
146 Creates a new JSON::XS object that can be used to de/encode JSON
147 strings. All boolean flags described below are by default I<disabled>.
148
149 The mutators for flags all return the JSON object again and thus calls can
150 be chained:
151
152 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
153 => {"a": [1, 2]}
154
155 =item $json = $json->ascii ([$enable])
156
157 If C<$enable> is true (or missing), then the C<encode> method will not
158 generate characters outside the code range C<0..127> (which is ASCII). Any
159 unicode characters outside that range will be escaped using either a
160 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
161 as per RFC4627. The resulting encoded JSON text can be treated as a native
162 unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
163 or any other superset of ASCII.
164
165 If C<$enable> is false, then the C<encode> method will not escape Unicode
166 characters unless required by the JSON syntax or other flags. This results
167 in a faster and more compact format.
168
169 The main use for this flag is to produce JSON texts that can be
170 transmitted over a 7-bit channel, as the encoded JSON texts will not
171 contain any 8 bit characters.
172
173 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
174 => ["\ud801\udc01"]
175
176 =item $json = $json->latin1 ([$enable])
177
178 If C<$enable> is true (or missing), then the C<encode> method will encode
179 the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
180 outside the code range C<0..255>. The resulting string can be treated as a
181 latin1-encoded JSON text or a native unicode string. The C<decode> method
182 will not be affected in any way by this flag, as C<decode> by default
183 expects unicode, which is a strict superset of latin1.
184
185 If C<$enable> is false, then the C<encode> method will not escape Unicode
186 characters unless required by the JSON syntax or other flags.
187
188 The main use for this flag is efficiently encoding binary data as JSON
189 text, as most octets will not be escaped, resulting in a smaller encoded
190 size. The disadvantage is that the resulting JSON text is encoded
191 in latin1 (and must correctly be treated as such when storing and
192 transferring), a rare encoding for JSON. It is therefore most useful when
193 you want to store data structures known to contain binary data efficiently
194 in files or databases, not when talking to other JSON encoders/decoders.
195
196 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
197 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
198
199 =item $json = $json->utf8 ([$enable])
200
201 If C<$enable> is true (or missing), then the C<encode> method will encode
202 the JSON result into UTF-8, as required by many protocols, while the
203 C<decode> method expects to be handed a UTF-8-encoded string. Please
204 note that UTF-8-encoded strings do not contain any characters outside the
205 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
206 versions, enabling this option might enable autodetection of the UTF-16
207 and UTF-32 encoding families, as described in RFC4627.
208
209 If C<$enable> is false, then the C<encode> method will return the JSON
210 string as a (non-encoded) unicode string, while C<decode> thus expects a
211 unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
212 to be done yourself, e.g. using the Encode module.
213
214 Example, output UTF-16BE-encoded JSON:
215
216 use Encode;
217 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
218
219 Example, decode UTF-32LE-encoded JSON:
220
221 use Encode;
222 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
223
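The difference between the two settings can be seen by comparing lengths. This is a minimal sketch, assuming an installed JSON::XS; the variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use JSON::XS;

# U+00E9 is a single character, but takes two octets in UTF-8.
my $data = ["\x{e9}"];

# utf8 enabled: encode returns octets, ready for bytewise I/O.
my $octets = JSON::XS->new->utf8->encode ($data);

# utf8 disabled: encode returns a (non-encoded) unicode string.
my $chars = JSON::XS->new->encode ($data);

printf "utf8 on:  %d octets\n",     length $octets;   # 6
printf "utf8 off: %d characters\n", length $chars;    # 5

# Each coder can decode what it produced itself.
my $from_octets = JSON::XS->new->utf8->decode ($octets);
my $from_chars  = JSON::XS->new->decode ($chars);
```

Mixing the two (for example decoding C<$chars> with C<utf8> enabled) is the typical source of encoding problems, so it helps to decide once at which layer the encoding happens.
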
224 =item $json = $json->pretty ([$enable])
225
226 This enables (or disables) all of the C<indent>, C<space_before> and
227 C<space_after> (and in the future possibly more) flags in one call to
228 generate the most readable (or most compact) form possible.
229
230 Example, pretty-print some simple structure:
231
232 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
233 =>
234 {
235 "a" : [
236 1,
237 2
238 ]
239 }
240
241 =item $json = $json->indent ([$enable])
242
243 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
244 format as output, putting every array member or object/hash key-value pair
245 into its own line, indenting them properly.
246
247 If C<$enable> is false, no newlines or indenting will be produced, and the
248 resulting JSON text is guaranteed not to contain any C<newlines>.
249
250 This setting has no effect when decoding JSON texts.
251
252 =item $json = $json->space_before ([$enable])
253
254 If C<$enable> is true (or missing), then the C<encode> method will add an extra
255 optional space before the C<:> separating keys from values in JSON objects.
256
257 If C<$enable> is false, then the C<encode> method will not add any extra
258 space at those places.
259
260 This setting has no effect when decoding JSON texts. You will also
261 most likely combine this setting with C<space_after>.
262
263 Example, space_before enabled, space_after and indent disabled:
264
265 {"key" :"value"}
266
267 =item $json = $json->space_after ([$enable])
268
269 If C<$enable> is true (or missing), then the C<encode> method will add an extra
270 optional space after the C<:> separating keys from values in JSON objects
271 and extra whitespace after the C<,> separating key-value pairs and array
272 members.
273
274 If C<$enable> is false, then the C<encode> method will not add any extra
275 space at those places.
276
277 This setting has no effect when decoding JSON texts.
278
279 Example, space_before and indent disabled, space_after enabled:
280
281 {"key": "value"}
282
283 =item $json = $json->relaxed ([$enable])
284
285 If C<$enable> is true (or missing), then C<decode> will accept some
286 extensions to normal JSON syntax (see below). C<encode> will not be
287 affected in any way. I<Be aware that this option makes you accept invalid
288 JSON texts as if they were valid!> I suggest using this option only to
289 parse application-specific files written by humans (configuration files,
290 resource files etc.).
291
292 If C<$enable> is false (the default), then C<decode> will only accept
293 valid JSON texts.
294
295 Currently accepted extensions are:
296
297 =over 4
298
299 =item * list items can have an end-comma
300
301 JSON I<separates> array elements and key-value pairs with commas. This
302 can be annoying if you write JSON texts manually and want to be able to
303 quickly append elements, so this extension accepts a comma at the end of
304 such items, not just between them:
305
306 [
307 1,
308 2, <- this comma not normally allowed
309 ]
310 {
311 "k1": "v1",
312 "k2": "v2", <- this comma not normally allowed
313 }
314
315 =item * shell-style '#'-comments
316
317 Whenever JSON allows whitespace, shell-style comments are additionally
318 allowed. They are terminated by the first carriage-return or line-feed
319 character, after which more white-space and comments are allowed.
320
321 [
322 1, # this comment not allowed in JSON
323 # neither this one...
324 ]
325
326 =back
327
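Both extensions can be tried with a short script. This is only a sketch, assuming an installed JSON::XS that supports C<relaxed> as described above; the input text and variable names are invented for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# Hand-written input using both extensions.
my $config_text = <<'EOT';
[
   1,
   2, # trailing comma and comment, both invalid in plain JSON
]
EOT

# The relaxed decoder accepts the extensions.
my $relaxed = JSON::XS->new->relaxed->decode ($config_text);
print "relaxed: @$relaxed\n";   # relaxed: 1 2

# The default (strict) decoder croaks on the same input.
my $strict_ok = eval { JSON::XS->new->decode ($config_text); 1 };
print "strict: ", ($strict_ok ? "accepted" : "rejected"), "\n";
```
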
328 =item $json = $json->canonical ([$enable])
329
330 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
331 by sorting their keys. This adds a comparatively high overhead.
332
333 If C<$enable> is false, then the C<encode> method will output key-value
334 pairs in the order Perl stores them (which will likely change between runs
335 of the same script).
336
337 This option is useful if you want the same data structure to be encoded as
338 the same JSON text (given the same overall settings). If it is disabled,
339 the same hash might be encoded differently even if it contains the same data,
340 as key-value pairs have no inherent ordering in Perl.
341
342 This setting has no effect when decoding JSON texts.
343
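A quick way to see the effect is to encode two hashes with the same content. This is a minimal sketch, assuming an installed JSON::XS; the hash contents are arbitrary.

```perl
use strict;
use warnings;
use JSON::XS;

my $canonical = JSON::XS->new->canonical;

# Two hashes with the same content, filled in a different order.
my %first  = (b => 2, a => 1, c => 3);
my %second = (c => 3, a => 1, b => 2);

my $text_first  = $canonical->encode (\%first);
my $text_second = $canonical->encode (\%second);

print "$text_first\n";   # {"a":1,"b":2,"c":3}
print "same text\n" if $text_first eq $text_second;
```

With C<canonical> enabled the two texts can be compared as plain strings, which is the main reason to pay for the sorting.
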
344 =item $json = $json->allow_nonref ([$enable])
345
346 If C<$enable> is true (or missing), then the C<encode> method can convert a
347 non-reference into its corresponding string, number or null JSON value,
348 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
349 values instead of croaking.
350
351 If C<$enable> is false, then the C<encode> method will croak if it isn't
352 passed an arrayref or hashref, as JSON texts must either be an object
353 or array. Likewise, C<decode> will croak if given something that is not a
354 JSON object or array.
355
356 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
357 resulting in an invalid JSON text:
358
359 JSON::XS->new->allow_nonref->encode ("Hello, World!")
360 => "Hello, World!"
361
362 =item $json = $json->allow_blessed ([$enable])
363
364 If C<$enable> is true (or missing), then the C<encode> method will not
365 barf when it encounters a blessed reference. Instead, the value of the
366 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
367 disabled or no C<TO_JSON> method found) or a representation of the
368 object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
369 encoded. Has no effect on C<decode>.
370
371 If C<$enable> is false (the default), then C<encode> will throw an
372 exception when it encounters a blessed object.
373
374 =item $json = $json->convert_blessed ([$enable])
375
376 If C<$enable> is true (or missing), then C<encode>, upon encountering a
377 blessed object, will check for the availability of the C<TO_JSON> method
378 on the object's class. If found, it will be called in scalar context
379 and the resulting scalar will be encoded instead of the object. If no
380 C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
381 to do.
382
383 The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
384 returns other blessed objects, those will be handled in the same
385 way. C<TO_JSON> must take care of not causing an endless recursion cycle
386 (== crash) in this case. The name of C<TO_JSON> was chosen because other
387 methods called by the Perl core (== not by the user of the object) are
388 usually in upper case letters and to avoid collisions with the C<to_json>
389 function.
390
391 This setting does not yet influence C<decode> in any way, but in the
392 future, global hooks might get installed that influence C<decode> and are
393 enabled by this setting.
394
395 If C<$enable> is false, then the C<allow_blessed> setting will decide what
396 to do when a blessed object is found.
397
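The interplay of C<allow_blessed> and C<convert_blessed> can be sketched as follows. The class C<My::Point> is hypothetical and exists only for this illustration; the sketch assumes an installed JSON::XS that behaves as documented above.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical class, used only for illustration.
package My::Point;

sub new {
   my ($class, $x, $y) = @_;
   bless { x => $x, y => $y }, $class
}

# Called by encode when convert_blessed is enabled.
sub TO_JSON {
   my ($self) = @_;
   [ $self->{x}, $self->{y} ]
}

package main;

my $point = My::Point->new (3, 4);

# convert_blessed: the return value of TO_JSON is encoded instead.
my $converted = JSON::XS->new->convert_blessed->encode ([$point]);
print "$converted\n";   # [[3,4]]

# allow_blessed only: the object is encoded as null.
my $nulled = JSON::XS->new->allow_blessed->encode ([$point]);
print "$nulled\n";      # [null]

# Neither flag enabled: encode croaks on the blessed reference.
my $default_ok = eval { JSON::XS->new->encode ([$point]); 1 };
print "default: ", ($default_ok ? "encoded" : "croaked"), "\n";
```
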
398 =item $json = $json->filter_json_object ([$coderef->($hashref)])
399
400 When C<$coderef> is specified, it will be called from C<decode> each
401 time it decodes a JSON object. The only argument is a reference to the
402 newly-created hash. If the code reference returns a single scalar (which
403 need not be a reference), this value (i.e. a copy of that scalar to avoid
404 aliasing) is inserted into the deserialised data structure. If it returns
405 an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
406 original deserialised hash will be inserted. This setting can slow down
407 decoding considerably.
408
409 When C<$coderef> is omitted or undefined, any existing callback will
410 be removed and C<decode> will not change the deserialised hash in any
411 way.
412
413 Example, convert all JSON objects into the integer 5:
414
415 my $js = JSON::XS->new->filter_json_object (sub { 5 });
416 # returns [5]
417 $js->decode ('[{}]')
418 # throws an exception because allow_nonref is not enabled
419 # so a lone 5 is not allowed.
420 $js->decode ('{"a":1, "b":2}');
421
422 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
423
424 Works remotely similar to C<filter_json_object>, but is only called for
425 JSON objects having a single key named C<$key>.
426
427 This C<$coderef> is called before the one specified via
428 C<filter_json_object>, if any. It gets passed the single value in the JSON
429 object. If it returns a single value, it will be inserted into the data
430 structure. If it returns nothing (not even C<undef> but the empty list),
431 the callback from C<filter_json_object> will be called next, as if no
432 single-key callback were specified.
433
434 If C<$coderef> is omitted or undefined, the corresponding callback will be
435 disabled. There can only ever be one callback for a given key.
436
437 As this callback gets called less often than the C<filter_json_object>
438 one, decoding speed will not usually suffer as much. Therefore, single-key
439 objects make excellent targets to serialise Perl objects into, especially
440 as single-key JSON objects are as close to the type-tagged value concept
441 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
442 support this in any way, so you need to make sure your data never looks
443 like a serialised Perl hash.
444
445 Typical names for the single object key are C<__class_whatever__>, or
446 C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
447 things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
448 with real hashes.
449
450 Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
451 into the corresponding C<< $WIDGET{<id>} >> object:
452
453 # return whatever is in $WIDGET{5}:
454 JSON::XS
455 ->new
456 ->filter_json_single_key_object (__widget__ => sub {
457 $WIDGET{ $_[0] }
458 })
459 ->decode ('{"__widget__": 5}')
460
461 # this can be used with a TO_JSON method in some "widget" class
462 # for serialisation to json:
463 sub WidgetBase::TO_JSON {
464 my ($self) = @_;
465
466 unless ($self->{id}) {
467 $self->{id} = ..get..some..id..;
468 $WIDGET{$self->{id}} = $self;
469 }
470
471 { __widget__ => $self->{id} }
472 }
473
474 =item $json = $json->shrink ([$enable])
475
476 Perl usually over-allocates memory a bit when allocating space for
477 strings. This flag optionally resizes strings generated by either
478 C<encode> or C<decode> to their minimum size possible. This can save
479 memory when your JSON texts are either very very long or you have many
480 short strings. It will also try to downgrade any strings to octet-form
481 if possible: perl stores strings internally either in an encoding called
482 UTF-X or in octet-form. The latter cannot store everything but uses less
483 space in general (and some buggy Perl or C code might even rely on that
484 internal representation being used).
485
486 The actual definition of what shrink does might change in future versions,
487 but it will always try to save space at the expense of time.
488
489 If C<$enable> is true (or missing), the string returned by C<encode> will
490 be shrunk-to-fit, while all strings generated by C<decode> will also be
491 shrunk-to-fit.
492
493 If C<$enable> is false, then the normal perl allocation algorithms are used.
494 If you work with your data, then this is likely to be faster.
495
496 In the future, this setting might control other things, such as converting
497 strings that look like integers or floats into integers or floats
498 internally (there is no difference on the Perl level), saving space.
499
500 =item $json = $json->max_depth ([$maximum_nesting_depth])
501
502 Sets the maximum nesting level (default C<512>) accepted while encoding
503 or decoding. If the JSON text or Perl data structure has an equal or
504 higher nesting level than this limit, then the encoder and decoder will
505 stop and croak at that point.
506
507 Nesting level is defined by number of hash- or arrayrefs that the encoder
508 needs to traverse to reach a given point or the number of C<{> or C<[>
509 characters without their matching closing parenthesis crossed to reach a
510 given character in a string.
511
512 Setting the maximum depth to one disallows any nesting, so that ensures
513 that the object is only a single hash/object or array.
514
515 The argument to C<max_depth> will be rounded up to the next highest power
516 of two. If no argument is given, the highest possible setting will be
517 used, which is rarely useful.
518
519 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
520
521 =item $json = $json->max_size ([$maximum_string_size])
522
523 Set the maximum length a JSON text may have (in bytes) where decoding is
524 being attempted. The default is C<0>, meaning no limit. When C<decode>
525 is called on a string longer than this number of characters it will not
526 attempt to decode the string but throw an exception. This setting has no
527 effect on C<encode> (yet).
528
529 The argument to C<max_size> will be rounded up to the next B<highest>
530 power of two (so may be more than requested). If no argument is given, the
531 limit check will be deactivated (same as when C<0> is specified).
532
533 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
534
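As a rough sketch of how the two limits might be combined when decoding untrusted input (assuming an installed JSON::XS; the limit values and test inputs are arbitrary and chosen only to trigger each check):

```perl
use strict;
use warnings;
use JSON::XS;

# Limits chosen for illustration only; pick values that suit your protocol.
my $guarded = JSON::XS->new->utf8->max_depth (4)->max_size (64);

my %input = (
   small    => '[1,2,3]',
   too_deep => ('[' x 10) . (']' x 10),            # ten nested arrays
   too_long => '[' . join (',', (1) x 100) . ']',  # 201 bytes
);

my %accepted;
for my $name (sort keys %input) {
   # decode croaks when a limit is exceeded, so wrap it in eval
   $accepted{$name} = eval { $guarded->decode ($input{$name}); 1 } ? 1 : 0;
   print "$name: ", ($accepted{$name} ? "accepted" : "rejected"), "\n";
}
```

Because both checks croak, the caller sees them like any other decoding error and can treat the input as invalid.
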
535 =item $json_text = $json->encode ($perl_scalar)
536
537 Converts the given Perl data structure (a simple scalar or a reference
538 to a hash or array) to its JSON representation. Simple scalars will be
539 converted into JSON string or number sequences, while references to arrays
540 become JSON arrays and references to hashes become JSON objects. Undefined
541 Perl values (e.g. C<undef>) become JSON C<null> values. See MAPPING, below,
542 for when JSON C<true> and C<false> values are generated.
543
544 =item $perl_scalar = $json->decode ($json_text)
545
546 The opposite of C<encode>: expects a JSON text and tries to parse it,
547 returning the resulting simple scalar or reference. Croaks on error.
548
549 JSON numbers and strings become simple Perl scalars. JSON arrays become
550 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> and C<false>
551 become C<JSON::XS::true> and C<JSON::XS::false> (which act like C<1> and C<0>), and C<null> becomes C<undef>.
552
553 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
554
555 This works like the C<decode> method, but instead of raising an exception
556 when there is trailing garbage after the first JSON object, it will
557 silently stop parsing there and return the number of characters consumed
558 so far.
559
560 This is useful if your JSON texts are not delimited by an outer protocol
561 (which is not the brightest thing to do in the first place) and you need
562 to know where the JSON text ends.
563
564 JSON::XS->new->decode_prefix ("[1] the tail")
565 => ([1], 3)
566
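One way to use C<decode_prefix> is to split a buffer that holds several JSON texts back to back. This is a minimal sketch, assuming an installed JSON::XS; the buffer contents and variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

my $json   = JSON::XS->new;
my $buffer = '[1,2] {"a":3} [4]';   # three JSON texts, no outer framing
my @decoded;

while (length $buffer) {
   # decode one text and learn how many characters it occupied
   my ($value, $consumed) = $json->decode_prefix ($buffer);
   push @decoded, $value;

   substr $buffer, 0, $consumed, "";   # drop the text just decoded
   $buffer =~ s/^\s+//;                # drop whitespace between texts
}

print scalar @decoded, " texts decoded\n";   # 3 texts decoded
```

The loop still croaks if the buffer contains something that is not a JSON text, so real code would wrap the call in C<eval> and decide how to resynchronise.
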
567 =back
568
569
570 =head1 MAPPING
571
572 This section describes how JSON::XS maps Perl values to JSON values and
573 vice versa. These mappings are designed to "do the right thing" in most
574 circumstances automatically, preserving round-tripping characteristics
575 (what you put in comes out as something equivalent).
576
577 For the more enlightened: note that in the following descriptions,
578 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
579 refers to the abstract Perl language itself.
580
581
582 =head2 JSON -> PERL
583
584 =over 4
585
586 =item object
587
588 A JSON object becomes a reference to a hash in Perl. No ordering of object
589 keys is preserved (JSON does not preserve object key ordering itself).
590
591 =item array
592
593 A JSON array becomes a reference to an array in Perl.
594
595 =item string
596
597 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
598 are represented by the same codepoints in the Perl string, so no manual
599 decoding is necessary.
600
601 =item number
602
603 A JSON number becomes either an integer, numeric (floating point) or
604 string scalar in perl, depending on its range and any fractional parts. On
605 the Perl level, there is no difference between those as Perl handles all
606 the conversion details, but an integer may take slightly less memory and
607 might represent more values exactly than (floating point) numbers.
608
609 If the number consists of digits only, JSON::XS will try to represent
610 it as an integer value. If that fails, it will try to represent it as
611 a numeric (floating point) value if that is possible without loss of
612 precision. Otherwise it will preserve the number as a string value.
613
614 Numbers containing a fractional or exponential part will always be
615 represented as numeric (floating point) values, possibly at a loss of
616 precision.
617
618 This might create round-tripping problems as numbers might become strings,
619 but as Perl is typeless there is no other way to do it.
620
621 =item true, false
622
623 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
624 respectively. They are overloaded to act almost exactly like the numbers
625 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
626 the C<JSON::XS::is_bool> function.
627
628 =item null
629
630 A JSON null atom becomes C<undef> in Perl.
631
632 =back
633
634
635 =head2 PERL -> JSON
636
637 The mapping from Perl to JSON is slightly more difficult, as Perl is a
638 truly typeless language, so we can only guess which JSON type is meant by
639 a Perl value.
640
641 =over 4
642
643 =item hash references
644
645 Perl hash references become JSON objects. As there is no inherent ordering
646 in hash keys (or JSON objects), they will usually be encoded in a
647 pseudo-random order that can change between runs of the same program but
648 stays generally the same within a single run of a program. JSON::XS can
649 optionally sort the hash keys (determined by the I<canonical> flag), so
650 the same datastructure will serialise to the same JSON text (given same
651 settings and version of JSON::XS), but this incurs a runtime overhead
652 and is only rarely useful, e.g. when you want to compare some JSON text
653 against another for equality.
654
655 =item array references
656
657 Perl array references become JSON arrays.
658
659 =item other references
660
661 Other unblessed references are generally not allowed and will cause an
662 exception to be thrown, except for references to the integers C<0> and
663 C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
664 also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
665
666 to_json [\0,JSON::XS::true] # yields [false,true]
667
668 =item JSON::XS::true, JSON::XS::false
669
670 These special values become JSON true and JSON false values,
671 respectively. You can also use C<\1> and C<\0> directly if you want.
672
673 =item blessed objects
674
675 Blessed objects are not allowed by default: C<encode> throws an exception
676 when it encounters one. See the C<allow_blessed> and C<convert_blessed>
677 methods for ways to change this behaviour.
678
679 =item simple scalars
680
681 Simple Perl scalars (any scalar that is not a reference) are the most
682 difficult objects to encode: JSON::XS will encode undefined scalars as
683 JSON null value, scalars that have last been used in a string context
684 before encoding as JSON strings and anything else as number value:
685
686 # dump as number
687 to_json [2] # yields [2]
688 to_json [-3.0e17] # yields [-3e+17]
689 my $value = 5; to_json [$value] # yields [5]
690
691 # used as string, so dump as string
692 print $value;
693 to_json [$value] # yields ["5"]
694
695 # undef becomes null
696 to_json [undef] # yields [null]
697
698 You can force the type to be a string by stringifying it:
699
700 my $x = 3.1; # some variable containing a number
701 "$x"; # stringified
702 $x .= ""; # another, more awkward way to stringify
703 print $x; # perl does it for you, too, quite often
704
705 You can force the type to be a number by numifying it:
706
707 my $x = "3"; # some variable containing a string
708 $x += 0; # numify it, ensuring it will be dumped as a number
709 $x *= 1; # same thing, the choice is yours.
710
711 You cannot currently force the type in other, less obscure, ways. Tell me
712 if you need this capability.
713
714 =back
715
716
717 =head1 COMPARISON
718
719 As already mentioned, this module was created because none of the existing
720 JSON modules could be made to work correctly. First I will describe the
721 problems (or pleasures) I encountered with various existing JSON modules,
722 followed by some benchmark values. JSON::XS was designed not to suffer
723 from any of these problems or limitations.
724
725 =over 4
726
727 =item JSON 1.07
728
729 Slow (but very portable, as it is written in pure Perl).
730
731 Undocumented/buggy Unicode handling (how JSON handles unicode values is
732 undocumented. One can get far by feeding it unicode strings and doing
733 en-/decoding oneself, but unicode escapes are not working properly).
734
735 No roundtripping (strings get clobbered if they look like numbers, e.g.
736 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
737 decode into the number 2).
738
739 =item JSON::PC 0.01
740
741 Very fast.
742
743 Undocumented/buggy Unicode handling.
744
745 No roundtripping.
746
747 Has problems handling many Perl values (e.g. regex results and other magic
748 values will make it croak).
749
750 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
751 which is not a valid JSON text).
752
753 Unmaintained (maintainer unresponsive for many months, bugs are not
754 getting fixed).
755
756 =item JSON::Syck 0.21
757
758 Very buggy (often crashes).
759
760 Very inflexible (no human-readable format supported, format pretty much
761 undocumented. I need at least a format for easy reading by humans and a
762 single-line compact format for use in a protocol, and preferably a way to
763 generate ASCII-only JSON texts).
764
765 Completely broken (and confusingly documented) Unicode handling (unicode
766 escapes are not working properly, you need to set ImplicitUnicode to
767 I<different> values on en- and decoding to get symmetric behaviour).
768
769 No roundtripping (simple cases work, but this depends on whether the scalar
770 value was used in a numeric context or not).
771
772 Dumping hashes may skip hash values depending on iterator state.
773
774 Unmaintained (maintainer unresponsive for many months, bugs are not
775 getting fixed).
776
777 Does not check input for validity (i.e. will accept non-JSON input and
778 return "something" instead of raising an exception. This is a security
779 issue: imagine two banks transferring money between each other using
780 JSON. One bank might parse a given non-JSON request and deduct money,
781 while the other might reject the transaction with a syntax error. While a
782 good protocol will at least recover, that is extra unnecessary work and
783 the transaction will still not succeed).
784
785 =item JSON::DWIW 0.04
786
787 Very fast. Very natural. Very nice.
788
789 Undocumented unicode handling (but the best of the pack. Unicode escapes
790 still don't get parsed properly).
791
792 Very inflexible.
793
794 No roundtripping.
795
796 Does not generate valid JSON texts (key strings are often unquoted, empty keys
797 result in nothing being output).
798
799 Does not check input for validity.
800
801 =back
802
803
804 =head2 JSON and YAML
805
806 You often hear that JSON is a subset (or a close subset) of YAML. This is,
807 however, a mass hysteria and very far from the truth. In general, there is
808 no way to configure JSON::XS to output a data structure as valid YAML.
809
810 If you really must use JSON::XS to generate YAML, you should use this
811 algorithm (subject to change in future versions):
812
813 my $to_yaml = JSON::XS->new->utf8->space_after (1);
814 my $yaml = $to_yaml->encode ($ref) . "\n";
815
816 This will usually generate JSON texts that also parse as valid
817 YAML. Please note that YAML has hardcoded limits on (simple) object key
818 lengths that JSON doesn't have, so you should make sure that your hash
819 keys are noticeably shorter than the 1024 characters YAML allows.
820
821 There might be other incompatibilities that I am not aware of. In general
822 you should not try to generate YAML with a JSON generator or vice versa,
823 or try to parse JSON with a YAML parser or vice versa: chances are high
824 that you will run into severe interoperability problems.
825
826
827 =head2 SPEED
828
829 It seems that JSON::XS is surprisingly fast, as shown in the following
830 tables. They have been generated with the help of the C<eg/bench> program
831 in the JSON::XS distribution, to make it easy to compare on your own
832 system.
833
834 First comes a comparison between various modules using a very short
835 single-line JSON string:
836
837 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
838 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
839
840 It shows the number of encodes/decodes per second (JSON::XS uses
841 the functional interface, while JSON::XS/2 uses the OO interface
842 with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
843 shrink). Higher is better:
844
846 -----------+------------+------------+
847 module | encode | decode |
848 -----------|------------|------------|
849 JSON | 4990.842 | 4088.813 |
850 JSON::DWIW | 51653.990 | 71575.154 |
851 JSON::PC | 65948.176 | 74631.744 |
852 JSON::PP | 8931.652 | 3817.168 |
853 JSON::Syck | 24877.248 | 27776.848 |
854 JSON::XS | 388361.481 | 227951.304 |
855 JSON::XS/2 | 227951.304 | 218453.333 |
856 JSON::XS/3 | 338250.323 | 218453.333 |
857 Storable | 16500.016 | 135300.129 |
858 -----------+------------+------------+
859
860 That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
861 about three times faster on decoding, and over forty times faster
862 than JSON, even with pretty-printing and key sorting. It also compares
863 favourably to Storable for small amounts of data.
864
865 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
866 search API (http://nanoref.com/yahooapis/mgPdGg)):
867
868 module | encode | decode |
869 -----------|------------|------------|
870 JSON | 55.260 | 34.971 |
871 JSON::DWIW | 825.228 | 1082.513 |
872 JSON::PC | 3571.444 | 2394.829 |
873 JSON::PP | 210.987 | 32.574 |
874 JSON::Syck | 552.551 | 787.544 |
875 JSON::XS | 5780.463 | 4854.519 |
876 JSON::XS/2 | 3869.998 | 4798.975 |
877 JSON::XS/3 | 5862.880 | 4798.975 |
878 Storable | 4445.002 | 5235.027 |
879 -----------+------------+------------+
880
881 Again, JSON::XS leads by far (except for Storable which unsurprisingly
882 decodes faster).
883
884 On large strings containing lots of high unicode characters, some modules
885 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
886 will be broken due to missing (or wrong) unicode handling. Others refuse
887 to decode or encode properly, so it was impossible to prepare a fair
888 comparison table for that case.
889
890
891 =head1 SECURITY CONSIDERATIONS
892
893 When you are using JSON in a protocol, talking to untrusted potentially
894 hostile creatures requires relatively few measures.
895
896 First of all, your JSON decoder should be secure, that is, should not have
897 any buffer overflows. Obviously, this module should ensure that and I am
898 trying hard to make that true, but you never know.
899
900 Second, you need to avoid resource-starving attacks. That means you should
901 limit the size of JSON texts you accept, or make sure that when your
902 resources run out, that's just fine (e.g. by using a separate process that
903 can crash safely). The size of a JSON text in octets or characters is
904 usually a good indication of the size of the resources required to decode
905 it into a Perl structure. While JSON::XS can check the size of the JSON
906 text, it might be too late when you already have it in memory, so you
907 might want to check the size before you accept the string.
908
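A minimal sketch of rejecting oversized input before decoding, assuming the
JSON text arrives as a UTF-8 encoded octet string. The 64KB limit and the
C<decode_limited> helper are made up for illustration and are not part of
JSON::XS:

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limit for illustration; pick one that fits your protocol.
my $MAX_JSON_OCTETS = 64 * 1024;

my $coder = JSON::XS->new->utf8;

# decode_limited is an illustrative helper, not part of JSON::XS.
# It dies on oversized input before any parsing happens, and lets
# decode croak as usual on invalid JSON.
sub decode_limited {
    my ($octets) = @_;

    die "JSON text too large\n"
        if length ($octets) > $MAX_JSON_OCTETS;

    return $coder->decode ($octets);
}
```

Where the transport allows it (for example a length header in your
protocol), it is better still to apply the same limit before reading the
text into memory at all.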
909 Third, JSON::XS recurses using the C stack when decoding objects and
910 arrays. The C stack is a limited resource: for instance, on my amd64
911 machine with 8MB of stack size I can decode around 180k nested arrays but
912 only 14k nested JSON objects (due to perl itself recursing deeply on croak
913 to free the temporary). If that is exceeded, the program crashes. To be
914 conservative, the default nesting limit is set to 512. If your process
915 has a smaller stack, you should adjust this setting accordingly with the
916 C<max_depth> method.
917
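As a rough sketch, a decoder for a process with a small stack could lower
the limit with C<max_depth> and treat a croak as a rejected request. The
value 16 and the C<decode_shallow> helper are arbitrary choices for
illustration only:

```perl
use strict;
use warnings;
use JSON::XS;

# 16 is an arbitrary illustrative limit, far below the default of 512.
my $shallow = JSON::XS->new->utf8->max_depth (16);

# decode_shallow is an illustrative helper, not part of JSON::XS.
# It returns the decoded structure, or undef if decoding croaked
# (for example because the nesting limit was exceeded).
sub decode_shallow {
    my ($octets) = @_;

    my $data = eval { $shallow->decode ($octets) };
    return $data;
}
```

With this setup a text such as C<[[1]]> decodes normally, while a text
made of 100 nested arrays is rejected instead of being recursed into.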
918 And last but least, something else could bomb you that I forgot to think
919 of. In that case, you get to keep the pieces. I am always open for hints,
920 though...
921
922 If you are using JSON::XS to return packets for consumption
923 by JavaScript scripts in a browser you should have a look at
924 L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
925 you are vulnerable to some common attack vectors (which really are browser
926 design bugs, but it is still you who will have to deal with it, as major
927 browser developers care only for features, not about doing security
928 right).
929
930
931 =head1 BUGS
932
933 While the goal of this module is to be correct, that unfortunately does
934 not mean it's bug-free, only that I think its design is bug-free. It is
935 still relatively early in its development. If you keep reporting bugs they
936 will be fixed swiftly, though.
937
938 =cut
939
940 our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
941 our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
942
943 sub true() { $true }
944 sub false() { $false }
945
946 sub is_bool($) {
947 UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
948 # or UNIVERSAL::isa $_[0], "JSON::Literal"
949 }
950
951 XSLoader::load "JSON::XS", $VERSION;
952
953 package JSON::XS::Boolean;
954
955 use overload
956 "0+" => sub { ${$_[0]} },
957 "++" => sub { $_[0] = ${$_[0]} + 1 },
958 "--" => sub { $_[0] = ${$_[0]} - 1 },
959 fallback => 1;
960
961 1;
962
963 =head1 AUTHOR
964
965 Marc Lehmann <schmorp@schmorp.de>
966 http://home.schmorp.de/
967
968 =cut
969