/cvs/JSON-XS/XS.pm
Revision: 1.59
Committed: Mon Aug 27 01:49:01 2007 UTC (16 years, 8 months ago) by root
Branch: MAIN
Changes since 1.58: +34 -0 lines
Log Message:
*** empty log message ***

File Contents

# Content
1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 =head1 SYNOPSIS
6
7 use JSON::XS;
8
9 # exported functions, they croak on error
10 # and expect/generate UTF-8
11
12 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
13 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
14
15 # OO-interface
16
17 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
18 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
19 $perl_scalar = $coder->decode ($unicode_json_text);
20
21 =head1 DESCRIPTION
22
23 This module converts Perl data structures to JSON and vice versa. Its
24 primary goal is to be I<correct> and its secondary goal is to be
25 I<fast>. To reach the latter goal it was written in C.
26
27 As this is the n-th-something JSON module on CPAN, what was the reason
28 to write yet another JSON module? While it seems there are many JSON
29 modules, none of them correctly handle all corner cases, and in most cases
30 their maintainers are unresponsive, gone missing, or not listening to bug
31 reports for other reasons.
32
33 See COMPARISON, below, for a comparison to some other JSON modules.
34
35 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
36 vice versa.
37
38 =head2 FEATURES
39
40 =over 4
41
42 =item * correct unicode handling
43
44 This module knows how to handle Unicode, and even documents how and when
45 it does so.
46
47 =item * round-trip integrity
48
49 When you serialise a perl data structure using only datatypes supported
50 by JSON, the deserialised data structure is identical on the Perl level.
51 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
52 like a number).
53
54 =item * strict checking of JSON correctness
55
56 There is no guessing, no generating of illegal JSON texts by default,
57 and only JSON is accepted as input by default (the latter is a security
58 feature).
59
60 =item * fast
61
62 Compared to other JSON modules, this module compares favourably in terms
63 of speed, too.
64
65 =item * simple to use
66
67 This module has both a simple functional interface as well as an OO
68 interface.
69
70 =item * reasonably versatile output formats
71
72 You can choose between the most compact guaranteed single-line format
73 possible (nice for simple line-based protocols), a pure-ascii format
74 (for when your transport is not 8-bit clean, still supports the whole
75 unicode range), or a pretty-printed format (for when you want to read that
76 stuff). Or you can combine those features in whatever way you like.
77
78 =back
79
80 =cut
81
82 package JSON::XS;
83
84 use strict;
85
86 our $VERSION = '1.5';
87 our @ISA = qw(Exporter);
88
89 our @EXPORT = qw(to_json from_json);
90
91 use Exporter;
92 use XSLoader;
93
94 =head1 FUNCTIONAL INTERFACE
95
96 The following convenience functions are provided by this module. They are
97 exported by default:
98
99 =over 4
100
101 =item $json_text = to_json $perl_scalar
102
103 Converts the given Perl data structure (a simple scalar or a reference to
104 a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
105 octets only). Croaks on error.
106
107 This function call is functionally identical to:
108
109 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
110
111 except being faster.
112
113 =item $perl_scalar = from_json $json_text
114
115 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
116 parse that as a UTF-8 encoded JSON text, returning the resulting simple
117 scalar or reference. Croaks on error.
118
119 This function call is functionally identical to:
120
121 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
122
123 except being faster.
124
125 =item $is_boolean = JSON::XS::is_bool $scalar
126
127 Returns true if the passed scalar represents either JSON::XS::true or
128 JSON::XS::false, two constants that act like C<1> and C<0>, respectively
129 and are used to represent JSON C<true> and C<false> values in Perl.
130
131 See MAPPING, below, for more information on how JSON values are mapped to
132 Perl.
133
134 =back
135
136
137 =head1 OBJECT-ORIENTED INTERFACE
138
139 The object oriented interface lets you configure your own encoding or
140 decoding style, within the limits of supported formats.
141
142 =over 4
143
144 =item $json = new JSON::XS
145
146 Creates a new JSON::XS object that can be used to de/encode JSON
147 strings. All boolean flags described below are by default I<disabled>.
148
149 The mutators for flags all return the JSON object again and thus calls can
150 be chained:
151
152 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
153 => {"a": [1, 2]}
154
155 =item $json = $json->ascii ([$enable])
156
157 If C<$enable> is true (or missing), then the C<encode> method will not
158 generate characters outside the code range C<0..127> (which is ASCII). Any
159 unicode characters outside that range will be escaped using either a
160 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
161 as per RFC4627. The resulting encoded JSON text can be treated as a native
162 unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
163 or any other superset of ASCII.
164
165 If C<$enable> is false, then the C<encode> method will not escape Unicode
166 characters unless required by the JSON syntax or other flags. This results
167 in a faster and more compact format.
168
169 The main use for this flag is to produce JSON texts that can be
170 transmitted over a 7-bit channel, as the encoded JSON texts will not
171 contain any 8 bit characters.
172
173 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
174 => ["\ud801\udc01"]
175
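The effect of C<ascii> can be checked with a small round trip. This is a minimal sketch, not part of the module: the sample data is made up, and the loader line is an assumption that falls back to the core JSON::PP (which offers the same object-oriented interface) when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $coder = $class->new->ascii (1);
my $data  = ["\x{20ac}", chr (0x10401)];   # a BMP and a non-BMP character

my $text = $coder->encode ($data);   # contains only \uXXXX escapes
my $back = $coder->decode ($text);   # the escapes become characters again

print "$text\n";
print "round trip ok\n" if $back->[0] eq $data->[0] && $back->[1] eq $data->[1];
```

Because the encoded text contains only 7-bit characters, it can be printed or transmitted without any further encoding step.
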
176 =item $json = $json->latin1 ([$enable])
177
178 If C<$enable> is true (or missing), then the C<encode> method will encode
179 the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
180 outside the code range C<0..255>. The resulting string can be treated as a
181 latin1-encoded JSON text or a native unicode string. The C<decode> method
182 will not be affected in any way by this flag, as C<decode> by default
183 expects unicode, which is a strict superset of latin1.
184
185 If C<$enable> is false, then the C<encode> method will not escape Unicode
186 characters unless required by the JSON syntax or other flags.
187
188 The main use for this flag is efficiently encoding binary data as JSON
189 text, as most octets will not be escaped, resulting in a smaller encoded
190 size. The disadvantage is that the resulting JSON text is encoded
191 in latin1 (and must correctly be treated as such when storing and
192 transferring), a rare encoding for JSON. It is therefore most useful when
193 you want to store data structures known to contain binary data efficiently
194 in files or databases, not when talking to other JSON encoders/decoders.
195
196 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
197 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
198
199 =item $json = $json->utf8 ([$enable])
200
201 If C<$enable> is true (or missing), then the C<encode> method will encode
202 the JSON result into UTF-8, as required by many protocols, while the
203 C<decode> method expects to be handed a UTF-8-encoded string. Please
204 note that UTF-8-encoded strings do not contain any characters outside the
205 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
206 versions, enabling this option might enable autodetection of the UTF-16
207 and UTF-32 encoding families, as described in RFC4627.
208
209 If C<$enable> is false, then the C<encode> method will return the JSON
210 string as a (non-encoded) unicode string, while C<decode> thus expects a
211 unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
212 to be done yourself, e.g. using the Encode module.
213
214 Example, output UTF-16BE-encoded JSON:
215
216 use Encode;
217 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
218
219 Example, decode UTF-32LE-encoded JSON:
220
221 use Encode;
222 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
223
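The difference between the two settings shows up in the length of the result. The following sketch uses made-up sample data, and its loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed. It encodes the same one-character string with C<utf8> enabled and disabled:

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $data = ["\x{20ac}"];   # a single EURO SIGN character

my $octets = $class->new->utf8 (1)->encode ($data);   # UTF-8 encoded octets
my $chars  = $class->new->utf8 (0)->encode ($data);   # unencoded character string

# prints: utf8 enabled: 7 octets, utf8 disabled: 5 characters
printf "utf8 enabled: %d octets, utf8 disabled: %d characters\n",
    length ($octets), length ($chars);

# each form must be decoded with the same setting it was encoded with
my $from_octets = $class->new->utf8 (1)->decode ($octets);
my $from_chars  = $class->new->utf8 (0)->decode ($chars);
```

The euro sign occupies three octets in UTF-8, which accounts for the two extra octets in the first result.
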
224 =item $json = $json->pretty ([$enable])
225
226 This enables (or disables) all of the C<indent>, C<space_before> and
227 C<space_after> (and in the future possibly more) flags in one call to
228 generate the most readable (or most compact) form possible.
229
230 Example, pretty-print some simple structure:
231
232 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
233 =>
234 {
235 "a" : [
236 1,
237 2
238 ]
239 }
240
241 =item $json = $json->indent ([$enable])
242
243 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
244 format as output, putting every array member or object/hash key-value pair
245 into its own line, indenting them properly.
246
247 If C<$enable> is false, no newlines or indenting will be produced, and the
248 resulting JSON text is guaranteed not to contain any C<newlines>.
249
250 This setting has no effect when decoding JSON texts.
251
252 =item $json = $json->space_before ([$enable])
253
254 If C<$enable> is true (or missing), then the C<encode> method will add an extra
255 optional space before the C<:> separating keys from values in JSON objects.
256
257 If C<$enable> is false, then the C<encode> method will not add any extra
258 space at those places.
259
260 This setting has no effect when decoding JSON texts. You will also
261 most likely combine this setting with C<space_after>.
262
263 Example, space_before enabled, space_after and indent disabled:
264
265 {"key" :"value"}
266
267 =item $json = $json->space_after ([$enable])
268
269 If C<$enable> is true (or missing), then the C<encode> method will add an extra
270 optional space after the C<:> separating keys from values in JSON objects
271 and extra whitespace after the C<,> separating key-value pairs and array
272 members.
273
274 If C<$enable> is false, then the C<encode> method will not add any extra
275 space at those places.
276
277 This setting has no effect when decoding JSON texts.
278
279 Example, space_before and indent disabled, space_after enabled:
280
281 {"key": "value"}
282
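The relationship between C<pretty> and the three individual flags can be verified directly. This is a minimal sketch with made-up data; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $data = { a => [1, 2] };

my $pretty   = $class->new->pretty->encode ($data);
my $explicit = $class->new->indent->space_before->space_after->encode ($data);
my $compact  = $class->new->encode ($data);   # all flags disabled

print $pretty;
print "pretty equals indent + space_before + space_after\n" if $pretty eq $explicit;
print "$compact\n";   # {"a":[1,2]}
```
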
283 =item $json = $json->relaxed ([$enable])
284
285 If C<$enable> is true (or missing), then C<decode> will accept some
286 extensions to normal JSON syntax (see below). C<encode> will not be
287 affected in any way. I<Be aware that this option makes you accept invalid
288 JSON texts as if they were valid!>. I suggest using this option only to
289 parse application-specific files written by humans (configuration files,
290 resource files etc.).
291
292 If C<$enable> is false (the default), then C<decode> will only accept
293 valid JSON texts.
294
295 Currently accepted extensions are:
296
297 =over 4
298
299 =item * list items can have an end-comma
300
301 JSON I<separates> array elements and key-value pairs with commas. This
302 can be annoying if you write JSON texts manually and want to be able to
303 quickly append elements, so this extension accepts a comma at the end of
304 such items not just between them:
305
306 [
307 1,
308 2, <- this comma not normally allowed
309 ]
310 {
311 "k1": "v1",
312 "k2": "v2", <- this comma not normally allowed
313 }
314
315 =back
316
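The end-comma extension can be tried out as follows. This is a sketch with a made-up input text; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

# both the array and the object end in a comma
my $text = '{"k1": "v1", "k2": [1, 2,],}';

my $strict_ok = eval { $class->new->relaxed (0)->decode ($text); 1 };
my $data      = $class->new->relaxed (1)->decode ($text);

print "strict decode refused the text\n" unless $strict_ok;
printf "k2 has %d elements\n", scalar @{ $data->{k2} };
```
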
317 =item $json = $json->canonical ([$enable])
318
319 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
320 by sorting their keys. This adds a comparatively high overhead.
321
322 If C<$enable> is false, then the C<encode> method will output key-value
323 pairs in the order Perl stores them (which will likely change between runs
324 of the same script).
325
326 This option is useful if you want the same data structure to be encoded as
327 the same JSON text (given the same overall settings). If it is disabled,
328 the same hash might be encoded differently even if it contains the same data,
329 as key-value pairs have no inherent ordering in Perl.
330
331 This setting has no effect when decoding JSON texts.
332
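A short sketch of the sorted output, using a made-up hash; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my %hash = (c => 3, a => 1, b => 2);

# with canonical enabled the keys always come out sorted
my $text = $class->new->canonical (1)->encode (\%hash);

print "$text\n";   # {"a":1,"b":2,"c":3}
```
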
333 =item $json = $json->allow_nonref ([$enable])
334
335 If C<$enable> is true (or missing), then the C<encode> method can convert a
336 non-reference into its corresponding string, number or null JSON value,
337 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
338 values instead of croaking.
339
340 If C<$enable> is false, then the C<encode> method will croak if it isn't
341 passed an arrayref or hashref, as JSON texts must either be an object
342 or array. Likewise, C<decode> will croak if given something that is not a
343 JSON object or array.
344
345 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
346 resulting in an invalid JSON text:
347
348 JSON::XS->new->allow_nonref->encode ("Hello, World!")
349 => "Hello, World!"
350
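Both settings can be compared side by side. In this sketch the flag is set explicitly in each case rather than relying on the default, and the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $strict  = $class->new->allow_nonref (0);
my $lenient = $class->new->allow_nonref (1);

# encode croaks for a plain scalar unless allow_nonref is enabled
my $refused = !eval { $strict->encode ("Hello, World!"); 1 };

my $text  = $lenient->encode ("Hello, World!");     # "Hello, World!"
my $value = $lenient->decode ('"Hello, World!"');   # Hello, World!

print "strict coder refused the scalar\n" if $refused;
print "$text\n";
```
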
351 =item $json = $json->allow_blessed ([$enable])
352
353 If C<$enable> is true (or missing), then the C<encode> method will not
354 barf when it encounters a blessed reference. Instead, the value of the
355 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
356 disabled or no C<TO_JSON> method found) or a representation of the
357 object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
358 encoded. Has no effect on C<decode>.
359
360 If C<$enable> is false (the default), then C<encode> will throw an
361 exception when it encounters a blessed object.
362
363 =item $json = $json->convert_blessed ([$enable])
364
365 If C<$enable> is true (or missing), then C<encode>, upon encountering a
366 blessed object, will check for the availability of the C<TO_JSON> method
367 on the object's class. If found, it will be called in scalar context
368 and the resulting scalar will be encoded instead of the object. If no
369 C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
370 to do.
371
372 The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
373 returns other blessed objects, those will be handled in the same
374 way. C<TO_JSON> must take care of not causing an endless recursion cycle
375 (== crash) in this case. The name of C<TO_JSON> was chosen because other
376 methods called by the Perl core (== not by the user of the object) are
377 usually in upper case letters and to avoid collisions with the C<to_json>
378 function.
379
380 This setting does not yet influence C<decode> in any way, but in the
381 future, global hooks might get installed that influence C<decode> and are
382 enabled by this setting.
383
384 If C<$enable> is false, then the C<allow_blessed> setting will decide what
385 to do when a blessed object is found.
386
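The interplay of C<allow_blessed> and C<convert_blessed> can be sketched as follows. The C<Point> class is hypothetical and exists only for this illustration; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

{
    package Point;   # hypothetical example class

    sub new {
        my ($pkg, %args) = @_;
        my $self = { %args };
        return bless $self, $pkg;
    }

    # called by encode when convert_blessed is enabled
    sub TO_JSON {
        my ($self) = @_;
        return { x => $self->{x}, y => $self->{y} };
    }
}

my $point = Point->new (x => 1, y => 2);

# both flags disabled: encode croaks
my $refused = !eval {
    $class->new->allow_blessed (0)->convert_blessed (0)->encode ([$point]); 1
};

# allow_blessed only: the object is encoded as null
my $as_null = $class->new->allow_blessed (1)->encode ([$point]);

# convert_blessed: the return value of TO_JSON is encoded instead
my $converted = $class->new->canonical (1)->convert_blessed (1)->encode ([$point]);

print "$as_null\n";     # [null]
print "$converted\n";   # [{"x":1,"y":2}]
```
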
387 =item $json = $json->filter_json_object ([$coderef->($hashref)])
388
389 When C<$coderef> is specified, it will be called from C<decode> each
390 time it decodes a JSON object. The only argument is a reference to the
391 newly-created hash. If the code reference returns a single scalar (which
392 need not be a reference), this value (i.e. a copy of that scalar to avoid
393 aliasing) is inserted into the deserialised data structure. If it returns
394 an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
395 original deserialised hash will be inserted. This setting can slow down
396 decoding considerably.
397
398 When C<$coderef> is omitted or undefined, any existing callback will
399 be removed and C<decode> will not change the deserialised hash in any
400 way.
401
402 Example, convert all JSON objects into the integer 5:
403
404 my $js = JSON::XS->new->filter_json_object (sub { 5 });
405 # returns [5]
406 $js->decode ('[{}]');
407 # throws an exception because allow_nonref is not enabled
408 # so a lone 5 is not allowed.
409 $js->decode ('{"a":1, "b":2}');
410
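A callback that replaces only some objects has to return the empty list for all others. The following sketch illustrates this with a made-up convention (objects carrying a C<lat> key are turned into a string); the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $coder = $class->new->filter_json_object (sub {
    my ($hash) = @_;

    # Replace coordinate objects by a "lat,lon" string. Returning the
    # empty list keeps every other hash unchanged.
    return exists $hash->{lat} ? "$hash->{lat},$hash->{lon}" : ();
});

my $data = $coder->decode ('[{"lat": 1, "lon": 2}, {"name": "x"}]');

print "$data->[0]\n";         # 1,2
print "$data->[1]{name}\n";   # x
```
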
411 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
412
413 Works remotely similar to C<filter_json_object>, but is only called for
414 JSON objects having a single key named C<$key>.
415
416 This C<$coderef> is called before the one specified via
417 C<filter_json_object>, if any. It gets passed the single value in the JSON
418 object. If it returns a single value, it will be inserted into the data
419 structure. If it returns nothing (not even C<undef> but the empty list),
420 the callback from C<filter_json_object> will be called next, as if no
421 single-key callback were specified.
422
423 If C<$coderef> is omitted or undefined, the corresponding callback will be
424 disabled. There can only ever be one callback for a given key.
425
426 As this callback gets called less often than the C<filter_json_object>
427 one, decoding speed will not usually suffer as much. Therefore, single-key
428 objects make excellent targets to serialise Perl objects into, especially
429 as single-key JSON objects are as close to the type-tagged value concept
430 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
431 support this in any way, so you need to make sure your data never looks
432 like a serialised Perl hash.
433
434 Typical names for the single object key are C<__class_whatever__>, or
435 C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
436 things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
437 with real hashes.
438
439 Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
440 into the corresponding C<< $WIDGET{<id>} >> object:
441
442 # return whatever is in $WIDGET{5}:
443 JSON::XS
444 ->new
445 ->filter_json_single_key_object (__widget__ => sub {
446 $WIDGET{ $_[0] }
447 })
448 ->decode ('{"__widget__": 5}')
449
450 # this can be used with a TO_JSON method in some "widget" class
451 # for serialisation to json:
452 sub WidgetBase::TO_JSON {
453 my ($self) = @_;
454
455 unless ($self->{id}) {
456 $self->{id} = ..get..some..id..;
457 $WIDGET{$self->{id}} = $self;
458 }
459
460 { __widget__ => $self->{id} }
461 }
462
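Put together, the two halves give a complete round trip. The sketch below is self-contained and makes several assumptions for illustration only: the C<Widget> class, the C<%WIDGET> registry and the string id C<w5> are all hypothetical, and the loader line falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my %WIDGET;   # hypothetical registry mapping ids to live objects

{
    package Widget;   # hypothetical example class

    sub new {
        my ($pkg, $id) = @_;
        my $self = bless { id => $id }, $pkg;
        $WIDGET{$id} = $self;
        return $self;
    }

    # serialise as a single-key object holding only the id
    sub TO_JSON {
        my ($self) = @_;
        return { __widget__ => $self->{id} };
    }
}

my $coder = $class->new
    ->convert_blessed (1)
    ->filter_json_single_key_object (__widget__ => sub { $WIDGET{ $_[0] } });

my $widget = Widget->new ("w5");
my $text   = $coder->encode ([$widget]);   # [{"__widget__":"w5"}]
my $back   = $coder->decode ($text);

print "$text\n";
print "got the registered object back\n" if $back->[0] == $widget;
```
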
463 =item $json = $json->shrink ([$enable])
464
465 Perl usually over-allocates memory a bit when allocating space for
466 strings. This flag optionally resizes strings generated by either
467 C<encode> or C<decode> to their minimum size possible. This can save
468 memory when your JSON texts are either very very long or you have many
469 short strings. It will also try to downgrade any strings to octet-form
470 if possible: perl stores strings internally either in an encoding called
471 UTF-X or in octet-form. The latter cannot store everything but uses less
472 space in general (and some buggy Perl or C code might even rely on that
473 internal representation being used).
474
475 The actual definition of what shrink does might change in future versions,
476 but it will always try to save space at the expense of time.
477
478 If C<$enable> is true (or missing), the string returned by C<encode> will
479 be shrunk-to-fit, while all strings generated by C<decode> will also be
480 shrunk-to-fit.
481
482 If C<$enable> is false, then the normal perl allocation algorithms are used.
483 If you work with your data, then this is likely to be faster.
484
485 In the future, this setting might control other things, such as converting
486 strings that look like integers or floats into integers or floats
487 internally (there is no difference on the Perl level), saving space.
488
489 =item $json = $json->max_depth ([$maximum_nesting_depth])
490
491 Sets the maximum nesting level (default C<512>) accepted while encoding
492 or decoding. If the JSON text or Perl data structure has an equal or
493 higher nesting level than this limit, then the encoder and decoder will
494 stop and croak at that point.
495
496 Nesting level is defined by number of hash- or arrayrefs that the encoder
497 needs to traverse to reach a given point or the number of C<{> or C<[>
498 characters without their matching closing parenthesis crossed to reach a
499 given character in a string.
500
501 Setting the maximum depth to one disallows any nesting, so that ensures
502 that the object is only a single hash/object or array.
503
504 The argument to C<max_depth> will be rounded up to the next highest power
505 of two. If no argument is given, the highest possible setting will be
506 used, which is rarely useful.
507
508 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
509
510 =item $json = $json->max_size ([$maximum_string_size])
511
512 Set the maximum length a JSON text may have (in bytes) where decoding is
513 being attempted. The default is C<0>, meaning no limit. When C<decode>
514 is called on a string longer than this number of bytes it will not
515 attempt to decode the string but throw an exception. This setting has no
516 effect on C<encode> (yet).
517
518 The argument to C<max_size> will be rounded up to the next B<highest>
519 power of two (so may be more than requested). If no argument is given, the
520 limit check will be deactivated (same as when C<0> is specified).
521
522 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
523
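Both limits can be exercised with deliberately small values. In this sketch the limits (4 and 16) are arbitrary powers of two, chosen so that any rounding does not change them, and the inputs are made up; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $depth_limited = $class->new->max_depth (4);
my $size_limited  = $class->new->max_size (16);

my $deep = '[[[[[[[[1]]]]]]]]';                # eight nesting levels
my $long = '[' . join (',', 1 .. 20) . ']';    # far more than 16 bytes

my $shallow_ok = eval { $depth_limited->decode ('[[1]]'); 1 };
my $deep_ok    = eval { $depth_limited->decode ($deep); 1 };

my $short_ok   = eval { $size_limited->decode ('[1,2,3]'); 1 };
my $long_ok    = eval { $size_limited->decode ($long); 1 };

print "deep text refused\n" unless $deep_ok;
print "long text refused\n" unless $long_ok;
```
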
524 =item $json_text = $json->encode ($perl_scalar)
525
526 Converts the given Perl data structure (a simple scalar or a reference
527 to a hash or array) to its JSON representation. Simple scalars will be
528 converted into JSON string or number sequences, while references to arrays
529 become JSON arrays and references to hashes become JSON objects. Undefined
530 Perl values (e.g. C<undef>) become JSON C<null> values. JSON C<true> and
531 C<false> values are only generated for the special values described under
531 MAPPING, below.
532
533 =item $perl_scalar = $json->decode ($json_text)
534
535 The opposite of C<encode>: expects a JSON text and tries to parse it,
536 returning the resulting simple scalar or reference. Croaks on error.
537
538 JSON numbers and strings become simple Perl scalars. JSON arrays become
539 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
540 C<JSON::XS::true> (which acts like C<1>), C<false> becomes
540 C<JSON::XS::false> (which acts like C<0>) and C<null> becomes C<undef>.
541
542 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
543
544 This works like the C<decode> method, but instead of raising an exception
545 when there is trailing garbage after the first JSON object, it will
546 silently stop parsing there and return the number of characters consumed
547 so far.
548
549 This is useful if your JSON texts are not delimited by an outer protocol
550 (which is not the brightest thing to do in the first place) and you need
551 to know where the JSON text ends.
552
553 JSON::XS->new->decode_prefix ("[1] the tail")
554 => ([1], 3)
555
556 =back
557
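As an illustration of C<decode_prefix>, the following sketch splits a string holding several JSON texts written back to back. The input is made up and contains no whitespace between the texts; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $coder  = $class->new;
my $stream = '[1,2]{"a":3}[4]';   # three JSON texts without any delimiter
my @values;

while (length $stream) {
    # decode one JSON text and learn how many characters it occupied
    my ($value, $chars) = $coder->decode_prefix ($stream);
    push @values, $value;

    # remove the consumed prefix and continue with the remainder
    substr ($stream, 0, $chars, '');
}

printf "decoded %d JSON texts\n", scalar @values;   # decoded 3 JSON texts
```
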
558
559 =head1 MAPPING
560
561 This section describes how JSON::XS maps Perl values to JSON values and
562 vice versa. These mappings are designed to "do the right thing" in most
563 circumstances automatically, preserving round-tripping characteristics
564 (what you put in comes out as something equivalent).
565
566 For the more enlightened: note that in the following descriptions,
567 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
568 refers to the abstract Perl language itself.
569
570
571 =head2 JSON -> PERL
572
573 =over 4
574
575 =item object
576
577 A JSON object becomes a reference to a hash in Perl. No ordering of object
578 keys is preserved (JSON does not preserve object key ordering itself).
579
580 =item array
581
582 A JSON array becomes a reference to an array in Perl.
583
584 =item string
585
586 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
587 are represented by the same codepoints in the Perl string, so no manual
588 decoding is necessary.
589
590 =item number
591
592 A JSON number becomes either an integer, numeric (floating point) or
593 string scalar in perl, depending on its range and any fractional parts. On
594 the Perl level, there is no difference between those as Perl handles all
595 the conversion details, but an integer may take slightly less memory and
596 might represent more values exactly than (floating point) numbers.
597
598 If the number consists of digits only, JSON::XS will try to represent
599 it as an integer value. If that fails, it will try to represent it as
600 a numeric (floating point) value if that is possible without loss of
601 precision. Otherwise it will preserve the number as a string value.
602
603 Numbers containing a fractional or exponential part will always be
604 represented as numeric (floating point) values, possibly at a loss of
605 precision.
606
607 This might create round-tripping problems as numbers might become strings,
608 but as Perl is typeless there is no other way to do it.
609
610 =item true, false
611
612 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
613 respectively. They are overloaded to act almost exactly like the numbers
614 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
615 the C<JSON::XS::is_bool> function.
616
617 =item null
618
619 A JSON null atom becomes C<undef> in Perl.
620
621 =back
622
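The mapping of the three JSON atoms can be inspected after decoding. This is a minimal sketch with a made-up input; looking up C<is_bool> through C<can> is a choice made here so that the same code also works with the fallback, and the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

# the is_bool function of whichever module was loaded
my $is_bool = $class->can ('is_bool');

my $data = $class->new->decode ('[true, false, null, 1]');

print "true is a boolean and acts like 1\n"  if $is_bool->($data->[0]) &&  $data->[0];
print "false is a boolean and acts like 0\n" if $is_bool->($data->[1]) && !$data->[1];
print "null became undef\n"                  unless defined $data->[2];
print "the number 1 is not a boolean\n"      unless $is_bool->($data->[3]);
```
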
623
624 =head2 PERL -> JSON
625
626 The mapping from Perl to JSON is slightly more difficult, as Perl is a
627 truly typeless language, so we can only guess which JSON type is meant by
628 a Perl value.
629
630 =over 4
631
632 =item hash references
633
634 Perl hash references become JSON objects. As there is no inherent ordering
635 in hash keys (or JSON objects), they will usually be encoded in a
636 pseudo-random order that can change between runs of the same program but
637 stays generally the same within a single run of a program. JSON::XS can
638 optionally sort the hash keys (determined by the I<canonical> flag), so
639 the same datastructure will serialise to the same JSON text (given same
640 settings and version of JSON::XS), but this incurs a runtime overhead
641 and is only rarely useful, e.g. when you want to compare some JSON text
642 against another for equality.
643
644 =item array references
645
646 Perl array references become JSON arrays.
647
648 =item other references
649
650 Other unblessed references are generally not allowed and will cause an
651 exception to be thrown, except for references to the integers C<0> and
652 C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
653 also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
654
655 to_json [\0,JSON::XS::true] # yields [false,true]
656
657 =item JSON::XS::true, JSON::XS::false
658
659 These special values become JSON true and JSON false values,
660 respectively. You can also use C<\1> and C<\0> directly if you want.
661
662 =item blessed objects
663
664 Blessed objects are not allowed by default and cause C<encode> to throw an
665 exception. See the C<allow_blessed> and C<convert_blessed> methods for ways
666 to encode them as C<null> or through a C<TO_JSON> method.
667
668 =item simple scalars
669
670 Simple Perl scalars (any scalar that is not a reference) are the most
671 difficult objects to encode: JSON::XS will encode undefined scalars as
672 JSON null value, scalars that have last been used in a string context
673 before encoding as JSON strings and anything else as number value:
674
675 # dump as number
676 to_json [2] # yields [2]
677 to_json [-3.0e17] # yields [-3e+17]
678 my $value = 5; to_json [$value] # yields [5]
679
680 # used as string, so dump as string
681 print $value;
682 to_json [$value] # yields ["5"]
683
684 # undef becomes null
685 to_json [undef] # yields [null]
686
687 You can force the type to be a string by stringifying it:
688
689 my $x = 3.1; # some variable containing a number
690 "$x"; # stringified
691 $x .= ""; # another, more awkward way to stringify
692 print $x; # perl does it for you, too, quite often
693
694 You can force the type to be a number by numifying it:
695
696 my $x = "3"; # some variable containing a string
697 $x += 0; # numify it, ensuring it will be dumped as a number
698 $x *= 1; # same thing, the choice is yours.
699
700 You can not currently force the type in other, less obscure, ways. Tell
701 me if you need this capability.
702
703 =back
704
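The rules for simple scalars can be tried out through the object-oriented interface. In this sketch the variable names are made up, and the string is created as a fresh stringified copy so that the result does not depend on how perl caches string values; the loader line is an assumption that falls back to the core JSON::PP when JSON::XS is not installed.

```perl
use strict;
use warnings;

# Assumption: use the core JSON::PP, which offers the same OO interface,
# when JSON::XS itself is not installed.
my $class = eval { require JSON::XS; 1 } ? 'JSON::XS' : do { require JSON::PP; 'JSON::PP' };

my $number = "3";   # starts out as a string
$number += 0;       # numified, so it is dumped as a number

my $float  = 3.1;          # a number
my $string = "$float";     # a stringified copy, dumped as a string

my $text = $class->new->encode ([$number, $string, undef]);

print "$text\n";   # [3,"3.1",null]
```
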
705
706 =head1 COMPARISON
707
708 As already mentioned, this module was created because none of the existing
709 JSON modules could be made to work correctly. First I will describe the
710 problems (or pleasures) I encountered with various existing JSON modules,
711 followed by some benchmark values. JSON::XS was designed not to suffer
712 from any of these problems or limitations.
713
714 =over 4
715
716 =item JSON 1.07
717
718 Slow (but very portable, as it is written in pure Perl).
719
720 Undocumented/buggy Unicode handling (how JSON handles unicode values is
721 undocumented. One can get far by feeding it unicode strings and doing
722 en-/decoding oneself, but unicode escapes are not working properly).
723
724 No roundtripping (strings get clobbered if they look like numbers, e.g.
725 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
726 decode into the number 2).
727
728 =item JSON::PC 0.01
729
730 Very fast.
731
732 Undocumented/buggy Unicode handling.
733
734 No roundtripping.
735
736 Has problems handling many Perl values (e.g. regex results and other magic
737 values will make it croak).
738
739 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
740 which is not a valid JSON text).
741
742 Unmaintained (maintainer unresponsive for many months, bugs are not
743 getting fixed).
744
745 =item JSON::Syck 0.21
746
747 Very buggy (often crashes).
748
749 Very inflexible (no human-readable format supported, format pretty much
750 undocumented. I need at least a format for easy reading by humans and a
751 single-line compact format for use in a protocol, and preferably a way to
752 generate ASCII-only JSON texts).
753
754 Completely broken (and confusingly documented) Unicode handling (unicode
755 escapes are not working properly, you need to set ImplicitUnicode to
756 I<different> values on en- and decoding to get symmetric behaviour).
757
758 No roundtripping (simple cases work, but this depends on whether the scalar
759 value was used in a numeric context or not).
760
761 Dumping hashes may skip hash values depending on iterator state.
762
763 Unmaintained (maintainer unresponsive for many months, bugs are not
764 getting fixed).
765
766 Does not check input for validity (i.e. will accept non-JSON input and
767 return "something" instead of raising an exception. This is a security
768 issue: imagine two banks transferring money between each other using
769 JSON. One bank might parse a given non-JSON request and deduct money,
770 while the other might reject the transaction with a syntax error. While a
771 good protocol will at least recover, that is extra unnecessary work and
772 the transaction will still not succeed).
773
774 =item JSON::DWIW 0.04
775
776 Very fast. Very natural. Very nice.
777
778 Undocumented unicode handling (but the best of the pack. Unicode escapes
779 still don't get parsed properly).
780
781 Very inflexible.
782
783 No roundtripping.
784
785 Does not generate valid JSON texts (key strings are often unquoted, empty keys
786 result in nothing being output).
787
788 Does not check input for validity.
789
790 =back
791
792
793 =head2 JSON and YAML
794
795 You often hear that JSON is a subset (or a close subset) of YAML. This is,
796 however, a mass hysteria and very far from the truth. In general, there is
797 no way to configure JSON::XS to output a data structure as valid YAML.
798
799 If you really must use JSON::XS to generate YAML, you should use this
800 algorithm (subject to change in future versions):
801
802 my $to_yaml = JSON::XS->new->utf8->space_after (1);
803 my $yaml = $to_yaml->encode ($ref) . "\n";
804
805 This will usually generate JSON texts that also parse as valid
806 YAML. Please note that YAML has hardcoded limits on (simple) object key
807 lengths that JSON doesn't have, so you should make sure that your hash
808 keys are noticeably shorter than the 1024 characters YAML allows.
809
810 There might be other incompatibilities that I am not aware of. In general
811 you should not try to generate YAML with a JSON generator or vice versa,
812 or try to parse JSON with a YAML parser or vice versa: chances are high
813 that you will run into severe interoperability problems.
814
815
816 =head2 SPEED
817
818 It seems that JSON::XS is surprisingly fast, as shown in the following
819 tables. They have been generated with the help of the C<eg/bench> program
820 in the JSON::XS distribution, to make it easy to compare on your own
821 system.
822
823 First comes a comparison between various modules using a very short
824 single-line JSON string:
825
826 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
827 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
828
829 It shows the number of encodes/decodes per second (JSON::XS uses
830 the functional interface, while JSON::XS/2 uses the OO interface
831 with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
832 shrink). Higher is better:
833
836    module     |     encode |     decode |
837    -----------|------------|------------|
838    JSON       |   4990.842 |   4088.813 |
839    JSON::DWIW |  51653.990 |  71575.154 |
840    JSON::PC   |  65948.176 |  74631.744 |
841    JSON::PP   |   8931.652 |   3817.168 |
842    JSON::Syck |  24877.248 |  27776.848 |
843    JSON::XS   | 388361.481 | 227951.304 |
844    JSON::XS/2 | 227951.304 | 218453.333 |
845    JSON::XS/3 | 338250.323 | 218453.333 |
846    Storable   |  16500.016 | 135300.129 |
847    -----------+------------+------------+
848
849 That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
850 about three times faster on decoding, and over forty times faster
851 than JSON, even with pretty-printing and key sorting. It also compares
852 favourably to Storable for small amounts of data.
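
To get a rough feeling for such numbers on your own machine, something
along the following lines might do. This is only a sketch built on the core
C<Benchmark> module, not the C<eg/bench> program itself. The labels
C<plain>, C<pretty> and C<shrink> are made up for illustration, and all
three variants use the OO interface, so they only loosely correspond to
the JSON::XS rows in the table.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;

# The short sample text from above, joined into a single line.
my $json = '{"method": "handleMessage", "params": ["user1", "we were just talking"], '
         . '"id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}';

# Three coder configurations (the labels are illustrative only).
my %coder = (
    plain  => JSON::XS->new->utf8,
    pretty => JSON::XS->new->utf8->pretty->canonical,
    shrink => JSON::XS->new->utf8->shrink,
);

# Decode once so that every variant encodes the same Perl structure.
my $data = $coder{plain}->decode ($json);

# Run each encoder for at least one CPU second and print a comparison chart.
cmpthese (-1, {
    map { my $c = $coder{$_}; ($_ => sub { $c->encode ($data) }) } keys %coder
});
```

The absolute rates will differ from the table, as they depend on the
machine, the perl version and the module version.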
853
854 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
855 search API (http://nanoref.com/yahooapis/mgPdGg)):
856
857    module     |     encode |     decode |
858    -----------|------------|------------|
859    JSON       |     55.260 |     34.971 |
860    JSON::DWIW |    825.228 |   1082.513 |
861    JSON::PC   |   3571.444 |   2394.829 |
862    JSON::PP   |    210.987 |     32.574 |
863    JSON::Syck |    552.551 |    787.544 |
864    JSON::XS   |   5780.463 |   4854.519 |
865    JSON::XS/2 |   3869.998 |   4798.975 |
866    JSON::XS/3 |   5862.880 |   4798.975 |
867    Storable   |   4445.002 |   5235.027 |
868    -----------+------------+------------+
869
870 Again, JSON::XS leads by far (except for Storable, which unsurprisingly
871 decodes faster).
872
873 On large strings containing lots of high unicode characters, some modules
874 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
875 will be broken due to missing (or wrong) unicode handling. Others refuse
876 to decode or encode properly, so it was impossible to prepare a fair
877 comparison table for that case.
878
879
880 =head1 SECURITY CONSIDERATIONS
881
882 When you are using JSON in a protocol, talking to untrusted, potentially
883 hostile creatures requires relatively few measures.
884
885 First of all, your JSON decoder should be secure, that is, should not have
886 any buffer overflows. Obviously, this module should ensure that and I am
887 trying hard to make that true, but you never know.
888
889 Second, you need to avoid resource-starving attacks. That means you should
890 limit the size of JSON texts you accept, or make sure that when your
891 resources run out, that's just fine (e.g. by using a separate process that
892 can crash safely). The size of a JSON text in octets or characters is
893 usually a good indication of the size of the resources required to decode
894 it into a Perl structure. While JSON::XS can check the size of the JSON
895 text, it might be too late when you already have it in memory, so you
896 might want to check the size before you accept the string.
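
As a minimal sketch of checking the size before decoding, assuming the
JSON text arrives as a string of octets: the helper name
C<decode_untrusted> and the limit in C<$MAX_JSON_OCTETS> are hypothetical
and only serve as illustration, so pick values that fit your protocol.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limit, chosen only for this example.
my $MAX_JSON_OCTETS = 64 * 1024;

# Hypothetical helper: refuse oversized input, then decode.
sub decode_untrusted {
    my ($octets) = @_;

    # Reject the text before the decoder ever sees it.
    die "JSON text too large\n"
        if length ($octets) > $MAX_JSON_OCTETS;

    # decode croaks on malformed input, eval captures that in $@.
    my $data = eval { JSON::XS->new->utf8->decode ($octets) };
    die "invalid JSON: $@" if $@;

    return $data;
}
```

Ideally the same limit is also enforced while reading from the network,
so that an oversized text never ends up in memory in the first place.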
897
898 Third, JSON::XS recurses using the C stack when decoding objects and
899 arrays. The C stack is a limited resource: for instance, on my amd64
900 machine with 8MB of stack size I can decode around 180k nested arrays but
901 only 14k nested JSON objects (due to perl itself recursing deeply on croak
902 to free the temporary). If that is exceeded, the program crashes. To be
903 conservative, the default nesting limit is set to 512. If your process
904 has a smaller stack, you should adjust this setting accordingly with the
905 C<max_depth> method.
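
A short sketch of lowering the nesting limit with C<max_depth>. The value
32 is an arbitrary example and not a recommendation, and the deeply
nested test string is made up for illustration.

```perl
use strict;
use warnings;
use JSON::XS;

# 32 is an arbitrary illustrative limit.
my $coder = JSON::XS->new->utf8->max_depth (32);

# 100 levels of nested arrays, well beyond the limit set above.
my $deep = ("[" x 100) . ("]" x 100);

# decode croaks once the limit is exceeded, so wrap it in eval.
my $ok = eval { $coder->decode ($deep); 1 };

print $ok ? "decoded\n" : "rejected: $@";
```

Texts that stay within the limit decode as usual with the same coder
object.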
906
907 And last but least, something else could bomb you that I forgot to think
908 of. In that case, you get to keep the pieces. I am always open for hints,
909 though...
910
911 If you are using JSON::XS to return packets for consumption
912 by JavaScript scripts in a browser, you should have a look at
913 L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
914 you are vulnerable to some common attack vectors (which really are browser
915 design bugs, but it is still you who will have to deal with it, as major
916 browser developers care only for features, not about doing security
917 right).
918
919
920 =head1 BUGS
921
922 While the goal of this module is to be correct, that unfortunately does
923 not mean it's bug-free, only that I think its design is bug-free. It is
924 still relatively early in its development. If you keep reporting bugs they
925 will be fixed swiftly, though.
926
927 =cut
928
929 our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
930 our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
931
932 sub true() { $true }
933 sub false() { $false }
934
935 sub is_bool($) {
936 UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
937 # or UNIVERSAL::isa $_[0], "JSON::Literal"
938 }
939
940 XSLoader::load "JSON::XS", $VERSION;
941
942 package JSON::XS::Boolean;
943
944 use overload
945 "0+" => sub { ${$_[0]} },
946 "++" => sub { $_[0] = ${$_[0]} + 1 },
947 "--" => sub { $_[0] = ${$_[0]} - 1 },
948 fallback => 1;
949
950 1;
951
952 =head1 AUTHOR
953
954 Marc Lehmann <schmorp@schmorp.de>
955 http://home.schmorp.de/
956
957 =cut
958