/cvs/JSON-XS/XS.pm
Revision: 1.73
Committed: Sun Nov 25 19:36:54 2007 UTC (16 years, 5 months ago) by root
Branch: MAIN
Changes since 1.72: +1 -1 lines
Log Message:
prelim docs

File Contents

# Content
1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 JSON::XS - 正しくて高速な JSON シリアライザ/デシリアライザ
6 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
7
8 =head1 SYNOPSIS
9
10 use JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 =head1 DESCRIPTION
25
26 This module converts Perl data structures to JSON and vice versa. Its
27 primary goal is to be I<correct> and its secondary goal is to be
28 I<fast>. To reach the latter goal it was written in C.
29
30 As this is the n-th-something JSON module on CPAN, what was the reason
31 to write yet another JSON module? While it seems there are many JSON
32 modules, none of them correctly handle all corner cases, and in most cases
33 their maintainers are unresponsive, gone missing, or not listening to bug
34 reports for other reasons.
35
36 See COMPARISON, below, for a comparison to some other JSON modules.
37
38 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
39 vice versa.
40
41 =head2 FEATURES
42
43 =over 4
44
45 =item * correct Unicode handling
46
47 This module knows how to handle Unicode, and even documents how and when
48 it does so.
49
50 =item * round-trip integrity
51
52 When you serialise a perl data structure using only datatypes supported
53 by JSON, the deserialised data structure is identical on the Perl level.
54 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
55 like a number).
56
57 =item * strict checking of JSON correctness
58
59 There is no guessing, no generating of illegal JSON texts by default,
60 and only JSON is accepted as input by default (the latter is a security
61 feature).
62
63 =item * fast
64
65 Compared to other JSON modules, this module compares favourably in terms
66 of speed, too.
67
68 =item * simple to use
69
70 This module has both a simple functional interface as well as an OO
71 interface.
72
73 =item * reasonably versatile output formats
74
75 You can choose between the most compact guaranteed single-line format
76 possible (nice for simple line-based protocols), a pure-ascii format
77 (for when your transport is not 8-bit clean, still supports the whole
78 Unicode range), or a pretty-printed format (for when you want to read that
79 stuff). Or you can combine those features in whatever way you like.
80
81 =back
82
83 =cut
84
85 package JSON::XS;
86
87 use strict;
88
89 our $VERSION = '1.6';
90 our @ISA = qw(Exporter);
91
92 our @EXPORT = qw(to_json from_json);
93
94 use Exporter;
95 use XSLoader;
96
97 =head1 FUNCTIONAL INTERFACE
98
99 The following convenience functions are provided by this module. They are
100 exported by default:
101
102 =over 4
103
104 =item $json_text = to_json $perl_scalar
105
106 Converts the given Perl data structure to a UTF-8 encoded, binary string
107 (that is, the string contains octets only). Croaks on error.
108
109 This function call is functionally identical to:
110
111 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
112
113 except being faster.
114
115 =item $perl_scalar = from_json $json_text
116
117 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries
118 to parse that as a UTF-8 encoded JSON text, returning the resulting
119 reference. Croaks on error.
120
121 This function call is functionally identical to:
122
123 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
124
125 except being faster.
126
127 =item $is_boolean = JSON::XS::is_bool $scalar
128
129 Returns true if the passed scalar represents either JSON::XS::true or
130 JSON::XS::false, two constants that act like C<1> and C<0>, respectively,
131 and are used to represent JSON C<true> and C<false> values in Perl.
132
133 See MAPPING, below, for more information on how JSON values are mapped to
134 Perl.
135
136 =back
137
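As a minimal sketch of the equivalence described above, the following round-trips a small structure through the object-oriented calls that C<to_json> and C<from_json> are documented to match. It assumes JSON::XS is installed; the variable names and the sample data are illustrative only.

```perl
use strict;
use warnings;
use JSON::XS;

# The OO calls that to_json/from_json are documented to be equivalent to.
my $coder = JSON::XS->new->utf8;

# "\x{e9}" is a single character (e with acute accent).
my $json_text = $coder->encode ({ name => "caf\x{e9}" });

# $json_text now holds octets: the accented character became two UTF-8 bytes.
my $data = $coder->decode ($json_text);

printf "%d octets, name has %d characters\n",
   length $json_text, length $data->{name};
```

The encoded text is longer than the decoded name suggests because C<length> counts octets in the JSON text and characters in the decoded string.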
138
139 =head1 A FEW NOTES ON UNICODE AND PERL
140
141 Since this often leads to confusion, here are a few very clear words on
142 how Unicode works in Perl, modulo bugs.
143
144 =over 4
145
146 =item 1. Perl strings can store characters with ordinal values > 255.
147
148 This enables you to store Unicode characters as single characters in a
149 Perl string - very natural.
150
151 =item 2. Perl does I<not> associate an encoding with your strings.
152
153 Unless you force it to, e.g. when matching it against a regex, or printing
154 the scalar to a file, in which case Perl either interprets your string as
155 locale-encoded text, octets/binary, or as Unicode, depending on various
156 settings. In no case is an encoding stored together with your data, it is
157 I<use> that decides encoding, not any magical metadata.
158
159 =item 3. The internal utf-8 flag has no meaning with regards to the
160 encoding of your string.
161
162 Just ignore that flag unless you debug a Perl bug, a module written in
163 XS or want to dive into the internals of perl. Otherwise it will only
164 confuse you, as, despite the name, it says nothing about how your string
165 is encoded. You can have Unicode strings with that flag set, with that
166 flag clear, and you can have binary data with that flag set and that flag
167 clear. Other possibilities exist, too.
168
169 If you didn't know about that flag, so much the better: pretend it doesn't
170 exist.
171
172 =item 4. A "Unicode String" is simply a string where each character can be
173 validly interpreted as a Unicode codepoint.
174
175 If you have UTF-8 encoded data, it is no longer a Unicode string, but a
176 Unicode string encoded in UTF-8, giving you a binary string.
177
178 =item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.
179
180 It's a fact. Learn to live with it.
181
182 =back
183
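The difference between a Unicode string and its UTF-8 encoded (binary) form can be illustrated with the core Encode module. This is only a sketch; the sample character and variable names are arbitrary choices.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A Unicode string: one character with an ordinal value above 255.
my $unicode = "\x{263a}";

# Encoding yields a binary string: three characters (octets), all <= 255.
my $octets = encode ("UTF-8", $unicode);

# Decoding turns the octets back into the original Unicode string.
my $decoded = decode ("UTF-8", $octets);

printf "unicode: %d character(s), octets: %d character(s)\n",
   length $unicode, length $octets;
```

Nothing in either scalar records which of the two it is; only the way the program uses the string decides that.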
184 I hope this helps :)
185
186
187 =head1 OBJECT-ORIENTED INTERFACE
188
189 The object oriented interface lets you configure your own encoding or
190 decoding style, within the limits of supported formats.
191
192 =over 4
193
194 =item $json = new JSON::XS
195
196 Creates a new JSON::XS object that can be used to de/encode JSON
197 strings. All boolean flags described below are by default I<disabled>.
198
199 The mutators for flags all return the JSON object again and thus calls can
200 be chained:
201
202 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
203 => {"a": [1, 2]}
204
205 =item $json = $json->ascii ([$enable])
206
207 =item $enabled = $json->get_ascii
208
209 If C<$enable> is true (or missing), then the C<encode> method will not
210 generate characters outside the code range C<0..127> (which is ASCII). Any
211 Unicode characters outside that range will be escaped using either a
212 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
213 as per RFC4627. The resulting encoded JSON text can be treated as a native
214 Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
215 or any other superset of ASCII.
216
217 If C<$enable> is false, then the C<encode> method will not escape Unicode
218 characters unless required by the JSON syntax or other flags. This results
219 in a faster and more compact format.
220
221 The main use for this flag is to produce JSON texts that can be
222 transmitted over a 7-bit channel, as the encoded JSON texts will not
223 contain any 8 bit characters.
224
225 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
226 => ["\ud801\udc01"]
227
228 =item $json = $json->latin1 ([$enable])
229
230 =item $enabled = $json->get_latin1
231
232 If C<$enable> is true (or missing), then the C<encode> method will encode
233 the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
234 outside the code range C<0..255>. The resulting string can be treated as a
235 latin1-encoded JSON text or a native Unicode string. The C<decode> method
236 will not be affected in any way by this flag, as C<decode> by default
237 expects Unicode, which is a strict superset of latin1.
238
239 If C<$enable> is false, then the C<encode> method will not escape Unicode
240 characters unless required by the JSON syntax or other flags.
241
242 The main use for this flag is efficiently encoding binary data as JSON
243 text, as most octets will not be escaped, resulting in a smaller encoded
244 size. The disadvantage is that the resulting JSON text is encoded
245 in latin1 (and must correctly be treated as such when storing and
246 transferring), a rare encoding for JSON. It is therefore most useful when
247 you want to store data structures known to contain binary data efficiently
248 in files or databases, not when talking to other JSON encoders/decoders.
249
250 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
251 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
252
253 =item $json = $json->utf8 ([$enable])
254
255 =item $enabled = $json->get_utf8
256
257 If C<$enable> is true (or missing), then the C<encode> method will encode
258 the JSON result into UTF-8, as required by many protocols, while the
259 C<decode> method expects to be handed a UTF-8-encoded string. Please
260 note that UTF-8-encoded strings do not contain any characters outside the
261 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
262 versions, enabling this option might enable autodetection of the UTF-16
263 and UTF-32 encoding families, as described in RFC4627.
264
265 If C<$enable> is false, then the C<encode> method will return the JSON
266 string as a (non-encoded) Unicode string, while C<decode> thus expects a
267 Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
268 to be done yourself, e.g. using the Encode module.
269
270 Example, output UTF-16BE-encoded JSON:
271
272 use Encode;
273 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
274
275 Example, decode UTF-32LE-encoded JSON:
276
277 use Encode;
278 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
279
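To compare the C<ascii>, C<latin1> and C<utf8> flags side by side, the following sketch encodes the same two-character string with each of them. It assumes JSON::XS is installed; the sample string is an arbitrary choice.

```perl
use strict;
use warnings;
use JSON::XS;

# U+00E9 fits into latin1; U+20AC (the euro sign) does not.
my $data = ["\x{e9}\x{20ac}"];

my $unicode = JSON::XS->new->encode ($data);         # Unicode string, nothing escaped
my $ascii   = JSON::XS->new->ascii->encode ($data);  # both characters escaped
my $latin1  = JSON::XS->new->latin1->encode ($data); # only U+20AC escaped
my $utf8    = JSON::XS->new->utf8->encode ($data);   # UTF-8 octets

printf "%-7s %2d characters\n", $_->[0], length $_->[1]
   for [unicode => $unicode], [ascii => $ascii], [latin1 => $latin1], [utf8 => $utf8];
```

Only the C<utf8> result is an encoded octet string; the others can still be treated as Unicode strings.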
280 =item $json = $json->pretty ([$enable])
281
282 =item $enabled = $json->get_pretty
283
284 This enables (or disables) all of the C<indent>, C<space_before> and
285 C<space_after> (and in the future possibly more) flags in one call to
286 generate the most readable (or most compact) form possible.
287
288 Example, pretty-print some simple structure:
289
290 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
291 =>
292 {
293    "a" : [
294       1,
295       2
296    ]
297 }
298
299 =item $json = $json->indent ([$enable])
300
301 =item $enabled = $json->get_indent
302
303 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
304 format as output, putting every array member or object/hash key-value pair
305 into its own line, indenting them properly.
306
307 If C<$enable> is false, no newlines or indenting will be produced, and the
308 resulting JSON text is guaranteed not to contain any C<newlines>.
309
310 This setting has no effect when decoding JSON texts.
311
312 =item $json = $json->space_before ([$enable])
313
314 =item $enabled = $json->get_space_before
315
316 If C<$enable> is true (or missing), then the C<encode> method will add an extra
317 optional space before the C<:> separating keys from values in JSON objects.
318
319 If C<$enable> is false, then the C<encode> method will not add any extra
320 space at those places.
321
322 This setting has no effect when decoding JSON texts. You will also
323 most likely combine this setting with C<space_after>.
324
325 Example, space_before enabled, space_after and indent disabled:
326
327 {"key" :"value"}
328
329 =item $json = $json->space_after ([$enable])
330
331 =item $enabled = $json->get_space_after
332
333 If C<$enable> is true (or missing), then the C<encode> method will add an extra
334 optional space after the C<:> separating keys from values in JSON objects
335 and extra whitespace after the C<,> separating key-value pairs and array
336 members.
337
338 If C<$enable> is false, then the C<encode> method will not add any extra
339 space at those places.
340
341 This setting has no effect when decoding JSON texts.
342
343 Example, space_before and indent disabled, space_after enabled:
344
345 {"key": "value"}
346
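The two spacing flags can be combined freely. A small sketch, assuming JSON::XS is installed, that prints all four combinations for a single key-value pair:

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { key => "value" };   # a single key keeps the output deterministic

my $compact = JSON::XS->new->encode ($data);
my $before  = JSON::XS->new->space_before->encode ($data);
my $after   = JSON::XS->new->space_after->encode ($data);
my $both    = JSON::XS->new->space_before->space_after->encode ($data);

print "$_\n" for $compact, $before, $after, $both;
```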
347 =item $json = $json->relaxed ([$enable])
348
349 =item $enabled = $json->get_relaxed
350
351 If C<$enable> is true (or missing), then C<decode> will accept some
352 extensions to normal JSON syntax (see below). C<encode> will not be
353 affected in any way. I<Be aware that this option makes you accept invalid
354 JSON texts as if they were valid!> I suggest using this option only to
355 parse application-specific files written by humans (configuration files,
356 resource files etc.).
357
358 If C<$enable> is false (the default), then C<decode> will only accept
359 valid JSON texts.
360
361 Currently accepted extensions are:
362
363 =over 4
364
365 =item * list items can have an end-comma
366
367 JSON I<separates> array elements and key-value pairs with commas. This
368 can be annoying if you write JSON texts manually and want to be able to
369 quickly append elements, so this extension accepts a comma at the end of
370 such items, not just between them:
371
372 [
373 1,
374 2, <- this comma not normally allowed
375 ]
376 {
377 "k1": "v1",
378 "k2": "v2", <- this comma not normally allowed
379 }
380
381 =item * shell-style '#'-comments
382
383 Whenever JSON allows whitespace, shell-style comments are additionally
384 allowed. They are terminated by the first carriage-return or line-feed
385 character, after which more white-space and comments are allowed.
386
387 [
388 1, # this comment not allowed in JSON
389 # neither this one...
390 ]
391
392 =back
393
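The following sketch feeds a hand-written text that uses both extensions to a strict and to a relaxed decoder. It assumes a JSON::XS installation that provides C<relaxed> as documented here; the configuration text is made up for illustration.

```perl
use strict;
use warnings;
use JSON::XS;

# A hand-written configuration text using both extensions.
my $text = <<'EOT';
{
   "name": "demo",       # shell-style comment
   "ports": [80, 443,],  # end-comma in an array
}
EOT

# The strict decoder (relaxed disabled) croaks on this input ...
my $strict_ok = eval { JSON::XS->new->decode ($text); 1 };

# ... while the relaxed decoder accepts it.
my $config = JSON::XS->new->relaxed->decode ($text);

printf "strict: %s, relaxed: %s with %d ports\n",
   ($strict_ok ? "accepted" : "rejected"), $config->{name}, scalar @{ $config->{ports} };
```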
394 =item $json = $json->canonical ([$enable])
395
396 =item $enabled = $json->get_canonical
397
398 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
399 by sorting their keys. This adds a comparatively high overhead.
400
401 If C<$enable> is false, then the C<encode> method will output key-value
402 pairs in the order Perl stores them (which will likely change between runs
403 of the same script).
404
405 This option is useful if you want the same data structure to be encoded as
406 the same JSON text (given the same overall settings). If it is disabled,
407 the same hash might be encoded differently even if it contains the same data,
408 as key-value pairs have no inherent ordering in Perl.
409
410 This setting has no effect when decoding JSON texts.
411
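A short sketch of the typical use, assuming JSON::XS is installed: two hashes with the same content, built in different order, encode to the same text once C<canonical> is enabled.

```perl
use strict;
use warnings;
use JSON::XS;

my %first  = (a => 1, b => 2, c => 3);
my %second = (c => 3, b => 2, a => 1);

my $coder = JSON::XS->new->canonical;

my $text_first  = $coder->encode (\%first);
my $text_second = $coder->encode (\%second);

print "$text_first\n";
print $text_first eq $text_second ? "identical\n" : "different\n";
```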
412 =item $json = $json->allow_nonref ([$enable])
413
414 =item $enabled = $json->get_allow_nonref
415
416 If C<$enable> is true (or missing), then the C<encode> method can convert a
417 non-reference into its corresponding string, number or null JSON value,
418 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
419 values instead of croaking.
420
421 If C<$enable> is false, then the C<encode> method will croak if it isn't
422 passed an arrayref or hashref, as JSON texts must either be an object
423 or array. Likewise, C<decode> will croak if given something that is not a
424 JSON object or array.
425
426 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
427 resulting in an invalid JSON text:
428
429 JSON::XS->new->allow_nonref->encode ("Hello, World!")
430 => "Hello, World!"
431
432 =item $json = $json->allow_blessed ([$enable])
433
434 =item $enabled = $json->get_allow_blessed
435
436 If C<$enable> is true (or missing), then the C<encode> method will not
437 barf when it encounters a blessed reference. Instead, the value of the
438 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
439 disabled or no C<TO_JSON> method found) or a representation of the
440 object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
441 encoded. Has no effect on C<decode>.
442
443 If C<$enable> is false (the default), then C<encode> will throw an
444 exception when it encounters a blessed object.
445
446 =item $json = $json->convert_blessed ([$enable])
447
448 =item $enabled = $json->get_convert_blessed
449
450 If C<$enable> is true (or missing), then C<encode>, upon encountering a
451 blessed object, will check for the availability of the C<TO_JSON> method
452 on the object's class. If found, it will be called in scalar context
453 and the resulting scalar will be encoded instead of the object. If no
454 C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
455 to do.
456
457 The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
458 returns other blessed objects, those will be handled in the same
459 way. C<TO_JSON> must take care of not causing an endless recursion cycle
460 (== crash) in this case. The name of C<TO_JSON> was chosen because other
461 methods called by the Perl core (== not by the user of the object) are
462 usually in upper case letters and to avoid collisions with the C<to_json>
463 function.
464
465 This setting does not yet influence C<decode> in any way, but in the
466 future, global hooks might get installed that influence C<decode> and are
467 enabled by this setting.
468
469 If C<$enable> is false, then the C<allow_blessed> setting will decide what
470 to do when a blessed object is found.
471
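How C<allow_blessed> and C<convert_blessed> interact can be sketched with a tiny class. The C<Point> class below is hypothetical and exists only for this illustration; JSON::XS is assumed to be installed.

```perl
use strict;
use warnings;
use JSON::XS;

{
   package Point;   # hypothetical class, for illustration only

   sub new { my ($class, %args) = @_; return bless { %args }, $class }

   # Called by encode when convert_blessed is enabled.
   sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }
}

my $point = Point->new (x => 1, y => 2);

# convert_blessed: the object is replaced by whatever TO_JSON returns.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

# allow_blessed only: the object is encoded as null.
my $nulled = JSON::XS->new->allow_blessed->encode ([$point]);

# Neither flag: encode throws an exception.
my $strict_ok = eval { JSON::XS->new->encode ([$point]); 1 };

print "$converted\n$nulled\n", ($strict_ok ? "encoded" : "croaked"), "\n";
```

C<canonical> is enabled here only to make the key order of the converted object predictable.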
472 =item $json = $json->filter_json_object ([$coderef->($hashref)])
473
474 When C<$coderef> is specified, it will be called from C<decode> each
475 time it decodes a JSON object. The only argument is a reference to the
476 newly-created hash. If the code reference returns a single scalar (which
477 need not be a reference), this value (i.e. a copy of that scalar to avoid
478 aliasing) is inserted into the deserialised data structure. If it returns
479 an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
480 original deserialised hash will be inserted. This setting can slow down
481 decoding considerably.
482
483 When C<$coderef> is omitted or undefined, any existing callback will
484 be removed and C<decode> will not change the deserialised hash in any
485 way.
486
487 Example, convert all JSON objects into the integer 5:
488
489 my $js = JSON::XS->new->filter_json_object (sub { 5 });
490 # returns [5]
491 $js->decode ('[{}]')
492 # throws an exception because allow_nonref is not enabled
493 # so a lone 5 is not allowed.
494 $js->decode ('{"a":1, "b":2}');
495
496 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
497
498 Works remotely similar to C<filter_json_object>, but is only called for
499 JSON objects having a single key named C<$key>.
500
501 This C<$coderef> is called before the one specified via
502 C<filter_json_object>, if any. It gets passed the single value in the JSON
503 object. If it returns a single value, it will be inserted into the data
504 structure. If it returns nothing (not even C<undef> but the empty list),
505 the callback from C<filter_json_object> will be called next, as if no
506 single-key callback were specified.
507
508 If C<$coderef> is omitted or undefined, the corresponding callback will be
509 disabled. There can only ever be one callback for a given key.
510
511 As this callback gets called less often than the C<filter_json_object>
512 one, decoding speed will not usually suffer as much. Therefore, single-key
513 objects make excellent targets to serialise Perl objects into, especially
514 as single-key JSON objects are as close to the type-tagged value concept
515 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
516 support this in any way, so you need to make sure your data never looks
517 like a serialised Perl hash.
518
519 Typical names for the single object key are C<__class_whatever__>, or
520 C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
521 things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
522 with real hashes.
523
524 Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
525 into the corresponding C<< $WIDGET{<id>} >> object:
526
527 # return whatever is in $WIDGET{5}:
528 JSON::XS
529 ->new
530 ->filter_json_single_key_object (__widget__ => sub {
531 $WIDGET{ $_[0] }
532 })
533 ->decode ('{"__widget__": 5}')
534
535 # this can be used with a TO_JSON method in some "widget" class
536 # for serialisation to json:
537 sub WidgetBase::TO_JSON {
538 my ($self) = @_;
539
540 unless ($self->{id}) {
541 $self->{id} = ..get..some..id..;
542 $WIDGET{$self->{id}} = $self;
543 }
544
545 { __widget__ => $self->{id} }
546 }
547
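A self-contained variant of the decoding half of this example might look as follows. The C<%WIDGET> registry and its content are hypothetical stand-ins for whatever object store an application uses; JSON::XS is assumed to be installed.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical registry of already-known widgets, keyed by id.
my %WIDGET = (5 => { label => "five" });

my $coder = JSON::XS->new->filter_json_single_key_object (
   __widget__ => sub { $WIDGET{ $_[0] } },
);

# Only the object whose single key is "__widget__" is replaced.
my $data = $coder->decode ('[{"__widget__": 5}, {"other": 1}]');

print $data->[0]{label}, " ", $data->[1]{other}, "\n";
```

Note that this callback returns C<undef> for an unknown id, which counts as a single scalar and is therefore inserted in place of the object.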
548 =item $json = $json->shrink ([$enable])
549
550 =item $enabled = $json->get_shrink
551
552 Perl usually over-allocates memory a bit when allocating space for
553 strings. This flag optionally resizes strings generated by either
554 C<encode> or C<decode> to their minimum size possible. This can save
555 memory when your JSON texts are either very very long or you have many
556 short strings. It will also try to downgrade any strings to octet-form
557 if possible: perl stores strings internally either in an encoding called
558 UTF-X or in octet-form. The latter cannot store everything but uses less
559 space in general (and some buggy Perl or C code might even rely on that
560 internal representation being used).
561
562 The actual definition of what shrink does might change in future versions,
563 but it will always try to save space at the expense of time.
564
565 If C<$enable> is true (or missing), the string returned by C<encode> will
566 be shrunk-to-fit, while all strings generated by C<decode> will also be
567 shrunk-to-fit.
568
569 If C<$enable> is false, then the normal perl allocation algorithms are used.
570 If you work with your data, then this is likely to be faster.
571
572 In the future, this setting might control other things, such as converting
573 strings that look like integers or floats into integers or floats
574 internally (there is no difference on the Perl level), saving space.
575
576 =item $json = $json->max_depth ([$maximum_nesting_depth])
577
578 =item $max_depth = $json->get_max_depth
579
580 Sets the maximum nesting level (default C<512>) accepted while encoding
581 or decoding. If the JSON text or Perl data structure has an equal or
582 higher nesting level than this limit, then the encoder and decoder will
583 stop and croak at that point.
584
585 Nesting level is defined by number of hash- or arrayrefs that the encoder
586 needs to traverse to reach a given point or the number of C<{> or C<[>
587 characters without their matching closing parenthesis crossed to reach a
588 given character in a string.
589
590 Setting the maximum depth to one disallows any nesting, so that ensures
591 that the object is only a single hash/object or array.
592
593 The argument to C<max_depth> will be rounded up to the next highest power
594 of two. If no argument is given, the highest possible setting will be
595 used, which is rarely useful.
596
597 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
598
599 =item $json = $json->max_size ([$maximum_string_size])
600
601 =item $max_size = $json->get_max_size
602
603 Set the maximum length a JSON text may have (in bytes) where decoding is
604 being attempted. The default is C<0>, meaning no limit. When C<decode>
605 is called on a string longer than this number of characters it will not
606 attempt to decode the string but throw an exception. This setting has no
607 effect on C<encode> (yet).
608
609 The argument to C<max_size> will be rounded up to the next B<highest>
610 power of two (so may be more than requested). If no argument is given, the
611 limit check will be deactivated (same as when C<0> is specified).
612
613 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
614
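A rough sketch of both limits in action, assuming JSON::XS is installed. The limit values are arbitrary and far smaller than anything sensible in practice; the inputs were chosen to exceed them by a wide margin so that the exact boundary behaviour does not matter.

```perl
use strict;
use warnings;
use JSON::XS;

# Both limits are powers of two, so the rounding described above changes nothing.
my $coder = JSON::XS->new->max_depth (2)->max_size (16);

# Seven bytes, one nesting level: within both limits.
my $flat = $coder->decode ('[1,2,3]');

# Nine bytes, but nested four levels deep: exceeds max_depth.
my $deep_ok = eval { $coder->decode ('[[[[1]]]]'); 1 };

# One nesting level, but 43 bytes long: exceeds max_size.
my $long_ok = eval { $coder->decode ('[' . '1,' x 20 . '1]'); 1 };

printf "flat: %d elements, deep: %s, long: %s\n",
   scalar @$flat, ($deep_ok ? "accepted" : "rejected"), ($long_ok ? "accepted" : "rejected");
```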
615 =item $json_text = $json->encode ($perl_scalar)
616
617 Converts the given Perl data structure (a simple scalar or a reference
618 to a hash or array) to its JSON representation. Simple scalars will be
619 converted into JSON string or number sequences, while references to arrays
620 become JSON arrays and references to hashes become JSON objects. Undefined
621 Perl values (e.g. C<undef>) become JSON C<null> values. See MAPPING,
622 below, for how JSON C<true> and C<false> values can be generated.
623
624 =item $perl_scalar = $json->decode ($json_text)
625
626 The opposite of C<encode>: expects a JSON text and tries to parse it,
627 returning the resulting simple scalar or reference. Croaks on error.
628
629 JSON numbers and strings become simple Perl scalars. JSON arrays become
630 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
631 C<JSON::XS::true>, C<false> becomes C<JSON::XS::false> and C<null> becomes C<undef>.
632
633 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
634
635 This works like the C<decode> method, but instead of raising an exception
636 when there is trailing garbage after the first JSON object, it will
637 silently stop parsing there and return the number of characters consumed
638 so far.
639
640 This is useful if your JSON texts are not delimited by an outer protocol
641 (which is not the brightest thing to do in the first place) and you need
642 to know where the JSON text ends.
643
644 JSON::XS->new->decode_prefix ("[1] the tail")
645 => ([1], 3)
646
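One possible use, sketched under the assumption that several JSON texts arrive simply concatenated in one buffer with nothing in between: call C<decode_prefix> repeatedly and cut off what was consumed. JSON::XS is assumed to be installed and the buffer content is made up.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder  = JSON::XS->new;
my $buffer = '[1,2][3]{"a":4}';   # three JSON texts, not delimited
my @docs;

while (length $buffer) {
   # Returns the decoded value and the number of characters it occupied.
   my ($value, $consumed) = $coder->decode_prefix ($buffer);

   push @docs, $value;
   substr ($buffer, 0, $consumed, "");   # drop the consumed prefix
}

printf "decoded %d JSON texts\n", scalar @docs;
```

A real protocol would also have to deal with incomplete texts at the end of the buffer, which this sketch does not attempt.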
647 =back
648
649
650 =head1 MAPPING
651
652 This section describes how JSON::XS maps Perl values to JSON values and
653 vice versa. These mappings are designed to "do the right thing" in most
654 circumstances automatically, preserving round-tripping characteristics
655 (what you put in comes out as something equivalent).
656
657 For the more enlightened: note that in the following descriptions,
658 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
659 refers to the abstract Perl language itself.
660
661
662 =head2 JSON -> PERL
663
664 =over 4
665
666 =item object
667
668 A JSON object becomes a reference to a hash in Perl. No ordering of object
669 keys is preserved (JSON does not preserve object key ordering itself).
670
671 =item array
672
673 A JSON array becomes a reference to an array in Perl.
674
675 =item string
676
677 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
678 are represented by the same codepoints in the Perl string, so no manual
679 decoding is necessary.
680
681 =item number
682
683 A JSON number becomes either an integer, numeric (floating point) or
684 string scalar in perl, depending on its range and any fractional parts. On
685 the Perl level, there is no difference between those as Perl handles all
686 the conversion details, but an integer may take slightly less memory and
687 might represent more values exactly than (floating point) numbers.
688
689 If the number consists of digits only, JSON::XS will try to represent
690 it as an integer value. If that fails, it will try to represent it as
691 a numeric (floating point) value if that is possible without loss of
692 precision. Otherwise it will preserve the number as a string value.
693
694 Numbers containing a fractional or exponential part will always be
695 represented as numeric (floating point) values, possibly at a loss of
696 precision.
697
698 This might create round-tripping problems as numbers might become strings,
699 but as Perl is typeless there is no other way to do it.
700
701 =item true, false
702
703 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
704 respectively. They are overloaded to act almost exactly like the numbers
705 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
706 the C<JSON::XS::is_bool> function.
707
708 =item null
709
710 A JSON null atom becomes C<undef> in Perl.
711
712 =back
713
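The mapping above can be observed directly on a decoded structure. A small sketch, assuming JSON::XS is installed; the JSON text is made up for illustration.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = JSON::XS->new->decode (
   '{"list":[1,"two"],"ok":true,"off":false,"none":null}'
);

my $list_type = ref $data->{list};                     # JSON array  -> array reference
my $is_bool   = JSON::XS::is_bool ($data->{ok});       # JSON true   -> boolean object
my $plain     = JSON::XS::is_bool ($data->{list}[0]);  # JSON number -> ordinary scalar

print "list is an $list_type reference\n";
print "ok is ",   ($data->{ok}  ? "true" : "false"), "\n";
print "off is ",  ($data->{off} ? "true" : "false"), "\n";
print "none is ", (defined $data->{none} ? "defined" : "undef"), "\n";
```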
714
715 =head2 PERL -> JSON
716
717 The mapping from Perl to JSON is slightly more difficult, as Perl is a
718 truly typeless language, so we can only guess which JSON type is meant by
719 a Perl value.
720
721 =over 4
722
723 =item hash references
724
725 Perl hash references become JSON objects. As there is no inherent ordering
726 in hash keys (or JSON objects), they will usually be encoded in a
727 pseudo-random order that can change between runs of the same program but
728 stays generally the same within a single run of a program. JSON::XS can
729 optionally sort the hash keys (determined by the I<canonical> flag), so
730 the same datastructure will serialise to the same JSON text (given same
731 settings and version of JSON::XS), but this incurs a runtime overhead
732 and is only rarely useful, e.g. when you want to compare some JSON text
733 against another for equality.
734
735 =item array references
736
737 Perl array references become JSON arrays.
738
739 =item other references
740
741 Other unblessed references are generally not allowed and will cause an
742 exception to be thrown, except for references to the integers C<0> and
743 C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
744 also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
745
746 to_json [\0,JSON::XS::true] # yields [false,true]
747
748 =item JSON::XS::true, JSON::XS::false
749
750 These special values become JSON true and JSON false values,
751 respectively. You can also use C<\1> and C<\0> directly if you want.
752
753 =item blessed objects
754
755 Blessed objects are not allowed by default and cause C<encode> to throw an
756 exception. See the C<allow_blessed> and C<convert_blessed> methods, above,
757 for ways to encode them as C<null> or via a C<TO_JSON> method.
758
759 =item simple scalars
760
761 Simple Perl scalars (any scalar that is not a reference) are the most
762 difficult objects to encode: JSON::XS will encode undefined scalars as
763 JSON null value, scalars that have last been used in a string context
764 before encoding as JSON strings and anything else as number value:
765
766 # dump as number
767 to_json [2] # yields [2]
768 to_json [-3.0e17] # yields [-3e+17]
769 my $value = 5; to_json [$value] # yields [5]
770
771 # used as string, so dump as string
772 print $value;
773 to_json [$value] # yields ["5"]
774
775 # undef becomes null
776 to_json [undef] # yields [null]
777
778 You can force the type to be a JSON string by stringifying it:
779
780 my $x = 3.1; # some variable containing a number
781 "$x"; # stringified
782 $x .= ""; # another, more awkward way to stringify
783 print $x; # perl does it for you, too, quite often
784
785 You can force the type to be a JSON number by numifying it:
786
787 my $x = "3"; # some variable containing a string
788 $x += 0; # numify it, ensuring it will be dumped as a number
789 $x *= 1; # same thing, the choice is yours.
790
791 You cannot currently force the type in other, less obscure, ways. Tell me
792 if you need this capability.
793
794 =back
795
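The string-versus-number guessing for simple scalars can be sketched as follows, assuming JSON::XS is installed. Each scalar is created in exactly one way and never reused in another context, so the result does not depend on how a particular perl caches conversions.

```perl
use strict;
use warnings;
use JSON::XS;

my $number      = 5;          # numeric literal, never used as a string
my $string      = "5";        # string literal
my $numified    = "42" + 0;   # string forced into a number
my $stringified = 3.5 . "";   # number forced into a string

my $json = JSON::XS->new->encode (
   [$number, $string, $numified, $stringified, undef]
);

print "$json\n";
```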
796
797 =head1 COMPARISON
798
799 As already mentioned, this module was created because none of the existing
800 JSON modules could be made to work correctly. First I will describe the
801 problems (or pleasures) I encountered with various existing JSON modules,
802 followed by some benchmark values. JSON::XS was designed not to suffer
803 from any of these problems or limitations.
804
805 =over 4
806
807 =item JSON 1.07
808
809 Slow (but very portable, as it is written in pure Perl).
810
811 Undocumented/buggy Unicode handling (how JSON handles Unicode values is
812 undocumented. One can get far by feeding it Unicode strings and doing
813 en-/decoding oneself, but Unicode escapes are not working properly).
814
815 No round-tripping (strings get clobbered if they look like numbers, e.g.
816 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
817 decode into the number 2).
818
819 =item JSON::PC 0.01
820
821 Very fast.
822
823 Undocumented/buggy Unicode handling.
824
825 No round-tripping.
826
827 Has problems handling many Perl values (e.g. regex results and other magic
828 values will make it croak).
829
830 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
831 which is not a valid JSON text).
832
833 Unmaintained (maintainer unresponsive for many months, bugs are not
834 getting fixed).
835
836 =item JSON::Syck 0.21
837
838 Very buggy (often crashes).
839
840 Very inflexible (no human-readable format supported, format pretty much
841 undocumented. I need at least a format for easy reading by humans and a
842 single-line compact format for use in a protocol, and preferably a way to
843 generate ASCII-only JSON texts).
844
845 Completely broken (and confusingly documented) Unicode handling (Unicode
846 escapes are not working properly, you need to set ImplicitUnicode to
847 I<different> values on en- and decoding to get symmetric behaviour).
848
849 No round-tripping (simple cases work, but this depends on whether the scalar
850 value was used in a numeric context or not).
851
852 Dumping hashes may skip hash values depending on iterator state.
853
854 Unmaintained (maintainer unresponsive for many months, bugs are not
855 getting fixed).
856
857 Does not check input for validity (i.e. will accept non-JSON input and
858 return "something" instead of raising an exception. This is a security
859 issue: imagine two banks transferring money between each other using
860 JSON. One bank might parse a given non-JSON request and deduct money,
861 while the other might reject the transaction with a syntax error. While a
862 good protocol will at least recover, that is extra unnecessary work and
863 the transaction will still not succeed).
864
865 =item JSON::DWIW 0.04
866
867 Very fast. Very natural. Very nice.
868
869 Undocumented Unicode handling (but the best of the pack. Unicode escapes
870 still don't get parsed properly).
871
872 Very inflexible.
873
874 No round-tripping.
875
876 Does not generate valid JSON texts (key strings are often unquoted, empty keys
877 result in nothing being output).
878
879 Does not check input for validity.
880
881 =back
882
883
884 =head2 JSON and YAML
885
886 You often hear that JSON is a subset (or a close subset) of YAML. This is,
887 however, mass hysteria and very far from the truth. In general, there is
888 no way to configure JSON::XS to output a data structure as valid YAML.
889
890 If you really must use JSON::XS to generate YAML, you should use this
891 algorithm (subject to change in future versions):
892
893 my $to_yaml = JSON::XS->new->utf8->space_after (1);
894 my $yaml = $to_yaml->encode ($ref) . "\n";
895
896 This will usually generate JSON texts that also parse as valid
897 YAML. Please note that YAML has hardcoded limits on (simple) object key
898 lengths that JSON doesn't have, so you should make sure that your hash
899 keys are noticeably shorter than the 1024 characters YAML allows.
900
901 There might be other incompatibilities that I am not aware of. In general
902 you should not try to generate YAML with a JSON generator or vice versa,
903 or try to parse JSON with a YAML parser or vice versa: chances are high
904 that you will run into severe interoperability problems.
905
906
907 =head2 SPEED
908
909 It seems that JSON::XS is surprisingly fast, as shown in the following
910 tables. They have been generated with the help of the C<eg/bench> program
911 in the JSON::XS distribution, to make it easy to compare on your own
912 system.
913
914 First comes a comparison between various modules using a very short
915 single-line JSON string:
916
917 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
918 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
919
920 It shows the number of encodes/decodes per second (JSON::XS uses
921 the functional interface, while JSON::XS/2 uses the OO interface
922 with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
923 shrink). Higher is better:
924
925 module | encode | decode |
926 -----------|------------|------------|
927 JSON 1.x | 4990.842 | 4088.813 |
928 JSON::DWIW | 51653.990 | 71575.154 |
929 JSON::PC | 65948.176 | 74631.744 |
930 JSON::PP | 8931.652 | 3817.168 |
931 JSON::Syck | 24877.248 | 27776.848 |
932 JSON::XS | 388361.481 | 227951.304 |
933 JSON::XS/2 | 227951.304 | 218453.333 |
934 JSON::XS/3 | 338250.323 | 218453.333 |
935 Storable | 16500.016 | 135300.129 |
936 -----------+------------+------------+
937
938 That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
939 about three times faster on decoding, and over forty times faster
940 than JSON, even with pretty-printing and key sorting. It also compares
941 favourably to Storable for small amounts of data.
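
If you only want a rough figure for JSON::XS on your own machine, a sketch along the following lines may help. It is not the C<eg/bench> program, it relies on the core C<Benchmark> module (an assumption of this example, not something the distribution prescribes), and its numbers will not be directly comparable to the table above.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use JSON::XS;

# The short single-line test string shown above.
my $json_text = '{"method": "handleMessage", "params": ["user1", "we were just talking"], '
              . '"id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}';

my $coder = JSON::XS->new->utf8;
my $data  = $coder->decode ($json_text);

# Run each operation for at least one CPU second; Benchmark prints
# the number of iterations per second for each entry.
my $results = timethese (-1, {
    encode => sub { $coder->encode ($data) },
    decode => sub { $coder->decode ($json_text) },
});
```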
942
943 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
944 search API, http://nanoref.com/yahooapis/mgPdGg):
945
946 module | encode | decode |
947 -----------|------------|------------|
948 JSON 1.x | 55.260 | 34.971 |
949 JSON::DWIW | 825.228 | 1082.513 |
950 JSON::PC | 3571.444 | 2394.829 |
951 JSON::PP | 210.987 | 32.574 |
952 JSON::Syck | 552.551 | 787.544 |
953 JSON::XS | 5780.463 | 4854.519 |
954 JSON::XS/2 | 3869.998 | 4798.975 |
955 JSON::XS/3 | 5862.880 | 4798.975 |
956 Storable | 4445.002 | 5235.027 |
957 -----------+------------+------------+
958
959 Again, JSON::XS leads by far (except for Storable which, unsurprisingly,
960 decodes faster).
961
962 On large strings containing lots of high Unicode characters, some modules
963 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
964 will be broken due to missing (or wrong) Unicode handling. Others refuse
965 to decode or encode properly, so it was impossible to prepare a fair
966 comparison table for that case.
967
968
969 =head1 SECURITY CONSIDERATIONS
970
971 When you are using JSON in a protocol, talking to untrusted potentially
972 hostile creatures requires relatively few measures.
973
974 First of all, your JSON decoder should be secure, that is, should not have
975 any buffer overflows. Obviously, this module should ensure that and I am
976 trying hard to make that true, but you never know.
977
978 Second, you need to avoid resource-starving attacks. That means you should
979 limit the size of JSON texts you accept, or make sure that when your
980 resources run out, that's just fine (e.g. by using a separate process that
981 can crash safely). The size of a JSON text in octets or characters is
982 usually a good indication of the size of the resources required to decode
983 it into a Perl structure. While JSON::XS can check the size of the JSON
984 text, it might be too late when you already have it in memory, so you
985 might want to check the size before you accept the string.
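
A minimal sketch of such a size check follows. The limit and the helper name C<decode_limited> are hypothetical, and in a real protocol the check belongs even earlier, at the point where the text is read from the peer (for example against a length header), so that an oversized text never reaches memory at all.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limit; pick one that suits your protocol.
my $MAX_JSON_OCTETS = 64 * 1024;

# Hypothetical helper: dies before decoding if the text is too big,
# otherwise returns the decoded Perl structure.
sub decode_limited {
    my ($octets) = @_;

    die "JSON text too large\n"
        if length ($octets) > $MAX_JSON_OCTETS;

    return JSON::XS->new->utf8->decode ($octets);
}

my $ok = decode_limited ('{"id":1}');

# An oversized (but otherwise valid) text is rejected without being parsed.
my $big = eval { decode_limited ("[" . ("0," x $MAX_JSON_OCTETS) . "0]") };
my $err = $@;
```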
986
987 Third, JSON::XS recurses using the C stack when decoding objects and
988 arrays. The C stack is a limited resource: for instance, on my amd64
989 machine with 8MB of stack size I can decode around 180k nested arrays but
990 only 14k nested JSON objects (due to perl itself recursing deeply on croak
991 to free the temporary). If that is exceeded, the program crashes. To be
992 conservative, the default nesting limit is set to 512. If your process
993 has a smaller stack, you should adjust this setting accordingly with the
994 C<max_depth> method.
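
As a sketch of how the limit behaves, the following uses a deliberately tiny C<max_depth> so the effect is easy to see. The value 4 is chosen only for illustration and is not a recommendation.

```perl
use strict;
use warnings;
use JSON::XS;

# A deliberately tiny limit for illustration; the default is 512.
my $coder = JSON::XS->new->max_depth (4);

# Two levels of nesting: well within the limit, so this decodes.
my $shallow = $coder->decode ('[[1]]');

# Fifty levels of nesting: far beyond the limit, so decode croaks
# instead of recursing further on the C stack.
my $deep_text = ("[" x 50) . ("]" x 50);
my $deep      = eval { $coder->decode ($deep_text) };
my $error     = $@;

print "rejected: $error" unless defined $deep;
```

Wrapping the call in C<eval> turns the croak into an ordinary error that the caller can log or report to the peer.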
995
996 And last but least, something else could bomb you that I forgot to think
997 of. In that case, you get to keep the pieces. I am always open to hints,
998 though...
999
1000 If you are using JSON::XS to return packets for consumption
1001 by JavaScript scripts in a browser you should have a look at
1002 L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
1003 you are vulnerable to some common attack vectors (which really are browser
1004 design bugs, but it is still you who will have to deal with it, as major
1005 browser developers care only for features, not about doing security
1006 right).
1007
1008
1009 =head1 THREADS
1010
1011 This module is I<not> guaranteed to be thread safe and there are no
1012 plans to change this until Perl gets thread support (as opposed to the
1013 horribly slow so-called "threads" which are simply slow and bloated
1014 process simulations - use fork, it's I<much> faster, cheaper, better).
1015
1016 (It might actually work, but you have been warned).
1017
1018
1019 =head1 BUGS
1020
1021 While the goal of this module is to be correct, that unfortunately does
1022 not mean it's bug-free, only that I think its design is bug-free. It is
1023 still relatively early in its development. If you keep reporting bugs they
1024 will be fixed swiftly, though.
1025
1026 Please refrain from using rt.cpan.org or any other bug reporting
1027 service. I put the contact address into my modules for a reason.
1028
1029 =cut
1030
1031 our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
1032 our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
1033
1034 sub true() { $true }
1035 sub false() { $false }
1036
1037 sub is_bool($) {
1038 UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
1039 # or UNIVERSAL::isa $_[0], "JSON::Literal"
1040 }
1041
1042 XSLoader::load "JSON::XS", $VERSION;
1043
1044 package JSON::XS::Boolean;
1045
1046 use overload
1047 "0+" => sub { ${$_[0]} },
1048 "++" => sub { $_[0] = ${$_[0]} + 1 },
1049 "--" => sub { $_[0] = ${$_[0]} - 1 },
1050 fallback => 1;
1051
1052 1;
1053
1054 =head1 AUTHOR
1055
1056 Marc Lehmann <schmorp@schmorp.de>
1057 http://home.schmorp.de/
1058
1059 =cut
1060