/cvs/JSON-XS/XS.pm
Revision: 1.69
Committed: Tue Oct 23 03:31:14 2007 UTC (16 years, 6 months ago) by root
Branch: MAIN
Changes since 1.68: +4 -4 lines
Log Message:
*** empty log message ***

1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 JSON::XS - a correct and fast JSON serialiser/deserialiser
6 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
7
8 =head1 SYNOPSIS
9
10 use JSON::XS;
11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 =head1 DESCRIPTION
25
26 This module converts Perl data structures to JSON and vice versa. Its
27 primary goal is to be I<correct> and its secondary goal is to be
28 I<fast>. To reach the latter goal it was written in C.
29
30 As this is the n-th-something JSON module on CPAN, what was the reason
31 to write yet another JSON module? While it seems there are many JSON
32 modules, none of them correctly handle all corner cases, and in most cases
33 their maintainers are unresponsive, gone missing, or not listening to bug
34 reports for other reasons.
35
36 See COMPARISON, below, for a comparison to some other JSON modules.
37
38 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
39 vice versa.
40
41 =head2 FEATURES
42
43 =over 4
44
45 =item * correct Unicode handling
46
47 This module knows how to handle Unicode, and even documents how and when
48 it does so.
49
50 =item * round-trip integrity
51
52 When you serialise a perl data structure using only datatypes supported
53 by JSON, the deserialised data structure is identical on the Perl level.
54 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
55 like a number).
56
57 =item * strict checking of JSON correctness
58
59 There is no guessing, no generating of illegal JSON texts by default,
60 and only JSON is accepted as input by default (the latter is a security
61 feature).
62
63 =item * fast
64
65 Compared to other JSON modules, this module compares favourably in terms
66 of speed, too.
67
68 =item * simple to use
69
70 This module has both a simple functional interface as well as an OO
71 interface.
72
73 =item * reasonably versatile output formats
74
75 You can choose between the most compact guaranteed single-line format
76 possible (nice for simple line-based protocols), a pure-ascii format
77 (for when your transport is not 8-bit clean, still supports the whole
78 Unicode range), or a pretty-printed format (for when you want to read that
79 stuff). Or you can combine those features in whatever way you like.
80
81 =back
82
83 =cut
84
85 package JSON::XS;
86
87 use strict;
88
89 our $VERSION = '1.52';
90 our @ISA = qw(Exporter);
91
92 our @EXPORT = qw(to_json from_json);
93
94 use Exporter;
95 use XSLoader;
96
97 =head1 FUNCTIONAL INTERFACE
98
99 The following convenience functions are provided by this module. They are
100 exported by default:
101
102 =over 4
103
104 =item $json_text = to_json $perl_scalar
105
106 Converts the given Perl data structure to a UTF-8 encoded, binary string
107 (that is, the string contains octets only). Croaks on error.
108
109 This function call is functionally identical to:
110
111 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
112
113 except that it is faster.
114
115 =item $perl_scalar = from_json $json_text
116
117 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries
118 to parse that as a UTF-8 encoded JSON text, returning the resulting
119 reference. Croaks on error.
120
121 This function call is functionally identical to:
122
123 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
124
125 except that it is faster.
126
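A minimal round-trip sketch of the behaviour described above. It is written against the object-oriented equivalent quoted in the text (C<< JSON::XS->new->utf8 >>), so it does not depend on which function names a given release exports. The variable names and sample data are illustrative assumptions.

```perl
use strict;
use warnings;
use JSON::XS;

# Same configuration the functional interface is documented to use.
my $coder = JSON::XS->new->utf8;

# Sample data (hypothetical): one string with a non-ASCII character.
my $data = { name => "caf\x{e9}", tags => ["a", "b"] };

# encode returns a UTF-8 encoded octet string ...
my $json_text = $coder->encode ($data);

# ... and decode turns those octets back into an equivalent structure.
my $copy = $coder->decode ($json_text);

print "round trip ok\n" if $copy->{name} eq $data->{name};
```

The encoded text contains octets only, so it can be written to a binary file handle or socket without any further encoding step.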
127 =item $is_boolean = JSON::XS::is_bool $scalar
128
129 Returns true if the passed scalar represents either JSON::XS::true or
130 JSON::XS::false, two constants that act like C<1> and C<0>, respectively
131 and are used to represent JSON C<true> and C<false> values in Perl.
132
133 See MAPPING, below, for more information on how JSON values are mapped to
134 Perl.
135
136 =back
137
138
139 =head1 A FEW NOTES ON UNICODE AND PERL
140
141 Since this often leads to confusion, here are a few very clear words on
142 how Unicode works in Perl, modulo bugs.
143
144 =over 4
145
146 =item 1. Perl strings can store characters with ordinal values > 255.
147
148 This enables you to store Unicode characters as single characters in a
149 Perl string - very natural.
150
151 =item 2. Perl does I<not> associate an encoding with your strings.
152
153 Unless you force it to, e.g. when matching it against a regex, or printing
154 the scalar to a file, in which case Perl either interprets your string as
155 locale-encoded text, octets/binary, or as Unicode, depending on various
156 settings. In no case is an encoding stored together with your data, it is
157 I<use> that decides encoding, not any magical metadata.
158
159 =item 3. The internal utf-8 flag has no meaning with regards to the
160 encoding of your string.
161
162 Just ignore that flag unless you debug a Perl bug, a module written in
163 XS or want to dive into the internals of perl. Otherwise it will only
164 confuse you, as, despite the name, it says nothing about how your string
165 is encoded. You can have Unicode strings with that flag set, with that
166 flag clear, and you can have binary data with that flag set and that flag
167 clear. Other possibilities exist, too.
168
169 If you didn't know about that flag, all the better: pretend it doesn't
170 exist.
171
172 =item 4. A "Unicode String" is simply a string where each character can be
173 validly interpreted as a Unicode codepoint.
174
175 If you have UTF-8 encoded data, it is no longer a Unicode string, but a
176 Unicode string encoded in UTF-8, giving you a binary string.
177
178 =item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.
179
180 It's a fact. Learn to live with it.
181
182 =back
183
184 I hope this helps :)
185
186
187 =head1 OBJECT-ORIENTED INTERFACE
188
189 The object oriented interface lets you configure your own encoding or
190 decoding style, within the limits of supported formats.
191
192 =over 4
193
194 =item $json = new JSON::XS
195
196 Creates a new JSON::XS object that can be used to de/encode JSON
197 strings. All boolean flags described below are by default I<disabled>.
198
199 The mutators for flags all return the JSON object again and thus calls can
200 be chained:
201
202 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
203 => {"a": [1, 2]}
204
205 =item $json = $json->ascii ([$enable])
206
207 If C<$enable> is true (or missing), then the C<encode> method will not
208 generate characters outside the code range C<0..127> (which is ASCII). Any
209 Unicode characters outside that range will be escaped using either a
210 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
211 as per RFC4627. The resulting encoded JSON text can be treated as a native
212 Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
213 or any other superset of ASCII.
214
215 If C<$enable> is false, then the C<encode> method will not escape Unicode
216 characters unless required by the JSON syntax or other flags. This results
217 in a faster and more compact format.
218
219 The main use for this flag is to produce JSON texts that can be
220 transmitted over a 7-bit channel, as the encoded JSON texts will not
221 contain any 8 bit characters.
222
223 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
224 => ["\ud801\udc01"]
225
226 =item $json = $json->latin1 ([$enable])
227
228 If C<$enable> is true (or missing), then the C<encode> method will encode
229 the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
230 outside the code range C<0..255>. The resulting string can be treated as a
231 latin1-encoded JSON text or a native Unicode string. The C<decode> method
232 will not be affected in any way by this flag, as C<decode> by default
233 expects Unicode, which is a strict superset of latin1.
234
235 If C<$enable> is false, then the C<encode> method will not escape Unicode
236 characters unless required by the JSON syntax or other flags.
237
238 The main use for this flag is efficiently encoding binary data as JSON
239 text, as most octets will not be escaped, resulting in a smaller encoded
240 size. The disadvantage is that the resulting JSON text is encoded
241 in latin1 (and must correctly be treated as such when storing and
242 transferring), a rare encoding for JSON. It is therefore most useful when
243 you want to store data structures known to contain binary data efficiently
244 in files or databases, not when talking to other JSON encoders/decoders.
245
246 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
247 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
248
249 =item $json = $json->utf8 ([$enable])
250
251 If C<$enable> is true (or missing), then the C<encode> method will encode
252 the JSON result into UTF-8, as required by many protocols, while the
253 C<decode> method expects to be handed a UTF-8-encoded string. Please
254 note that UTF-8-encoded strings do not contain any characters outside the
255 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
256 versions, enabling this option might enable autodetection of the UTF-16
257 and UTF-32 encoding families, as described in RFC4627.
258
259 If C<$enable> is false, then the C<encode> method will return the JSON
260 string as a (non-encoded) Unicode string, while C<decode> thus expects a
261 Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
262 to be done yourself, e.g. using the Encode module.
263
264 Example, output UTF-16BE-encoded JSON:
265
266 use Encode;
267 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
268
269 Example, decode UTF-32LE-encoded JSON:
270
271 use Encode;
272 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
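The difference between the two settings can be seen by comparing string lengths. This is a small sketch, not part of the module. The sample value is an assumption chosen so that one character needs two octets in UTF-8.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = ["\x{e9}"];   # one character, U+00E9

# utf8 disabled: the result is a Unicode string of 5 characters.
my $chars = JSON::XS->new->encode ($data);

# utf8 enabled: the same text as UTF-8 octets; U+00E9 now takes two octets.
my $octets = JSON::XS->new->utf8->encode ($data);

# prints: 5 characters, 6 octets
printf "%d characters, %d octets\n", length $chars, length $octets;
```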
273
274 =item $json = $json->pretty ([$enable])
275
276 This enables (or disables) all of the C<indent>, C<space_before> and
277 C<space_after> (and in the future possibly more) flags in one call to
278 generate the most readable (or most compact) form possible.
279
280 Example, pretty-print some simple structure:
281
282 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
283 =>
284 {
285    "a" : [
286       1,
287       2
288    ]
289 }
290
291 =item $json = $json->indent ([$enable])
292
293 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
294 format as output, putting every array member or object/hash key-value pair
295 into its own line, indenting them properly.
296
297 If C<$enable> is false, no newlines or indenting will be produced, and the
298 resulting JSON text is guaranteed not to contain any C<newlines>.
299
300 This setting has no effect when decoding JSON texts.
301
302 =item $json = $json->space_before ([$enable])
303
304 If C<$enable> is true (or missing), then the C<encode> method will add an extra
305 optional space before the C<:> separating keys from values in JSON objects.
306
307 If C<$enable> is false, then the C<encode> method will not add any extra
308 space at those places.
309
310 This setting has no effect when decoding JSON texts. You will also
311 most likely combine this setting with C<space_after>.
312
313 Example, space_before enabled, space_after and indent disabled:
314
315 {"key" :"value"}
316
317 =item $json = $json->space_after ([$enable])
318
319 If C<$enable> is true (or missing), then the C<encode> method will add an extra
320 optional space after the C<:> separating keys from values in JSON objects
321 and extra whitespace after the C<,> separating key-value pairs and array
322 members.
323
324 If C<$enable> is false, then the C<encode> method will not add any extra
325 space at those places.
326
327 This setting has no effect when decoding JSON texts.
328
329 Example, space_before and indent disabled, space_after enabled:
330
331 {"key": "value"}
332
333 =item $json = $json->relaxed ([$enable])
334
335 If C<$enable> is true (or missing), then C<decode> will accept some
336 extensions to normal JSON syntax (see below). C<encode> will not be
337 affected in any way. I<Be aware that this option makes you accept invalid
338 JSON texts as if they were valid!> I suggest using this option only to
339 parse application-specific files written by humans (configuration files,
340 resource files etc.).
341
342 If C<$enable> is false (the default), then C<decode> will only accept
343 valid JSON texts.
344
345 Currently accepted extensions are:
346
347 =over 4
348
349 =item * list items can have an end-comma
350
351 JSON I<separates> array elements and key-value pairs with commas. This
352 can be annoying if you write JSON texts manually and want to be able to
353 quickly append elements, so this extension accepts a comma at the end of
354 such items, not just between them:
355
356 [
357 1,
358 2, <- this comma not normally allowed
359 ]
360 {
361 "k1": "v1",
362 "k2": "v2", <- this comma not normally allowed
363 }
364
365 =item * shell-style '#'-comments
366
367 Whenever JSON allows whitespace, shell-style comments are additionally
368 allowed. They are terminated by the first carriage-return or line-feed
369 character, after which more white-space and comments are allowed.
370
371 [
372 1, # this comment not allowed in JSON
373 # neither this one...
374 ]
375
376 =back
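Both extensions can be tried with a short sketch such as the following. The input text and variable names are made up for illustration. The strict decoder is included only to show that the same text is rejected without C<relaxed>.

```perl
use strict;
use warnings;
use JSON::XS;

# A hand-written list with a comment and a trailing comma (hypothetical input).
my $config = <<'EOT';
[
  1,
  2,   # trailing comma and comments are only accepted in relaxed mode
]
EOT

# relaxed enabled: both extensions are accepted.
my $list = JSON::XS->new->relaxed->decode ($config);

# relaxed disabled (the default): decode croaks, so the eval yields false.
my $strict_ok = eval { JSON::XS->new->decode ($config); 1 };

print scalar (@$list), " elements; strict decoder ",
      ($strict_ok ? "accepted" : "rejected"), " the text\n";
```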
377
378 =item $json = $json->canonical ([$enable])
379
380 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
381 by sorting their keys. This adds a comparatively high overhead.
382
383 If C<$enable> is false, then the C<encode> method will output key-value
384 pairs in the order Perl stores them (which will likely change between runs
385 of the same script).
386
387 This option is useful if you want the same data structure to be encoded as
388 the same JSON text (given the same overall settings). If it is disabled,
389 the same hash might be encoded differently even if it contains the same data,
390 as key-value pairs have no inherent ordering in Perl.
391
392 This setting has no effect when decoding JSON texts.
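As a small illustration (the hash contents are arbitrary), enabling C<canonical> makes the output independent of Perl's internal hash order, which is what makes comparing two JSON texts as strings meaningful:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->canonical;

# Keys are given in arbitrary order; the output is sorted by key.
my $text = $json->encode ({ b => 1, c => [3], a => 2 });

print "$text\n";   # prints {"a":2,"b":1,"c":[3]}
```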
393
394 =item $json = $json->allow_nonref ([$enable])
395
396 If C<$enable> is true (or missing), then the C<encode> method can convert a
397 non-reference into its corresponding string, number or null JSON value,
398 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
399 values instead of croaking.
400
401 If C<$enable> is false, then the C<encode> method will croak if it isn't
402 passed an arrayref or hashref, as JSON texts must either be an object
403 or array. Likewise, C<decode> will croak if given something that is not a
404 JSON object or array.
405
406 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
407 resulting in an invalid JSON text:
408
409 JSON::XS->new->allow_nonref->encode ("Hello, World!")
410 => "Hello, World!"
411
412 =item $json = $json->allow_blessed ([$enable])
413
414 If C<$enable> is true (or missing), then the C<encode> method will not
415 barf when it encounters a blessed reference. Instead, the value of the
416 B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
417 disabled or no C<TO_JSON> method found) or a representation of the
418 object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
419 encoded. Has no effect on C<decode>.
420
421 If C<$enable> is false (the default), then C<encode> will throw an
422 exception when it encounters a blessed object.
423
424 =item $json = $json->convert_blessed ([$enable])
425
426 If C<$enable> is true (or missing), then C<encode>, upon encountering a
427 blessed object, will check for the availability of the C<TO_JSON> method
428 on the object's class. If found, it will be called in scalar context
429 and the resulting scalar will be encoded instead of the object. If no
430 C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
431 to do.
432
433 The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
434 returns other blessed objects, those will be handled in the same
435 way. C<TO_JSON> must take care of not causing an endless recursion cycle
436 (== crash) in this case. The name of C<TO_JSON> was chosen because other
437 methods called by the Perl core (== not by the user of the object) are
438 usually in upper case letters and to avoid collisions with the C<to_json>
439 function.
440
441 This setting does not yet influence C<decode> in any way, but in the
442 future, global hooks might get installed that influence C<decode> and are
443 enabled by this setting.
444
445 If C<$enable> is false, then the C<allow_blessed> setting will decide what
446 to do when a blessed object is found.
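The interplay of the two flags can be sketched as follows. The C<Point> class is hypothetical and exists only for this example. C<canonical> is enabled solely to make the printed output predictable.

```perl
use strict;
use warnings;
use JSON::XS;

package Point;   # hypothetical class, for illustration only

sub new { my ($class, %args) = @_; return bless { %args }, $class }

# Called by encode when convert_blessed is enabled; returns plain data.
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

package main;

my $point = Point->new (x => 1, y => 2);

# convert_blessed: the object is replaced by whatever TO_JSON returns.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

# allow_blessed only: the object is encoded as null.
my $allowed = JSON::XS->new->allow_blessed->encode ([$point]);

print "$converted\n";   # prints [{"x":1,"y":2}]
print "$allowed\n";     # prints [null]
```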
447
448 =item $json = $json->filter_json_object ([$coderef->($hashref)])
449
450 When C<$coderef> is specified, it will be called from C<decode> each
451 time it decodes a JSON object. The only argument is a reference to the
452 newly-created hash. If the code reference returns a single scalar (which
453 need not be a reference), this value (i.e. a copy of that scalar to avoid
454 aliasing) is inserted into the deserialised data structure. If it returns
455 an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
456 original deserialised hash will be inserted. This setting can slow down
457 decoding considerably.
458
459 When C<$coderef> is omitted or undefined, any existing callback will
460 be removed and C<decode> will not change the deserialised hash in any
461 way.
462
463 Example, convert all JSON objects into the integer 5:
464
465 my $js = JSON::XS->new->filter_json_object (sub { 5 });
466 # returns [5]
467 $js->decode ('[{}]');
468 # throws an exception because allow_nonref is not enabled,
469 # so a lone 5 is not allowed
470 $js->decode ('{"a":1, "b":2}');
471
472 =item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
473
474 Works remotely similar to C<filter_json_object>, but is only called for
475 JSON objects having a single key named C<$key>.
476
477 This C<$coderef> is called before the one specified via
478 C<filter_json_object>, if any. It gets passed the single value in the JSON
479 object. If it returns a single value, it will be inserted into the data
480 structure. If it returns nothing (not even C<undef> but the empty list),
481 the callback from C<filter_json_object> will be called next, as if no
482 single-key callback were specified.
483
484 If C<$coderef> is omitted or undefined, the corresponding callback will be
485 disabled. There can only ever be one callback for a given key.
486
487 As this callback gets called less often than the C<filter_json_object>
488 one, decoding speed will not usually suffer as much. Therefore, single-key
489 objects make excellent targets to serialise Perl objects into, especially
490 as single-key JSON objects are as close to the type-tagged value concept
491 as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
492 support this in any way, so you need to make sure your data never looks
493 like a serialised Perl hash.
494
495 Typical names for the single object key are C<__class_whatever__>, or
496 C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
497 things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
498 with real hashes.
499
500 Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
501 into the corresponding C<< $WIDGET{<id>} >> object:
502
503 # return whatever is in $WIDGET{5}:
504 JSON::XS
505 ->new
506 ->filter_json_single_key_object (__widget__ => sub {
507 $WIDGET{ $_[0] }
508 })
509 ->decode ('{"__widget__": 5}')
510
511 # this can be used with a TO_JSON method in some "widget" class
512 # for serialisation to json:
513 sub WidgetBase::TO_JSON {
514 my ($self) = @_;
515
516 unless ($self->{id}) {
517 $self->{id} = ..get..some..id..;
518 $WIDGET{$self->{id}} = $self;
519 }
520
521 { __widget__ => $self->{id} }
522 }
523
524 =item $json = $json->shrink ([$enable])
525
526 Perl usually over-allocates memory a bit when allocating space for
527 strings. This flag optionally resizes strings generated by either
528 C<encode> or C<decode> to their minimum size possible. This can save
529 memory when your JSON texts are either very very long or you have many
530 short strings. It will also try to downgrade any strings to octet-form
531 if possible: perl stores strings internally either in an encoding called
532 UTF-X or in octet-form. The latter cannot store everything but uses less
533 space in general (and some buggy Perl or C code might even rely on that
534 internal representation being used).
535
536 The actual definition of what shrink does might change in future versions,
537 but it will always try to save space at the expense of time.
538
539 If C<$enable> is true (or missing), the string returned by C<encode> will
540 be shrunk-to-fit, while all strings generated by C<decode> will also be
541 shrunk-to-fit.
542
543 If C<$enable> is false, then the normal perl allocation algorithms are used.
544 If you work with your data, then this is likely to be faster.
545
546 In the future, this setting might control other things, such as converting
547 strings that look like integers or floats into integers or floats
548 internally (there is no difference on the Perl level), saving space.
549
550 =item $json = $json->max_depth ([$maximum_nesting_depth])
551
552 Sets the maximum nesting level (default C<512>) accepted while encoding
553 or decoding. If the JSON text or Perl data structure has an equal or
554 higher nesting level than this limit, then the encoder and decoder will
555 stop and croak at that point.
556
557 Nesting level is defined by number of hash- or arrayrefs that the encoder
558 needs to traverse to reach a given point or the number of C<{> or C<[>
559 characters without their matching closing parenthesis crossed to reach a
560 given character in a string.
561
562 Setting the maximum depth to one disallows any nesting, so that ensures
563 that the object is only a single hash/object or array.
564
565 The argument to C<max_depth> will be rounded up to the next highest power
566 of two. If no argument is given, the highest possible setting will be
567 used, which is rarely useful.
568
569 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
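A sketch of the limit in action. The depth values are assumptions, chosen far enough from the limit that the "equal or higher" boundary and the rounding described above do not affect the result.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_depth (4);

# Nesting level 2: well below the limit, so this is accepted.
my $shallow = $json->decode ('[[1]]');

# Nesting level 64: far above the limit, so decode croaks.
my $deep_text = ("[" x 64) . ("]" x 64);
my $deep      = eval { $json->decode ($deep_text) };

print "deeply nested text rejected\n" unless defined $deep;
```

Wrapping C<decode> in C<eval> is one way to turn the exception into an ordinary error path when the input comes from an untrusted source.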
570
571 =item $json = $json->max_size ([$maximum_string_size])
572
573 Set the maximum length a JSON text may have (in bytes) where decoding is
574 being attempted. The default is C<0>, meaning no limit. When C<decode>
575 is called on a string longer than this number of characters it will not
576 attempt to decode the string but throw an exception. This setting has no
577 effect on C<encode> (yet).
578
579 The argument to C<max_size> will be rounded up to the next B<highest>
580 power of two (so may be more than requested). If no argument is given, the
581 limit check will be deactivated (same as when C<0> is specified).
582
583 See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
584
585 =item $json_text = $json->encode ($perl_scalar)
586
587 Converts the given Perl data structure (a simple scalar or a reference
588 to a hash or array) to its JSON representation. Simple scalars will be
589 converted into JSON string or number sequences, while references to arrays
590 become JSON arrays and references to hashes become JSON objects. Undefined
591 Perl values (e.g. C<undef>) become JSON C<null> values. C<true> and
592 C<false> are generated only from the special values listed under MAPPING, below.
593
594 =item $perl_scalar = $json->decode ($json_text)
595
596 The opposite of C<encode>: expects a JSON text and tries to parse it,
597 returning the resulting simple scalar or reference. Croaks on error.
598
599 JSON numbers and strings become simple Perl scalars. JSON arrays become
600 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
601 C<JSON::XS::true> (which acts like C<1>), C<false> becomes C<JSON::XS::false> (which acts like C<0>) and C<null> becomes C<undef>.
602
603 =item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
604
605 This works like the C<decode> method, but instead of raising an exception
606 when there is trailing garbage after the first JSON object, it will
607 silently stop parsing there and return the number of characters consumed
608 so far.
609
610 This is useful if your JSON texts are not delimited by an outer protocol
611 (which is not the brightest thing to do in the first place) and you need
612 to know where the JSON text ends.
613
614 JSON::XS->new->decode_prefix ("[1] the tail")
615 => ([1], 3)
616
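One possible use, sketched under the assumption that several JSON texts arrive concatenated in a single string: call C<decode_prefix> repeatedly and cut off what was consumed. The loop condition and the sample input are illustrative and not part of the module.

```perl
use strict;
use warnings;
use JSON::XS;

my $json   = JSON::XS->new;
my $stream = '[1,2]{"a":1} trailing';   # two JSON texts, then other data

my @values;

# Keep going while something that looks like an array or object starts here.
while ($stream =~ /^\s*[\[{]/) {
    my ($value, $consumed) = $json->decode_prefix ($stream);
    push @values, $value;

    # Remove the characters that decode_prefix reported as consumed.
    substr ($stream, 0, $consumed, "");
}

print scalar (@values), " JSON texts decoded\n";   # prints 2 JSON texts decoded
```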
617 =back
618
619
620 =head1 MAPPING
621
622 This section describes how JSON::XS maps Perl values to JSON values and
623 vice versa. These mappings are designed to "do the right thing" in most
624 circumstances automatically, preserving round-tripping characteristics
625 (what you put in comes out as something equivalent).
626
627 For the more enlightened: note that in the following descriptions,
628 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
629 refers to the abstract Perl language itself.
630
631
632 =head2 JSON -> PERL
633
634 =over 4
635
636 =item object
637
638 A JSON object becomes a reference to a hash in Perl. No ordering of object
639 keys is preserved (JSON does not preserve object key ordering itself).
640
641 =item array
642
643 A JSON array becomes a reference to an array in Perl.
644
645 =item string
646
647 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
648 are represented by the same codepoints in the Perl string, so no manual
649 decoding is necessary.
650
651 =item number
652
653 A JSON number becomes either an integer, numeric (floating point) or
654 string scalar in perl, depending on its range and any fractional parts. On
655 the Perl level, there is no difference between those as Perl handles all
656 the conversion details, but an integer may take slightly less memory and
657 might represent more values exactly than (floating point) numbers.
658
659 If the number consists of digits only, JSON::XS will try to represent
660 it as an integer value. If that fails, it will try to represent it as
661 a numeric (floating point) value if that is possible without loss of
662 precision. Otherwise it will preserve the number as a string value.
663
664 Numbers containing a fractional or exponential part will always be
665 represented as numeric (floating point) values, possibly at a loss of
666 precision.
667
668 This might create round-tripping problems as numbers might become strings,
669 but as Perl is typeless there is no other way to do it.
670
671 =item true, false
672
673 These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
674 respectively. They are overloaded to act almost exactly like the numbers
675 C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
676 the C<JSON::XS::is_bool> function.
677
678 =item null
679
680 A JSON null atom becomes C<undef> in Perl.
681
682 =back
683
684
685 =head2 PERL -> JSON
686
687 The mapping from Perl to JSON is slightly more difficult, as Perl is a
688 truly typeless language, so we can only guess which JSON type is meant by
689 a Perl value.
690
691 =over 4
692
693 =item hash references
694
695 Perl hash references become JSON objects. As there is no inherent ordering
696 in hash keys (or JSON objects), they will usually be encoded in a
697 pseudo-random order that can change between runs of the same program but
698 stays generally the same within a single run of a program. JSON::XS can
699 optionally sort the hash keys (determined by the I<canonical> flag), so
700 the same datastructure will serialise to the same JSON text (given same
701 settings and version of JSON::XS), but this incurs a runtime overhead
702 and is only rarely useful, e.g. when you want to compare some JSON text
703 against another for equality.
704
705 =item array references
706
707 Perl array references become JSON arrays.
708
709 =item other references
710
711 Other unblessed references are generally not allowed and will cause an
712 exception to be thrown, except for references to the integers C<0> and
713 C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
714 also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
715
716 to_json [\0,JSON::XS::true] # yields [false,true]
717
718 =item JSON::XS::true, JSON::XS::false
719
720 These special values become JSON true and JSON false values,
721 respectively. You can also use C<\1> and C<\0> directly if you want.
722
723 =item blessed objects
724
725 Blessed objects cannot be represented directly in JSON. By default
726 C<encode> throws an exception when it meets one; see the C<allow_blessed>
727 and C<convert_blessed> methods for the alternatives.
728
729 =item simple scalars
730
731 Simple Perl scalars (any scalar that is not a reference) are the most
732 difficult objects to encode: JSON::XS will encode undefined scalars as
733 JSON null values, scalars that have last been used in a string context
734 before encoding as JSON strings, and anything else as number values:
735
736 # dump as number
737 to_json [2] # yields [2]
738 to_json [-3.0e17] # yields [-3e+17]
739 my $value = 5; to_json [$value] # yields [5]
740
741 # used as string, so dump as string
742 print $value;
743 to_json [$value] # yields ["5"]
744
745 # undef becomes null
746 to_json [undef] # yields [null]
747
748 You can force the type to be a JSON string by stringifying it:
749
750 my $x = 3.1; # some variable containing a number
751 "$x"; # stringified
752 $x .= ""; # another, more awkward way to stringify
753 print $x; # perl does it for you, too, quite often
754
755 You can force the type to be a JSON number by numifying it:
756
757 my $x = "3"; # some variable containing a string
758 $x += 0; # numify it, ensuring it will be dumped as a number
759 $x *= 1; # same thing, the choice is yours.
760
761 You can not currently force the type in other, less obscure, ways. Tell me
762 if you need this capability.
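The effect of numifying can be checked with a short sketch like this one. It uses the object-oriented interface and an arbitrary variable name. Only the string-to-number direction is shown.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

my $x = "3";                            # created as a string
my $as_string = $json->encode ([$x]);   # encoded as a JSON string

$x += 0;                                # numify it
my $as_number = $json->encode ([$x]);   # now encoded as a JSON number

print "$as_string $as_number\n";        # prints ["3"] [3]
```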
763
764 =back
765
766
767 =head1 COMPARISON
768
769 As already mentioned, this module was created because none of the existing
770 JSON modules could be made to work correctly. First I will describe the
771 problems (or pleasures) I encountered with various existing JSON modules,
772 followed by some benchmark values. JSON::XS was designed not to suffer
773 from any of these problems or limitations.
774
775 =over 4
776
777 =item JSON 1.07
778
779 Slow (but very portable, as it is written in pure Perl).
780
781 Undocumented/buggy Unicode handling (how JSON handles Unicode values is
782 undocumented. One can get far by feeding it Unicode strings and doing
783 en-/decoding oneself, but Unicode escapes are not working properly).
784
785 No round-tripping (strings get clobbered if they look like numbers, e.g.
786 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
787 decode into the number 2).
788
789 =item JSON::PC 0.01
790
791 Very fast.
792
793 Undocumented/buggy Unicode handling.
794
795 No round-tripping.
796
797 Has problems handling many Perl values (e.g. regex results and other magic
798 values will make it croak).
799
800 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
801 which is not a valid JSON text).
802
803 Unmaintained (maintainer unresponsive for many months, bugs are not
804 getting fixed).
805
=item JSON::Syck 0.21

Very buggy (often crashes).

Very inflexible (no human-readable format supported, format pretty much
undocumented. I need at least a format for easy reading by humans and a
single-line compact format for use in a protocol, and preferably a way to
generate ASCII-only JSON texts).

Completely broken (and confusingly documented) Unicode handling (Unicode
escapes are not working properly, you need to set ImplicitUnicode to
I<different> values on en- and decoding to get symmetric behaviour).

No round-tripping (simple cases work, but this depends on whether the scalar
value was used in a numeric context or not).

Dumping hashes may skip hash values depending on iterator state.

Unmaintained (maintainer unresponsive for many months, bugs are not
getting fixed).

Does not check input for validity (i.e. will accept non-JSON input and
return "something" instead of raising an exception. This is a security
issue: imagine two banks transferring money between each other using
JSON. One bank might parse a given non-JSON request and deduct money,
while the other might reject the transaction with a syntax error. While a
good protocol will at least recover, that is extra unnecessary work and
the transaction will still not succeed).

=item JSON::DWIW 0.04

Very fast. Very natural. Very nice.

Undocumented Unicode handling (but the best of the pack. Unicode escapes
still don't get parsed properly).

Very inflexible.

No round-tripping.

Does not generate valid JSON texts (key strings are often unquoted, empty keys
result in nothing being output).

Does not check input for validity.

=back
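
To illustrate the input-validity point raised in the list above, here is a minimal sketch of how a caller can rely on JSON::XS raising an exception for non-JSON input. The request texts are invented for this example, and the C<eval> wrapper is merely one common way to trap the exception, not something prescribed by the module.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# hypothetical requests: only the first one is valid JSON
my @requests = (
   '{"to":"bank2","amount":100}',
   '{to:"bank2",amount:100}',       # unquoted keys
   '{"to":"bank2","amount":100,}',  # trailing comma
);

my @verdicts;

for my $text (@requests) {
   # decode croaks on anything that is not valid JSON
   if (eval { $coder->decode ($text); 1 }) {
      push @verdicts, "accepted";
   } else {
      push @verdicts, "rejected";
   }
   print "$verdicts[-1]: $text\n";
}
```

Both sides of a protocol then agree on which requests are malformed, instead of one side guessing at a meaning.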


=head2 JSON and YAML

You often hear that JSON is a subset (or a close subset) of YAML. This is,
however, a mass hysteria and very far from the truth. In general, there is
no way to configure JSON::XS to output a data structure as valid YAML.

If you really must use JSON::XS to generate YAML, you should use this
algorithm (subject to change in future versions):

   my $to_yaml = JSON::XS->new->utf8->space_after (1);
   my $yaml = $to_yaml->encode ($ref) . "\n";

This will usually generate JSON texts that also parse as valid
YAML. Please note that YAML has hardcoded limits on (simple) object key
lengths that JSON doesn't have, so you should make sure that your hash
keys are noticeably shorter than the 1024 characters YAML allows.

There might be other incompatibilities that I am not aware of. In general
you should not try to generate YAML with a JSON generator or vice versa,
or try to parse JSON with a YAML parser or vice versa: chances are high
that you will run into severe interoperability problems.


=head2 SPEED

It seems that JSON::XS is surprisingly fast, as shown in the following
tables. They have been generated with the help of the C<eg/bench> program
in the JSON::XS distribution, to make it easy to compare on your own
system.

First comes a comparison between various modules using a very short
single-line JSON string:

   {"method": "handleMessage", "params": ["user1", "we were just talking"], \
   "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}

It shows the number of encodes/decodes per second (JSON::XS uses
the functional interface, while JSON::XS/2 uses the OO interface
with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
shrink). Higher is better:

   -----------+------------+------------+
   module     |     encode |     decode |
   -----------|------------|------------|
   JSON       |   4990.842 |   4088.813 |
   JSON::DWIW |  51653.990 |  71575.154 |
   JSON::PC   |  65948.176 |  74631.744 |
   JSON::PP   |   8931.652 |   3817.168 |
   JSON::Syck |  24877.248 |  27776.848 |
   JSON::XS   | 388361.481 | 227951.304 |
   JSON::XS/2 | 227951.304 | 218453.333 |
   JSON::XS/3 | 338250.323 | 218453.333 |
   Storable   |  16500.016 | 135300.129 |
   -----------+------------+------------+

That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
about three times faster on decoding, and over forty times faster
than JSON, even with pretty-printing and key sorting. It also compares
favourably to Storable for small amounts of data.
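
If you just want a rough decode comparison on your own machine, something along the following lines should do. This is a sketch only and I<not> the C<eg/bench> program: it assumes the core C<Benchmark> module and an installed JSON::PP, it compares only two of the modules from the table, and the numbers it prints will differ from the ones above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;
use JSON::PP;

# the short test string from above, without the line continuation
my $text = '{"method": "handleMessage", "params": ["user1", "we were just talking"], '
         . '"id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}';

my $xs = JSON::XS->new->utf8;
my $pp = JSON::PP->new->utf8;

# run each decoder for at least one CPU second and print a comparison chart
my $rows = cmpthese (-1, {
   "JSON::XS" => sub { $xs->decode ($text) },
   "JSON::PP" => sub { $pp->decode ($text) },
});
```

C<cmpthese> prints decodes per second together with relative percentages, so higher is better here as well.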

Using a longer test string (roughly 18KB, generated from Yahoo! Locals
search API (http://nanoref.com/yahooapis/mgPdGg)):

   module     |     encode |     decode |
   -----------|------------|------------|
   JSON       |     55.260 |     34.971 |
   JSON::DWIW |    825.228 |   1082.513 |
   JSON::PC   |   3571.444 |   2394.829 |
   JSON::PP   |    210.987 |     32.574 |
   JSON::Syck |    552.551 |    787.544 |
   JSON::XS   |   5780.463 |   4854.519 |
   JSON::XS/2 |   3869.998 |   4798.975 |
   JSON::XS/3 |   5862.880 |   4798.975 |
   Storable   |   4445.002 |   5235.027 |
   -----------+------------+------------+

Again, JSON::XS leads by far (except for Storable, which unsurprisingly
decodes faster).

On large strings containing lots of high Unicode characters, some modules
(such as JSON::PC) seem to decode faster than JSON::XS, but the result
will be broken due to missing (or wrong) Unicode handling. Others refuse
to decode or encode properly, so it was impossible to prepare a fair
comparison table for that case.


=head1 SECURITY CONSIDERATIONS

When you are using JSON in a protocol, talking to untrusted, potentially
hostile creatures requires relatively few measures.

First of all, your JSON decoder should be secure, that is, should not have
any buffer overflows. Obviously, this module should ensure that and I am
trying hard to make that true, but you never know.

Second, you need to avoid resource-starving attacks. That means you should
limit the size of JSON texts you accept, or make sure that when your
resources run out, that's just fine (e.g. by using a separate process that
can crash safely). The size of a JSON text in octets or characters is
usually a good indication of the size of the resources required to decode
it into a Perl structure. While JSON::XS can check the size of the JSON
text, it might be too late when you already have it in memory, so you
might want to check the size before you accept the string.
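
A minimal sketch of such a size check follows. The limit and the C<decode_request> helper are hypothetical and exist only for this example. It also assumes that your version of JSON::XS provides the C<max_size> method, which serves as a second line of defence here.

```perl
use strict;
use warnings;
use JSON::XS;

my $MAX_OCTETS = 65536;   # hypothetical limit, pick one that fits your protocol

# second line of defence: let the decoder refuse oversized texts as well
my $coder = JSON::XS->new->utf8->max_size ($MAX_OCTETS);

# hypothetical helper: returns the decoded structure, or nothing on rejection
sub decode_request {
   my ($octets) = @_;

   # first line of defence: check the size before doing any parsing
   return if length $octets > $MAX_OCTETS;

   # decode croaks on invalid or oversized input, so trap the exception
   my $data = eval { $coder->decode ($octets) };
   return $data;
}

my $small = decode_request ('{"id":1}');
my $big   = decode_request ("[" . ("1," x 100_000) . "1]");

print defined $small ? "small request accepted\n" : "small request rejected\n";
print defined $big   ? "big request accepted\n"   : "big request rejected\n";
```

In a real server the better place for the first check is before the text is read into memory at all, for example against a length announced by the protocol.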

Third, JSON::XS recurses using the C stack when decoding objects and
arrays. The C stack is a limited resource: for instance, on my amd64
machine with 8MB of stack size I can decode around 180k nested arrays but
only 14k nested JSON objects (due to perl itself recursing deeply on croak
to free the temporary). If that is exceeded, the program crashes. To be
conservative, the default nesting limit is set to 512. If your process
has a smaller stack, you should adjust this setting accordingly with the
C<max_depth> method.
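
As a sketch of how a lowered nesting limit behaves, the following sets C<max_depth> to an arbitrarily chosen 32 and feeds the decoder one shallow and one deeply nested text. The test strings and variable names are invented for this example.

```perl
use strict;
use warnings;
use JSON::XS;

# 32 is an arbitrary example value, not a recommendation
my $coder = JSON::XS->new->max_depth (32);

my $shallow = ("[" x 10)  . ("]" x 10);    # 10 nested arrays
my $deep    = ("[" x 100) . ("]" x 100);   # 100 nested arrays

# decode croaks when the nesting limit is exceeded, so trap the exception
my $shallow_ok = eval { $coder->decode ($shallow); 1 } ? 1 : 0;
my $deep_ok    = eval { $coder->decode ($deep);    1 } ? 1 : 0;

print "10 levels:  ", ($shallow_ok ? "decoded" : "rejected"), "\n";
print "100 levels: ", ($deep_ok    ? "decoded" : "rejected"), "\n";
```

The rejection happens as an ordinary Perl exception, well before the C stack is in any danger.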

And last but least, something else could bomb you that I forgot to think
of. In that case, you get to keep the pieces. I am always open for hints,
though...

If you are using JSON::XS to return packets for consumption
by JavaScript scripts in a browser you should have a look at
L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
you are vulnerable to some common attack vectors (which really are browser
design bugs, but it is still you who will have to deal with it, as major
browser developers care only for features, not about doing security
right).


=head1 THREADS

This module is I<not> guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's I<much> faster, cheaper, better).

(It might actually work, but you have been warned).


=head1 BUGS

While the goal of this module is to be correct, that unfortunately does
not mean it's bug-free, only that I think its design is bug-free. It is
still relatively early in its development. If you keep reporting bugs they
will be fixed swiftly, though.

Please refrain from using rt.cpan.org or any other bug reporting
service. I put the contact address into my modules for a reason.

=cut

# singleton objects representing the JSON values true and false
our $true  = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };

sub true()  { $true  }
sub false() { $false }

sub is_bool($) {
   UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
#      or UNIVERSAL::isa $_[0], "JSON::Literal"
}

XSLoader::load "JSON::XS", $VERSION;

package JSON::XS::Boolean;

# booleans numify to the underlying 0 or 1; ++ and -- replace the
# operand with a plain number instead of modifying the shared singleton
use overload
   "0+"     => sub { ${$_[0]} },
   "++"     => sub { $_[0] = ${$_[0]} + 1 },
   "--"     => sub { $_[0] = ${$_[0]} - 1 },
   fallback => 1;

1;

=head1 AUTHOR

 Marc Lehmann <schmorp@schmorp.de>
 http://home.schmorp.de/

=cut
