/cvs/JSON-XS/XS.pm
Revision: 1.22
Committed: Sun Mar 25 02:37:00 2007 UTC (17 years, 1 month ago) by root
Branch: MAIN
Changes since 1.21: +6 -4 lines

File Contents

# Content
1 =head1 NAME
2
3 JSON::XS - JSON serialising/deserialising, done correctly and fast
4
5 =head1 SYNOPSIS
6
7 use JSON::XS;
8
9 # exported functions, they croak on error
10 # and expect/generate UTF-8
11
12 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
13 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
14
15 # objToJson and jsonToObj aliases to to_json and from_json
16 # are exported for compatibility to the JSON module,
17 # but should not be used in new code.
18
19 # OO-interface
20
21 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
22 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
23 $perl_scalar = $coder->decode ($unicode_json_text);
24
25 =head1 DESCRIPTION
26
27 This module converts Perl data structures to JSON and vice versa. Its
28 primary goal is to be I<correct> and its secondary goal is to be
29 I<fast>. To reach the latter goal it was written in C.
30
31 As this is the n-th-something JSON module on CPAN, what was the reason
32 to write yet another JSON module? While it seems there are many JSON
33 modules, none of them correctly handle all corner cases, and in most cases
34 their maintainers are unresponsive, gone missing, or not listening to bug
35 reports for other reasons.
36
37 See COMPARISON, below, for a comparison to some other JSON modules.
38
39 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
40 vice versa.
41
42 =head2 FEATURES
43
44 =over 4
45
46 =item * correct unicode handling
47
48 This module knows how to handle Unicode, and even documents how and when
49 it does so.
50
51 =item * round-trip integrity
52
53 When you serialise a perl data structure using only datatypes supported
54 by JSON, the deserialised data structure is identical on the Perl level.
55 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
56 like a number).
57
58 =item * strict checking of JSON correctness
59
60 There is no guessing, no generating of illegal JSON texts by default,
61 and only JSON is accepted as input by default (the latter is a security
62 feature).
63
64 =item * fast
65
66 Compared to other JSON modules, this module compares favourably in terms
67 of speed, too.
68
69 =item * simple to use
70
71 This module has both a simple functional interface as well as an OO
72 interface.
73
74 =item * reasonably versatile output formats
75
76 You can choose between the most compact guaranteed single-line format
77 possible (nice for simple line-based protocols), a pure-ascii format
78 (for when your transport is not 8-bit clean, still supports the whole
79 unicode range), or a pretty-printed format (for when you want to read that
80 stuff). Or you can combine those features in whatever way you like.
81
82 =back
83
84 =cut
85
86 package JSON::XS;
87
88 use strict;
89
90 BEGIN {
91 our $VERSION = '0.8';
92 our @ISA = qw(Exporter);
93
94 our @EXPORT = qw(to_json from_json objToJson jsonToObj);
95 require Exporter;
96
97 require XSLoader;
98 XSLoader::load JSON::XS::, $VERSION;
99 }
100
101 =head1 FUNCTIONAL INTERFACE
102
103 The following convenience functions are provided by this module. They are
104 exported by default:
105
106 =over 4
107
108 =item $json_text = to_json $perl_scalar
109
110 Converts the given Perl data structure (a simple scalar or a reference to
111 a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
112 octets only). Croaks on error.
113
114 This function call is functionally identical to:
115
116 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
117
118 except being faster.
119
120 =item $perl_scalar = from_json $json_text
121
122 The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
123 parse that as a UTF-8 encoded JSON text, returning the resulting simple
124 scalar or reference. Croaks on error.
125
126 This function call is functionally identical to:
127
128 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
129
130 except being faster.
131
132 =back
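
The equivalence stated above can be sketched as follows. This is only a
minimal illustration: it uses the object-oriented form that C<to_json> and
C<from_json> are documented to match, and the variable names and sample
data are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# OO equivalent of to_json/from_json, as given in the text above.
my $coder = JSON::XS->new->utf8;

# Illustrative sample data (not from the module itself).
my $data = { name => "example", list => [1, 2, 3] };

my $json_text = $coder->encode ($data);      # UTF-8 encoded octets
my $copy      = $coder->decode ($json_text); # back to a Perl hashref

print "name: $copy->{name}\n";
print "items: ", scalar @{ $copy->{list} }, "\n";
```

Both calls croak on error, so wrap them in C<eval> if the input is not
trusted.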
133
134 =head1 OBJECT-ORIENTED INTERFACE
135
136 The object oriented interface lets you configure your own encoding or
137 decoding style, within the limits of supported formats.
138
139 =over 4
140
141 =item $json = new JSON::XS
142
143 Creates a new JSON::XS object that can be used to de/encode JSON
144 strings. All boolean flags described below are by default I<disabled>.
145
146 The mutators for flags all return the JSON object again and thus calls can
147 be chained:
148
149 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
150 => {"a": [1, 2]}
151
152 =item $json = $json->ascii ([$enable])
153
154 If C<$enable> is true (or missing), then the C<encode> method will not
155 generate characters outside the code range C<0..127> (which is ASCII). Any
156 unicode characters outside that range will be escaped using either a
157 single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
158 as per RFC4627.
159
160 If C<$enable> is false, then the C<encode> method will not escape Unicode
161 characters unless required by the JSON syntax. This results in a faster
162 and more compact format.
163
164 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
165 => ["\ud801\udc01"]
166
167 =item $json = $json->utf8 ([$enable])
168
169 If C<$enable> is true (or missing), then the C<encode> method will encode
170 the JSON result into UTF-8, as required by many protocols, while the
171 C<decode> method expects to be handed a UTF-8-encoded string. Please
172 note that UTF-8-encoded strings do not contain any characters outside the
173 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
174 versions, enabling this option might enable autodetection of the UTF-16
175 and UTF-32 encoding families, as described in RFC4627.
176
177 If C<$enable> is false, then the C<encode> method will return the JSON
178 string as a (non-encoded) unicode string, while C<decode> thus expects a
179 unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
180 to be done yourself, e.g. using the Encode module.
181
182 Example, output UTF-16BE-encoded JSON:
183
184 use Encode;
185 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
186
187 Example, decode UTF-32LE-encoded JSON:
188
189 use Encode;
190 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
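
To make the difference between the C<ascii> and C<utf8> flags more
concrete, here is a small sketch that encodes the same one-character
string in three ways. The sample character (U+00E9) and the variable names
are chosen for illustration only.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = [chr 0xe9];   # a single character, U+00E9

my $unicode = JSON::XS->new->encode ($data);        # unencoded character string
my $octets  = JSON::XS->new->utf8->encode ($data);  # UTF-8 encoded octets
my $ascii   = JSON::XS->new->ascii->encode ($data); # only characters 0..127

printf "unicode: %d characters\n", length $unicode; # 5
printf "utf8:    %d octets\n",     length $octets;  # 6
print  "ascii:   $ascii\n";                         # ["\u00e9"]
```

The UTF-8 form is one octet longer because U+00E9 takes two octets in
UTF-8, while the C<ascii> form replaces the character by an escape
sequence.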
191
192 =item $json = $json->pretty ([$enable])
193
194 This enables (or disables) all of the C<indent>, C<space_before> and
195 C<space_after> (and in the future possibly more) flags in one call to
196 generate the most readable (or most compact) form possible.
197
198 Example, pretty-print some simple structure:
199
200 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
201 =>
202 {
203    "a" : [
204       1,
205       2
206    ]
207 }
208
209 =item $json = $json->indent ([$enable])
210
211 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
212 format as output, putting every array member or object/hash key-value pair
213 into its own line, indenting them properly.
214
215 If C<$enable> is false, no newlines or indenting will be produced, and the
216 resulting JSON text is guaranteed not to contain any newlines.
217
218 This setting has no effect when decoding JSON texts.
219
220 =item $json = $json->space_before ([$enable])
221
222 If C<$enable> is true (or missing), then the C<encode> method will add an extra
223 optional space before the C<:> separating keys from values in JSON objects.
224
225 If C<$enable> is false, then the C<encode> method will not add any extra
226 space at those places.
227
228 This setting has no effect when decoding JSON texts. You will also
229 most likely combine this setting with C<space_after>.
230
231 Example, space_before enabled, space_after and indent disabled:
232
233 {"key" :"value"}
234
235 =item $json = $json->space_after ([$enable])
236
237 If C<$enable> is true (or missing), then the C<encode> method will add an extra
238 optional space after the C<:> separating keys from values in JSON objects
239 and extra whitespace after the C<,> separating key-value pairs and array
240 members.
241
242 If C<$enable> is false, then the C<encode> method will not add any extra
243 space at those places.
244
245 This setting has no effect when decoding JSON texts.
246
247 Example, space_before and indent disabled, space_after enabled:
248
249 {"key": "value"}
250
251 =item $json = $json->canonical ([$enable])
252
253 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
254 by sorting their keys. This adds a comparatively high overhead.
255
256 If C<$enable> is false, then the C<encode> method will output key-value
257 pairs in the order Perl stores them (which will likely change between runs
258 of the same script).
259
260 This option is useful if you want the same data structure to be encoded as
261 the same JSON text (given the same overall settings). If it is disabled,
262 the same hash might be encoded differently even if it contains the same data,
263 as key-value pairs have no inherent ordering in Perl.
264
265 This setting has no effect when decoding JSON texts.
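
A short sketch of the effect of C<canonical>, using a made-up hash: with
the flag enabled the keys come out sorted, so repeated encodings of the
same data give the same text.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { b => 1, a => 2, c => 3 };   # illustrative data

my $coder  = JSON::XS->new->canonical;
my $first  = $coder->encode ($data);
my $second = $coder->encode ($data);

print "$first\n";                        # {"a":2,"b":1,"c":3}
print "stable\n" if $first eq $second;
```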
266
267 =item $json = $json->allow_nonref ([$enable])
268
269 If C<$enable> is true (or missing), then the C<encode> method can convert a
270 non-reference into its corresponding string, number or null JSON value,
271 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
272 values instead of croaking.
273
274 If C<$enable> is false, then the C<encode> method will croak if it isn't
275 passed an arrayref or hashref, as JSON texts must either be an object
276 or array. Likewise, C<decode> will croak if given something that is not a
277 JSON object or array.
278
279 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
280 resulting in an invalid JSON text:
281
282 JSON::XS->new->allow_nonref->encode ("Hello, World!")
283 => "Hello, World!"
284
285 =item $json = $json->shrink ([$enable])
286
287 Perl usually over-allocates memory a bit when allocating space for
288 strings. This flag optionally resizes strings generated by either
289 C<encode> or C<decode> to their minimum size possible. This can save
290 memory when your JSON texts are either very very long or you have many
291 short strings. It will also try to downgrade any strings to octet-form
292 if possible: perl stores strings internally either in an encoding called
293 UTF-X or in octet-form. The latter cannot store everything but uses less
294 space in general.
295
296 If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
297 while all strings generated by C<decode> will also be shrunk-to-fit.
298
299 If C<$enable> is false, then the normal perl allocation algorithms are used.
300 If you work with your data, then this is likely to be faster.
301
302 In the future, this setting might control other things, such as converting
303 strings that look like integers or floats into integers or floats
304 internally (there is no difference on the Perl level), saving space.
305
306 =item $json_text = $json->encode ($perl_scalar)
307
308 Converts the given Perl data structure (a simple scalar or a reference
309 to a hash or array) to its JSON representation. Simple scalars will be
310 converted into JSON string or number sequences, while references to arrays
311 become JSON arrays and references to hashes become JSON objects. Undefined
312 Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
313 nor C<false> values will be generated.
314
315 =item $perl_scalar = $json->decode ($json_text)
316
317 The opposite of C<encode>: expects a JSON text and tries to parse it,
318 returning the resulting simple scalar or reference. Croaks on error.
319
320 JSON numbers and strings become simple Perl scalars. JSON arrays become
321 Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
322 C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
323
324 =back
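
Since both C<encode> and C<decode> croak on error, callers that handle
untrusted input will usually want to trap the exception. The following is
one possible sketch using a plain C<eval> block; the input strings are
invented for the example.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new;

# Well-formed input: a JSON object holding an array.
my $ok = eval { $coder->decode ('{"a":[1,2,null]}') };
print "decoded ", scalar @{ $ok->{a} }, " elements\n" if $ok;

# Malformed input: decode croaks, so eval returns undef and $@ is set.
my $bad = eval { $coder->decode ('{"a":[1,2') };
print "decode failed: $@" unless defined $bad;
```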
325
326 =head1 MAPPING
327
328 This section describes how JSON::XS maps Perl values to JSON values and
329 vice versa. These mappings are designed to "do the right thing" in most
330 circumstances automatically, preserving round-tripping characteristics
331 (what you put in comes out as something equivalent).
332
333 For the more enlightened: note that in the following descriptions,
334 lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
335 refers to the abstract Perl language itself.
336
337 =head2 JSON -> PERL
338
339 =over 4
340
341 =item object
342
343 A JSON object becomes a reference to a hash in Perl. No ordering of object
344 keys is preserved (JSON does not preserve object key ordering itself).
345
346 =item array
347
348 A JSON array becomes a reference to an array in Perl.
349
350 =item string
351
352 A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
353 are represented by the same codepoints in the Perl string, so no manual
354 decoding is necessary.
355
356 =item number
357
358 A JSON number becomes either an integer or numeric (floating point)
359 scalar in perl, depending on its range and any fractional parts. On the
360 Perl level, there is no difference between those as Perl handles all the
361 conversion details, but an integer may take slightly less memory and might
362 represent more values exactly than (floating point) numbers.
363
364 =item true, false
365
366 These JSON atoms become C<1> and C<0>, respectively. Information is lost in
367 this process. Future versions might represent those values differently,
368 but they will be guaranteed to act like these integers would normally in
369 Perl.
370
371 =item null
372
373 A JSON null atom becomes C<undef> in Perl.
374
375 =back
376
377 =head2 PERL -> JSON
378
379 The mapping from Perl to JSON is slightly more difficult, as Perl is a
380 truly typeless language, so we can only guess which JSON type is meant by
381 a Perl value.
382
383 =over 4
384
385 =item hash references
386
387 Perl hash references become JSON objects. As there is no inherent ordering
388 in hash keys, they will usually be encoded in a pseudo-random order that
389 can change between runs of the same program but stays generally the same
390 within a single run of a program. JSON::XS can optionally sort the hash
391 keys (determined by the I<canonical> flag), so the same datastructure
392 will serialise to the same JSON text (given same settings and version of
393 JSON::XS), but this incurs a runtime overhead.
394
395 =item array references
396
397 Perl array references become JSON arrays.
398
399 =item blessed objects
400
401 Blessed objects are not allowed. JSON::XS currently tries to encode their
402 underlying representation (hash- or arrayref), but this behaviour might
403 change in future versions.
404
405 =item simple scalars
406
407 Simple Perl scalars (any scalar that is not a reference) are the most
408 difficult objects to encode: JSON::XS will encode undefined scalars as
409 JSON null value, scalars that have last been used in a string context
410 before encoding as JSON strings and anything else as number value:
411
412 # dump as number
413 to_json [2] # yields [2]
414 to_json [-3.0e17] # yields [-3e+17]
415 my $value = 5; to_json [$value] # yields [5]
416
417 # used as string, so dump as string
418 print $value;
419 to_json [$value] # yields ["5"]
420
421 # undef becomes null
422 to_json [undef] # yields [null]
423
424 You can force the type to be a string by stringifying it:
425
426 my $x = 3.1; # some variable containing a number
427 "$x"; # stringified
428 $x .= ""; # another, more awkward way to stringify
429 print $x; # perl does it for you, too, quite often
430
431 You can force the type to be a number by numifying it:
432
433 my $x = "3"; # some variable containing a string
434 $x += 0; # numify it, ensuring it will be dumped as a number
435 $x *= 1; # same thing, the choice is yours.
436
437 You cannot currently output JSON booleans or force the type in other,
438 less obscure, ways. Tell me if you need this capability.
439
440 =item circular data structures
441
442 Those will be encoded until memory or stackspace runs out.
443
444 =back
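
The rules for simple scalars can be tried out with a sketch like the one
below. It uses the OO interface rather than C<to_json>, and the variable
names are illustrative only.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new;

my $n = "3";  $n += 0;    # numified, so it is dumped as a number
my $s = 3.1;  $s .= "";   # stringified, so it is dumped as a string

my $json = $coder->encode ([$n, $s, undef]);
print "$json\n";          # [3,"3.1",null]
```

Note that merely printing or interpolating a numeric variable before
encoding it can be enough to turn it into a string, as described above.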
445
446 =head1 COMPARISON
447
448 As already mentioned, this module was created because none of the existing
449 JSON modules could be made to work correctly. First I will describe the
450 problems (or pleasures) I encountered with various existing JSON modules,
451 followed by some benchmark values. JSON::XS was designed not to suffer
452 from any of these problems or limitations.
453
454 =over 4
455
456 =item JSON 1.07
457
458 Slow (but very portable, as it is written in pure Perl).
459
460 Undocumented/buggy Unicode handling (how JSON handles unicode values is
461 undocumented. One can get far by feeding it unicode strings and doing
462 en-/decoding oneself, but unicode escapes are not working properly).
463
464 No roundtripping (strings get clobbered if they look like numbers, e.g.
465 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
466 decode into the number 2).
467
468 =item JSON::PC 0.01
469
470 Very fast.
471
472 Undocumented/buggy Unicode handling.
473
474 No roundtripping.
475
476 Has problems handling many Perl values (e.g. regex results and other magic
477 values will make it croak).
478
479 Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
480 which is not a valid JSON text).
481
482 Unmaintained (maintainer unresponsive for many months, bugs are not
483 getting fixed).
484
485 =item JSON::Syck 0.21
486
487 Very buggy (often crashes).
488
489 Very inflexible (no human-readable format supported, format pretty much
490 undocumented. I need at least a format for easy reading by humans and a
491 single-line compact format for use in a protocol, and preferably a way to
492 generate ASCII-only JSON texts).
493
494 Completely broken (and confusingly documented) Unicode handling (unicode
495 escapes are not working properly, you need to set ImplicitUnicode to
496 I<different> values on en- and decoding to get symmetric behaviour).
497
498 No roundtripping (simple cases work, but this depends on whether the scalar
499 value was used in a numeric context or not).
500
501 Dumping hashes may skip hash values depending on iterator state.
502
503 Unmaintained (maintainer unresponsive for many months, bugs are not
504 getting fixed).
505
506 Does not check input for validity (i.e. will accept non-JSON input and
507 return "something" instead of raising an exception. This is a security
508 issue: imagine two banks transferring money between each other using
509 JSON. One bank might parse a given non-JSON request and deduct money,
510 while the other might reject the transaction with a syntax error. While a
511 good protocol will at least recover, that is extra unnecessary work and
512 the transaction will still not succeed).
513
514 =item JSON::DWIW 0.04
515
516 Very fast. Very natural. Very nice.
517
518 Undocumented unicode handling (but the best of the pack. Unicode escapes
519 still don't get parsed properly).
520
521 Very inflexible.
522
523 No roundtripping.
524
525 Does not generate valid JSON texts (key strings are often unquoted, empty keys
526 result in nothing being output).
527
528 Does not check input for validity.
529
530 =back
531
532 =head2 SPEED
533
534 It seems that JSON::XS is surprisingly fast, as shown in the following
535 tables. They have been generated with the help of the C<eg/bench> program
536 in the JSON::XS distribution, to make it easy to compare on your own
537 system.
538
539 First comes a comparison between various modules using a very short JSON
540 string:
541
542 {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
543
544 It shows the number of encodes/decodes per second (JSON::XS uses the
545 functional interface, while JSON::XS/2 uses the OO interface with
546 pretty-printing and hashkey sorting enabled). Higher is better:
547
548 module     |     encode |     decode |
549 -----------|------------|------------|
550 JSON       |  11488.516 |   7823.035 |
551 JSON::DWIW |  94708.054 | 129094.260 |
552 JSON::PC   |  63884.157 | 128528.212 |
553 JSON::Syck |  34898.677 |  42096.911 |
554 JSON::XS   | 654027.064 | 396423.669 |
555 JSON::XS/2 | 371564.190 | 371725.613 |
556 -----------+------------+------------+
557
558 That is, JSON::XS is more than six times faster than JSON::DWIW on
559 encoding, more than three times faster on decoding, and about thirty times
560 faster than JSON, even with pretty-printing and key sorting.
561
562 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
563 search API (http://nanoref.com/yahooapis/mgPdGg)):
564
565 module     |     encode |     decode |
566 -----------|------------|------------|
567 JSON       |    273.023 |     44.674 |
568 JSON::DWIW |   1089.383 |   1145.704 |
569 JSON::PC   |   3097.419 |   2393.921 |
570 JSON::Syck |    514.060 |    843.053 |
571 JSON::XS   |   6479.668 |   3636.364 |
572 JSON::XS/2 |   3774.221 |   3599.124 |
573 -----------+------------+------------+
574
575 Again, JSON::XS leads by far.
576
577 On large strings containing lots of high unicode characters, some modules
578 (such as JSON::PC) seem to decode faster than JSON::XS, but the result
579 will be broken due to missing (or wrong) unicode handling. Others refuse
580 to decode or encode properly, so it was impossible to prepare a fair
581 comparison table for that case.
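
For a quick measurement on your own machine without the C<eg/bench>
program, something along the following lines might serve as a starting
point. It is not the benchmark used for the tables above, only a sketch
built on the core C<Benchmark> module. It times the OO interface on the
short test string and compares no other modules.

```perl
use strict;
use warnings;
use JSON::XS;
use Benchmark qw(cmpthese);

my $coder = JSON::XS->new->utf8;

# The short test string quoted earlier in this section.
my $text = '{"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}';
my $data = $coder->decode ($text);

# Run each sub for at least one CPU second and print a rate comparison.
cmpthese (-1, {
   encode => sub { $coder->encode ($data) },
   decode => sub { $coder->decode ($text) },
});
```

The absolute numbers depend heavily on hardware and perl version, so they
are only comparable to results obtained on the same system.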
582
583 =head1 RESOURCE LIMITS
584
585 JSON::XS does not impose any limits on the size of JSON texts or Perl
586 values they represent - if your machine can handle it, JSON::XS will
587 encode or decode it. Future versions might optionally impose structure
588 depth and memory use resource limits.
589
590 =head1 BUGS
591
592 While the goal of this module is to be correct, that unfortunately does
593 not mean it's bug-free, only that I think its design is bug-free. It is
594 still very young and not well-tested. If you keep reporting bugs they will
595 be fixed swiftly, though.
596
597 =cut
598
599 1;
600
601 =head1 AUTHOR
602
603 Marc Lehmann <schmorp@schmorp.de>
604 http://home.schmorp.de/
605
606 =cut
607