
Comparing JSON-XS/README (file contents):
Revision 1.7 by root, Sun Mar 25 00:47:42 2007 UTC vs.
Revision 1.14 by root, Sat Jun 23 23:50:03 2007 UTC

2 JSON::XS - JSON serialising/deserialising, done correctly and fast 2 JSON::XS - JSON serialising/deserialising, done correctly and fast
3 3
4SYNOPSIS 4SYNOPSIS
5 use JSON::XS; 5 use JSON::XS;
6 6
7 # exported functions, croak on error 7 # exported functions, they croak on error
8 # and expect/generate UTF-8
8 9
9 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref; 10 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
10 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text; 11 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
11 12
13 # objToJson and jsonToObj aliases to to_json and from_json
14 # are exported for compatibility to the JSON module,
15 # but should not be used in new code.
16
12 # oo-interface 17 # OO-interface
13 18
14 $coder = JSON::XS->new->ascii->pretty->allow_nonref; 19 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
15 $pretty_printed_unencoded = $coder->encode ($perl_scalar); 20 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
16 $perl_scalar = $coder->decode ($unicode_json_text); 21 $perl_scalar = $coder->decode ($unicode_json_text);
17 22
30 35
31 See MAPPING, below, on how JSON::XS maps perl values to JSON values and 36 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
32 vice versa. 37 vice versa.
33 38
34 FEATURES 39 FEATURES
35 * correct handling of unicode issues 40 * correct unicode handling
36 This module knows how to handle Unicode, and even documents how and 41 This module knows how to handle Unicode, and even documents how and
37 when it does so. 42 when it does so.
38 43
39 * round-trip integrity 44 * round-trip integrity
40 When you serialise a perl data structure using only datatypes 45 When you serialise a perl data structure using only datatypes
41 supported by JSON, the deserialised data structure is identical on 46 supported by JSON, the deserialised data structure is identical on
42 the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"). 47 the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"
48 just because it looks like a number).
43 49
44 * strict checking of JSON correctness 50 * strict checking of JSON correctness
45 There is no guessing, no generating of illegal JSON texts by 51 There is no guessing, no generating of illegal JSON texts by
46 default, and only JSON is accepted as input by default (the latter 52 default, and only JSON is accepted as input by default (the latter
47 is a security feature). 53 is a security feature).
55 interface. 61 interface.
56 62
57 * reasonably versatile output formats 63 * reasonably versatile output formats
58 You can choose between the most compact guaranteed single-line 64 You can choose between the most compact guaranteed single-line
59 format possible (nice for simple line-based protocols), a pure-ascii 65 format possible (nice for simple line-based protocols), a pure-ascii
60 format (for when your transport is not 8-bit clean), or a 66 format (for when your transport is not 8-bit clean, still supports
61 pretty-printed format (for when you want to read that stuff). Or you 67 the whole unicode range), or a pretty-printed format (for when you
62 can combine those features in whatever way you like. 68 want to read that stuff). Or you can combine those features in
69 whatever way you like.
63 70
64FUNCTIONAL INTERFACE 71FUNCTIONAL INTERFACE
65 The following convenience methods are provided by this module. They are 72 The following convenience methods are provided by this module. They are
66 exported by default: 73 exported by default:
67 74
84 This function call is functionally identical to: 91 This function call is functionally identical to:
85 92
86 $perl_scalar = JSON::XS->new->utf8->decode ($json_text) 93 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
87 94
88 except being faster. 95 except being faster.
96
97 $is_boolean = JSON::XS::is_bool $scalar
98 Returns true if the passed scalar represents either JSON::XS::true
99 or JSON::XS::false, two constants that act like 1 and 0,
100 respectively and are used to represent JSON "true" and "false"
101 values in Perl.
102
103 See MAPPING, below, for more information on how JSON values are
104 mapped to Perl.
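
As a rough illustration of the "is_bool" description above, the sketch below decodes a small array and checks each element. It goes through the OO interface so that it does not depend on which convenience functions a given JSON::XS release exports; the variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# Decode a JSON array holding two booleans, a number and a string.
my $values = JSON::XS->new->decode ('[true, false, 1, "true"]');

# is_bool is true only for the special boolean values, not for
# ordinary scalars that merely look like booleans.
my @is_bool = map { JSON::XS::is_bool ($_) ? 1 : 0 } @$values;

print "@is_bool\n";
```

Only the first two elements should be reported as booleans; the number 1 and the string "true" are plain scalars.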
89 105
90OBJECT-ORIENTED INTERFACE 106OBJECT-ORIENTED INTERFACE
91 The object oriented interface lets you configure your own encoding or 107 The object oriented interface lets you configure your own encoding or
92 decoding style, within the limits of supported formats. 108 decoding style, within the limits of supported formats.
93 109
105 $json = $json->ascii ([$enable]) 121 $json = $json->ascii ([$enable])
106 If $enable is true (or missing), then the "encode" method will not 122 If $enable is true (or missing), then the "encode" method will not
107 generate characters outside the code range 0..127 (which is ASCII). 123 generate characters outside the code range 0..127 (which is ASCII).
108 Any unicode characters outside that range will be escaped using 124 Any unicode characters outside that range will be escaped using
109 either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLL 125 either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLL
110 escape sequence, as per RFC4627. 126 escape sequence, as per RFC4627. The resulting encoded JSON text can
127 be treated as a native unicode string, an ascii-encoded,
128 latin1-encoded or UTF-8 encoded string, or any other superset of
129 ASCII.
111 130
112 If $enable is false, then the "encode" method will not escape 131 If $enable is false, then the "encode" method will not escape
113 Unicode characters unless required by the JSON syntax. This results 132 Unicode characters unless required by the JSON syntax or other
114 in a faster and more compact format. 133 flags. This results in a faster and more compact format.
134
135 The main use for this flag is to produce JSON texts that can be
136 transmitted over a 7-bit channel, as the encoded JSON texts will not
137 contain any 8 bit characters.
115 138
116 JSON::XS->new->ascii (1)->encode ([chr 0x10401]) 139 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
117 => ["\ud801\udc01"] 140 => ["\ud801\udc01"]
141
142 $json = $json->latin1 ([$enable])
143 If $enable is true (or missing), then the "encode" method will
144 encode the resulting JSON text as latin1 (or iso-8859-1), escaping
145 any characters outside the code range 0..255. The resulting string
146 can be treated as a latin1-encoded JSON text or a native unicode
147 string. The "decode" method will not be affected in any way by this
148 flag, as "decode" by default expects unicode, which is a strict
149 superset of latin1.
150
151 If $enable is false, then the "encode" method will not escape
152 Unicode characters unless required by the JSON syntax or other
153 flags.
154
155 The main use for this flag is efficiently encoding binary data as
156 JSON text, as most octets will not be escaped, resulting in a
157 smaller encoded size. The disadvantage is that the resulting JSON
158 text is encoded in latin1 (and must correctly be treated as such
159 when storing and transferring), a rare encoding for JSON. It is
160 therefore most useful when you want to store data structures known
161 to contain binary data efficiently in files or databases, not when
162 talking to other JSON encoders/decoders.
163
164 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
165 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
118 166
119 $json = $json->utf8 ([$enable]) 167 $json = $json->utf8 ([$enable])
120 If $enable is true (or missing), then the "encode" method will 168 If $enable is true (or missing), then the "encode" method will
121 encode the JSON result into UTF-8, as required by many protocols, 169 encode the JSON result into UTF-8, as required by many protocols,
122 while the "decode" method expects to be handed a UTF-8-encoded 170 while the "decode" method expects to be handed a UTF-8-encoded
238 "encode" or "decode" to their minimum size possible. This can save 286 "encode" or "decode" to their minimum size possible. This can save
239 memory when your JSON texts are either very very long or you have 287 memory when your JSON texts are either very very long or you have
240 many short strings. It will also try to downgrade any strings to 288 many short strings. It will also try to downgrade any strings to
241 octet-form if possible: perl stores strings internally either in an 289 octet-form if possible: perl stores strings internally either in an
242 encoding called UTF-X or in octet-form. The latter cannot store 290 encoding called UTF-X or in octet-form. The latter cannot store
243 everything but uses less space in general. 291 everything but uses less space in general (and some buggy Perl or C
292 code might even rely on that internal representation being used).
293
294 The actual definition of what shrink does might change in future
295 versions, but it will always try to save space at the expense of
296 time.
244 297
245 If $enable is true (or missing), the string returned by "encode" 298 If $enable is true (or missing), the string returned by "encode"
246 will be shrunk-to-fit, while all strings generated by "decode" will 299 will be shrunk-to-fit, while all strings generated by "decode" will
247 also be shrunk-to-fit. 300 also be shrunk-to-fit.
248 301
251 304
252 In the future, this setting might control other things, such as 305 In the future, this setting might control other things, such as
253 converting strings that look like integers or floats into integers 306 converting strings that look like integers or floats into integers
254 or floats internally (there is no difference on the Perl level), 307 or floats internally (there is no difference on the Perl level),
255 saving space. 308 saving space.
309
310 $json = $json->max_depth ([$maximum_nesting_depth])
311 Sets the maximum nesting level (default 512) accepted while encoding
312 or decoding. If the JSON text or Perl data structure has an equal or
313 higher nesting level than this limit, then the encoder and decoder
314 will stop and croak at that point.
315
316 Nesting level is defined by number of hash- or arrayrefs that the
317 encoder needs to traverse to reach a given point or the number of
318 "{" or "[" characters without their matching closing parenthesis
319 crossed to reach a given character in a string.
320
321 Setting the maximum depth to one disallows any nesting, so that
322 ensures that the object is only a single hash/object or array.
323
324 The argument to "max_depth" will be rounded up to the next nearest
325 power of two.
326
327 See SECURITY CONSIDERATIONS, below, for more info on why this is
328 useful.
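
The effect of "max_depth" can be sketched as follows. The limit of 2 is an arbitrary value picked for the example (it is already a power of two, so rounding does not matter), and "eval" is used only to turn the croak into something the script can inspect.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limit chosen for this example.
my $coder = JSON::XS->new->max_depth (2);

# One level of nesting is below the limit and decodes normally.
my $flat = $coder->decode ('[1]');

# Three levels of nesting are above the limit, so decode croaks;
# eval catches the exception and leaves the message in $@.
my $deep  = eval { $coder->decode ('[[[1]]]') };
my $error = $@;

print "flat decoded: ", scalar @$flat, " element(s)\n";
print "deep rejected: ", ($error ? "yes" : "no"), "\n";
```

Catching the exception like this lets a server reject an overly nested request without taking the whole process down.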
256 329
257 $json_text = $json->encode ($perl_scalar) 330 $json_text = $json->encode ($perl_scalar)
258 Converts the given Perl data structure (a simple scalar or a 331 Converts the given Perl data structure (a simple scalar or a
259 reference to a hash or array) to its JSON representation. Simple 332 reference to a hash or array) to its JSON representation. Simple
260 scalars will be converted into JSON string or number sequences, 333 scalars will be converted into JSON string or number sequences,
269 342
270 JSON numbers and strings become simple Perl scalars. JSON arrays 343 JSON numbers and strings become simple Perl scalars. JSON arrays
271 become Perl arrayrefs and JSON objects become Perl hashrefs. "true" 344 become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
272 becomes 1, "false" becomes 0 and "null" becomes "undef". 345 becomes 1, "false" becomes 0 and "null" becomes "undef".
273 346
347 ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
348 This works like the "decode" method, but instead of raising an
349 exception when there is trailing garbage after the first JSON
350 object, it will silently stop parsing there and return the number of
351 characters consumed so far.
352
353 This is useful if your JSON texts are not delimited by an outer
354 protocol (which is not the brightest thing to do in the first place)
355 and you need to know where the JSON text ends.
356
357 JSON::XS->new->decode_prefix ("[1] the tail")
358 => ([1], 3)
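
A minimal sketch of how the returned character count might be used to split off the remainder of the input; the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use JSON::XS;

my $input = '[1] the tail';

# decode_prefix returns the decoded value and the number of
# characters it consumed from the start of the string.
my ($value, $consumed) = JSON::XS->new->decode_prefix ($input);

# Everything after the consumed prefix is left for the caller.
my $rest = substr $input, $consumed;

print "consumed $consumed characters, rest is '$rest'\n";
```

The same pattern could be repeated on the remainder to read several JSON texts that were concatenated without a delimiter.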
359
274MAPPING 360MAPPING
275 This section describes how JSON::XS maps Perl values to JSON values and 361 This section describes how JSON::XS maps Perl values to JSON values and
276 vice versa. These mappings are designed to "do the right thing" in most 362 vice versa. These mappings are designed to "do the right thing" in most
277 circumstances automatically, preserving round-tripping characteristics 363 circumstances automatically, preserving round-tripping characteristics
278 (what you put in comes out as something equivalent). 364 (what you put in comes out as something equivalent).
302 all the conversion details, but an integer may take slightly less 388 all the conversion details, but an integer may take slightly less
303 memory and might represent more values exactly than (floating point) 389 memory and might represent more values exactly than (floating point)
304 numbers. 390 numbers.
305 391
306 true, false 392 true, false
307 These JSON atoms become 0, 1, respectively. Information is lost in 393 These JSON atoms become "JSON::XS::true" and "JSON::XS::false",
308 this process. Future versions might represent those values 394 respectively. They are overloaded to act almost exactly like the
309 differently, but they will be guaranteed to act like these integers 395 numbers 1 and 0. You can check whether a scalar is a JSON boolean by
310 would normally in Perl. 396 using the "JSON::XS::is_bool" function.
311 397
312 null 398 null
313 A JSON null atom becomes "undef" in Perl. 399 A JSON null atom becomes "undef" in Perl.
314 400
315 PERL -> JSON 401 PERL -> JSON
317 truly typeless language, so we can only guess which JSON type is meant 403 truly typeless language, so we can only guess which JSON type is meant
318 by a Perl value. 404 by a Perl value.
319 405
320 hash references 406 hash references
321 Perl hash references become JSON objects. As there is no inherent 407 Perl hash references become JSON objects. As there is no inherent
322 ordering in hash keys, they will usually be encoded in a 408 ordering in hash keys (or JSON objects), they will usually be
323 pseudo-random order that can change between runs of the same program 409 encoded in a pseudo-random order that can change between runs of the
324 but stays generally the same within a single run of a program. 410 same program but stays generally the same within a single run of a
325 JSON::XS can optionally sort the hash keys (determined by the 411 program. JSON::XS can optionally sort the hash keys (determined by
326 *canonical* flag), so the same datastructure will serialise to the 412 the *canonical* flag), so the same datastructure will serialise to
327 same JSON text (given same settings and version of JSON::XS), but 413 the same JSON text (given same settings and version of JSON::XS),
328 this incurs a runtime overhead. 414 but this incurs a runtime overhead and is only rarely useful, e.g.
415 when you want to compare some JSON text against another for
416 equality.
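
The comparison use case mentioned above can be sketched like this, assuming the *canonical* flag is enabled through the "canonical" method; the hash contents are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# With canonical enabled, hash keys are written in sorted order.
my $coder = JSON::XS->new->canonical;

# The same data, written with the keys in a different order.
my $first  = $coder->encode ({ b => 1, a => 2, c => 3 });
my $second = $coder->encode ({ c => 3, a => 2, b => 1 });

print $first eq $second ? "same text\n" : "different text\n";
```

Without the flag, the two texts may or may not be equal, which is why a plain string comparison is only reliable with sorted keys.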
329 417
330 array references 418 array references
331 Perl array references become JSON arrays. 419 Perl array references become JSON arrays.
420
421 other references
422 Other unblessed references are generally not allowed and will cause
423 an exception to be thrown, except for references to the integers 0
424 and 1, which get turned into "false" and "true" atoms in JSON. You
425 can also use "JSON::XS::false" and "JSON::XS::true" to improve
426 readability.
427
428 to_json [\0,JSON::XS::true] # yields [false,true]
429
430 JSON::XS::true, JSON::XS::false
431 These special values become JSON true and JSON false values,
432 respectively. You can also use "\1" and "\0" directly if you want.
332 433
333 blessed objects 434 blessed objects
334 Blessed objects are not allowed. JSON::XS currently tries to encode 435 Blessed objects are not allowed. JSON::XS currently tries to encode
335 their underlying representation (hash- or arrayref), but this 436 their underlying representation (hash- or arrayref), but this
336 behaviour might change in future versions. 437 behaviour might change in future versions.
367 $x += 0; # numify it, ensuring it will be dumped as a number 468 $x += 0; # numify it, ensuring it will be dumped as a number
368 $x *= 1; # same thing, the choice is yours. 469 $x *= 1; # same thing, the choice is yours.
369 470
370 You can not currently output JSON booleans or force the type in 471 You can not currently output JSON booleans or force the type in
371 other, less obscure, ways. Tell me if you need this capability. 472 other, less obscure, ways. Tell me if you need this capability.
372
373 circular data structures
374 Those will be encoded until memory or stackspace runs out.
375 473
376COMPARISON 474COMPARISON
377 As already mentioned, this module was created because none of the 475 As already mentioned, this module was created because none of the
378 existing JSON modules could be made to work correctly. First I will 476 existing JSON modules could be made to work correctly. First I will
379 describe the problems (or pleasures) I encountered with various existing 477 describe the problems (or pleasures) I encountered with various existing
450 Does not generate valid JSON texts (key strings are often unquoted, 548 Does not generate valid JSON texts (key strings are often unquoted,
451 empty keys result in nothing being output) 549 empty keys result in nothing being output)
452 550
453 Does not check input for validity. 551 Does not check input for validity.
454 552
553 JSON and YAML
554 You often hear that JSON is a subset (or a close subset) of YAML. This
555 is, however, a mass hysteria and very far from the truth. In general,
556 there is no way to configure JSON::XS to output a data structure as
557 valid YAML.
558
559 If you really must use JSON::XS to generate YAML, you should use this
560 algorithm (subject to change in future versions):
561
562 my $to_yaml = JSON::XS->new->utf8->space_after (1);
563 my $yaml = $to_yaml->encode ($ref) . "\n";
564
565 This will usually generate JSON texts that also parse as valid YAML.
566 Please note that YAML has hardcoded limits on (simple) object key
567 lengths that JSON doesn't have, so you should make sure that your hash
568 keys are noticeably shorter than the 1024 characters YAML allows.
569
570 There might be other incompatibilities that I am not aware of. In
571 general you should not try to generate YAML with a JSON generator or
572 vice versa, or try to parse JSON with a YAML parser or vice versa:
573 chances are high that you will run into severe interoperability
574 problems.
575
455 SPEED 576 SPEED
456 It seems that JSON::XS is surprisingly fast, as shown in the following 577 It seems that JSON::XS is surprisingly fast, as shown in the following
457 tables. They have been generated with the help of the "eg/bench" program 578 tables. They have been generated with the help of the "eg/bench" program
458 in the JSON::XS distribution, to make it easy to compare on your own 579 in the JSON::XS distribution, to make it easy to compare on your own
459 system. 580 system.
460 581
461 First comes a comparison between various modules using a very short JSON 582 First comes a comparison between various modules using a very short
462 string: 583 single-line JSON string:
463 584
464 {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null} 585 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
586 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
465 587
466 It shows the number of encodes/decodes per second (JSON::XS uses the 588 It shows the number of encodes/decodes per second (JSON::XS uses the
467 functional interface, while JSON::XS/2 uses the OO interface with 589 functional interface, while JSON::XS/2 uses the OO interface with
468 pretty-printing and hashkey sorting enabled). Higher is better: 590 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink).
591 Higher is better:
469 592
470 module | encode | decode | 593 module | encode | decode |
471 -----------|------------|------------| 594 -----------|------------|------------|
472 JSON | 11488.516 | 7823.035 | 595 JSON | 7645.468 | 4208.613 |
473 JSON::DWIW | 94708.054 | 129094.260 | 596 JSON::DWIW | 40721.398 | 77101.176 |
474 JSON::PC | 63884.157 | 128528.212 | 597 JSON::PC | 65948.176 | 78251.940 |
475 JSON::Syck | 34898.677 | 42096.911 | 598 JSON::Syck | 22844.793 | 26479.192 |
476 JSON::XS | 654027.064 | 396423.669 | 599 JSON::XS | 388361.481 | 199728.762 |
477 JSON::XS/2 | 371564.190 | 371725.613 | 600 JSON::XS/2 | 218453.333 | 192399.266 |
601 JSON::XS/3 | 338250.323 | 192399.266 |
602 Storable | 15779.925 | 14169.946 |
478 -----------+------------+------------+ 603 -----------+------------+------------+
479 604
480 That is, JSON::XS is more than six times faster than JSON::DWIW on 605 That is, JSON::XS is about five times faster than JSON::DWIW on
481 encoding, more than three times faster on decoding, and about thirty 606 encoding, about three times faster on decoding, and over forty times
482 times faster than JSON, even with pretty-printing and key sorting. 607 faster than JSON, even with pretty-printing and key sorting. It also
608 compares favourably to Storable for small amounts of data.
483 609
484 Using a longer test string (roughly 18KB, generated from Yahoo! Locals 610 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
485 search API (http://nanoref.com/yahooapis/mgPdGg)): 611 search API (http://nanoref.com/yahooapis/mgPdGg)):
486 612
487 module | encode | decode | 613 module | encode | decode |
488 -----------|------------|------------| 614 -----------|------------|------------|
489 JSON | 273.023 | 44.674 | 615 JSON | 254.685 | 37.665 |
490 JSON::DWIW | 1089.383 | 1145.704 | 616 JSON::DWIW | 843.343 | 1049.731 |
491 JSON::PC | 3097.419 | 2393.921 | 617 JSON::PC | 3602.116 | 2307.352 |
492 JSON::Syck | 514.060 | 843.053 | 618 JSON::Syck | 505.107 | 787.899 |
493 JSON::XS | 6479.668 | 3636.364 | 619 JSON::XS | 5747.196 | 3690.220 |
494 JSON::XS/2 | 3774.221 | 3599.124 | 620 JSON::XS/2 | 3968.121 | 3676.634 |
621 JSON::XS/3 | 6105.246 | 3662.508 |
622 Storable | 4417.337 | 5285.161 |
495 -----------+------------+------------+ 623 -----------+------------+------------+
496 624
497 Again, JSON::XS leads by far. 625 Again, JSON::XS leads by far (except for Storable, which unsurprisingly
626 decodes faster).
498 627
499 On large strings containing lots of high unicode characters, some 628 On large strings containing lots of high unicode characters, some
500 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the 629 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
501 result will be broken due to missing (or wrong) unicode handling. Others 630 result will be broken due to missing (or wrong) unicode handling. Others
502 refuse to decode or encode properly, so it was impossible to prepare a 631 refuse to decode or encode properly, so it was impossible to prepare a
503 fair comparison table for that case. 632 fair comparison table for that case.
504 633
505RESOURCE LIMITS 634SECURITY CONSIDERATIONS
506 JSON::XS does not impose any limits on the size of JSON texts or Perl 635 When you are using JSON in a protocol, talking to untrusted potentially
507 values they represent - if your machine can handle it, JSON::XS will 636 hostile creatures requires relatively few measures.
508 encode or decode it. Future versions might optionally impose structure 637
509 depth and memory use resource limits. 638 First of all, your JSON decoder should be secure, that is, should not
639 have any buffer overflows. Obviously, this module should ensure that and
640 I am trying hard on making that true, but you never know.
641
642 Second, you need to avoid resource-starving attacks. That means you
643 should limit the size of JSON texts you accept, or make sure that when
644 your resources run out, that's just fine (e.g. by using a separate
645 process that can crash safely). The size of a JSON text in octets or
646 characters is usually a good indication of the size of the resources
647 required to decode it into a Perl structure.
648
649 Third, JSON::XS recurses using the C stack when decoding objects and
650 arrays. The C stack is a limited resource: for instance, on my amd64
651 machine with 8MB of stack size I can decode around 180k nested arrays
652 but only 14k nested JSON objects (due to perl itself recursing deeply on
653 croak to free the temporary). If that is exceeded, the program crashes.
654 To be conservative, the default nesting limit is set to 512. If your
655 process has a smaller stack, you should adjust this setting accordingly
656 with the "max_depth" method.
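
The second and third points can be combined into a small wrapper, sketched below. The limits and the name "decode_untrusted" are hypothetical and only serve the example; suitable values depend on your protocol and stack size.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical limits for this sketch; tune them for your own protocol.
my $MAX_OCTETS = 64 * 1024;
my $MAX_DEPTH  = 32;

my $decoder = JSON::XS->new->utf8->max_depth ($MAX_DEPTH);

# Returns the decoded value, or nothing (undef in scalar context) if
# the text is too large, nested too deeply, or not valid JSON.
sub decode_untrusted {
    my ($octets) = @_;

    # Reject oversized input before spending any time parsing it.
    return if length ($octets) > $MAX_OCTETS;

    # decode croaks on invalid or too deeply nested input; eval
    # converts that into an undefined return value.
    return eval { $decoder->decode ($octets) };
}

my $request = decode_untrusted ('{"method":"ping","params":[1,2]}');
print "method: $request->{method}\n" if $request;
```

Checking the size first keeps the cost of rejecting a hostile request close to zero, while the nesting limit protects the C stack for inputs that pass the size check.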
657
658 And last but not least, something else could bomb you that I forgot to think
659 of. In that case, you get to keep the pieces. I am always open for
660 hints, though...
661
662 If you are using JSON::XS to return packets for consumption by JavaScript
663 scripts in a browser you should have a look at
664 <http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
665 you are vulnerable to some common attack vectors (which really are
666 browser design bugs, but it is still you who will have to deal with it,
667 as major browser developers care only for features, not about doing
668 security right).
510 669
511BUGS 670BUGS
512 While the goal of this module is to be correct, that unfortunately does 671 While the goal of this module is to be correct, that unfortunately does
513 not mean it's bug-free, only that I think its design is bug-free. It is 672 not mean it's bug-free, only that I think its design is bug-free. It is
514 still very young and not well-tested. If you keep reporting bugs they 673 still relatively early in its development. If you keep reporting bugs
515 will be fixed swiftly, though. 674 they will be fixed swiftly, though.
516 675
517AUTHOR 676AUTHOR
518 Marc Lehmann <schmorp@schmorp.de> 677 Marc Lehmann <schmorp@schmorp.de>
519 http://home.schmorp.de/ 678 http://home.schmorp.de/
520 679
