/cvs/JSON-XS/XS.pm
Revision: 1.21
Committed: Sun Mar 25 02:32:40 2007 UTC (17 years, 1 month ago) by root
Branch: MAIN
Changes since 1.20: +12 -8 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     JSON::XS - JSON serialising/deserialising, done correctly and fast
4    
5     =head1 SYNOPSIS
6    
7     use JSON::XS;
8    
9 root 1.12 # exported functions, croak on error
10    
11     $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
12     $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
13    
14 root 1.21 # objToJson and jsonToObj are exported for JSON
15     # compatibility, but should not be used in new code.
16    
17 root 1.12 # oo-interface
18    
19     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
20     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
21     $perl_scalar = $coder->decode ($unicode_json_text);
22    
23 root 1.1 =head1 DESCRIPTION
24    
25 root 1.2 This module converts Perl data structures to JSON and vice versa. Its
26     primary goal is to be I<correct> and its secondary goal is to be
27     I<fast>. To reach the latter goal it was written in C.
28    
29     As this is the n-th-something JSON module on CPAN, what was the reason
30     to write yet another JSON module? While it seems there are many JSON
31     modules, none of them correctly handle all corner cases, and in most cases
32     their maintainers are unresponsive, gone missing, or not listening to bug
33     reports for other reasons.
34    
35     See COMPARISON, below, for a comparison to some other JSON modules.
36    
37 root 1.10 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
38     vice versa.
39    
40 root 1.2 =head2 FEATURES
41    
42 root 1.1 =over 4
43    
44 root 1.21 =item * correct unicode handling
45 root 1.2
46 root 1.10 This module knows how to handle Unicode, and even documents how and when
47     it does so.
48 root 1.2
49     =item * round-trip integrity
50    
51     When you serialise a perl data structure using only datatypes supported
52     by JSON, the deserialised data structure is identical on the Perl level.
53 root 1.21 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
54     like a number).
55 root 1.2
56     =item * strict checking of JSON correctness
57    
58 root 1.16 There is no guessing, no generating of illegal JSON texts by default,
59 root 1.10 and only JSON is accepted as input by default (the latter is a security
60     feature).
61 root 1.2
62     =item * fast
63    
64 root 1.10 Compared to other JSON modules, this module compares favourably in terms
65     of speed, too.
66 root 1.2
67     =item * simple to use
68    
69     This module has both a simple functional interface and an OO
70     interface.
71    
72     =item * reasonably versatile output formats
73    
74 root 1.10 You can choose between the most compact guaranteed single-line format
75 root 1.21 possible (nice for simple line-based protocols), a pure-ascii format
76     (for when your transport is not 8-bit clean, still supports the whole
77     unicode range), or a pretty-printed format (for when you want to read that
78     stuff). Or you can combine those features in whatever way you like.
79 root 1.2
80     =back
81    
82 root 1.1 =cut
83    
84     package JSON::XS;
85    
86 root 1.20 use strict;
87    
88 root 1.1 BEGIN {
89 root 1.21 our $VERSION = '0.8';
90 root 1.20 our @ISA = qw(Exporter);
91 root 1.1
92 root 1.21 our @EXPORT = qw(to_json from_json objToJson jsonToObj);
93 root 1.1 require Exporter;
94    
95     require XSLoader;
96     XSLoader::load JSON::XS::, $VERSION;
97     }
98    
99 root 1.2 =head1 FUNCTIONAL INTERFACE
100    
101     The following convenience functions are provided by this module. They are
102     exported by default:
103    
104     =over 4
105    
106 root 1.16 =item $json_text = to_json $perl_scalar
107 root 1.2
108     Converts the given Perl data structure (a simple scalar or a reference to
109     a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
110     octets only). Croaks on error.
111    
112 root 1.16 This function call is functionally identical to:
113 root 1.2
114 root 1.16 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
115    
116     except being faster.
117    
118     =item $perl_scalar = from_json $json_text
119 root 1.2
120     The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
121 root 1.16 parse that as a UTF-8 encoded JSON text, returning the resulting simple
122 root 1.2 scalar or reference. Croaks on error.
123    
124 root 1.16 This function call is functionally identical to:
125    
126     $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
127    
128     except being faster.
129 root 1.2
130     =back
131    
132     =head1 OBJECT-ORIENTED INTERFACE
133    
134     The object oriented interface lets you configure your own encoding or
135     decoding style, within the limits of supported formats.
136    
137     =over 4
138    
139     =item $json = new JSON::XS
140    
141     Creates a new JSON::XS object that can be used to de/encode JSON
142     strings. All boolean flags described below are by default I<disabled>.
143 root 1.1
144 root 1.2 The mutators for flags all return the JSON object again and thus calls can
145     be chained:
146    
147 root 1.16 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
148 root 1.3 => {"a": [1, 2]}
149 root 1.2
150 root 1.7 =item $json = $json->ascii ([$enable])
151 root 1.2
152 root 1.16 If C<$enable> is true (or missing), then the C<encode> method will not
153     generate characters outside the code range C<0..127> (which is ASCII). Any
154     unicode characters outside that range will be escaped using either a
155     single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
156     as per RFC4627.
157 root 1.2
158     If C<$enable> is false, then the C<encode> method will not escape Unicode
159 root 1.16 characters unless required by the JSON syntax. This results in a faster
160     and more compact format.
161 root 1.2
162 root 1.16 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
163     => ["\ud801\udc01"]
164 root 1.3
165 root 1.7 =item $json = $json->utf8 ([$enable])
166 root 1.2
167 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will encode
168 root 1.16 the JSON result into UTF-8, as required by many protocols, while the
169 root 1.7 C<decode> method expects to be handed a UTF-8-encoded string. Please
170     note that UTF-8-encoded strings do not contain any characters outside the
171 root 1.16 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
172     versions, enabling this option might enable autodetection of the UTF-16
173     and UTF-32 encoding families, as described in RFC4627.
174 root 1.2
175     If C<$enable> is false, then the C<encode> method will return the JSON
176     string as a (non-encoded) unicode string, while C<decode> thus expects a
177     unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
178     to be done yourself, e.g. using the Encode module.
179    
180 root 1.16 Example, output UTF-16BE-encoded JSON:
181    
182     use Encode;
183     $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
184    
185     Example, decode UTF-32LE-encoded JSON:
186    
187     use Encode;
188     $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
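
The difference between the two modes can be seen by comparing lengths. The following is only a minimal sketch: it uses the C<new>, C<utf8>, C<encode> and C<decode> methods described above, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;
use JSON::XS;

# One character outside ASCII: U+00E9 (e with acute accent).
my $data = ["\x{e9}"];

# utf8 disabled: the result is a Perl character string of 5 characters.
my $chars = JSON::XS->new->encode ($data);

# utf8 enabled: the result is an octet string, the accented character
# now takes two octets, so the length is 6.
my $octets = JSON::XS->new->utf8->encode ($data);

printf "characters: %d, octets: %d\n", length ($chars), length ($octets);

# Decoding with the matching setting gives the original character back.
my $back = JSON::XS->new->utf8->decode ($octets);
print $back->[0] eq "\x{e9}" ? "same character\n" : "different\n";
```

Only the octet form is suitable for writing to a binary file handle or socket as-is; the character form still needs an encoding step.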
189 root 1.12
190 root 1.7 =item $json = $json->pretty ([$enable])
191 root 1.2
192     This enables (or disables) all of the C<indent>, C<space_before> and
193 root 1.3 C<space_after> (and in the future possibly more) flags in one call to
194 root 1.2 generate the most readable (or most compact) form possible.
195    
196 root 1.12 Example, pretty-print some simple structure:
197    
198 root 1.3 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
199     =>
200     {
201     "a" : [
202     1,
203     2
204     ]
205     }
206    
207 root 1.7 =item $json = $json->indent ([$enable])
208 root 1.2
209 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
210 root 1.2 format as output, putting every array member or object/hash key-value pair
211     into its own line, indenting them properly.
212    
213     If C<$enable> is false, no newlines or indenting will be produced, and the
214 root 1.16 resulting JSON text is guaranteed not to contain any newlines.
215 root 1.2
216 root 1.16 This setting has no effect when decoding JSON texts.
217 root 1.2
218 root 1.7 =item $json = $json->space_before ([$enable])
219 root 1.2
220 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
221 root 1.2 optional space before the C<:> separating keys from values in JSON objects.
222    
223     If C<$enable> is false, then the C<encode> method will not add any extra
224     space at those places.
225    
226 root 1.16 This setting has no effect when decoding JSON texts. You will also
227     most likely combine this setting with C<space_after>.
228 root 1.2
229 root 1.12 Example, space_before enabled, space_after and indent disabled:
230    
231     {"key" :"value"}
232    
233 root 1.7 =item $json = $json->space_after ([$enable])
234 root 1.2
235 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
236 root 1.2 optional space after the C<:> separating keys from values in JSON objects
237     and extra whitespace after the C<,> separating key-value pairs and array
238     members.
239    
240     If C<$enable> is false, then the C<encode> method will not add any extra
241     space at those places.
242    
243 root 1.16 This setting has no effect when decoding JSON texts.
244 root 1.2
245 root 1.12 Example, space_before and indent disabled, space_after enabled:
246    
247     {"key": "value"}
248    
249 root 1.7 =item $json = $json->canonical ([$enable])
250 root 1.2
251 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
252 root 1.2 by sorting their keys. This adds a comparatively high overhead.
253    
254     If C<$enable> is false, then the C<encode> method will output key-value
255     pairs in the order Perl stores them (which will likely change between runs
256     of the same script).
257    
258     This option is useful if you want the same data structure to be encoded as
259 root 1.16 the same JSON text (given the same overall settings). If it is disabled,
260 root 1.2 the same hash might be encoded differently even if it contains the same data,
261     as key-value pairs have no inherent ordering in Perl.
262    
263 root 1.16 This setting has no effect when decoding JSON texts.
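
Example, a small sketch of C<canonical> in use (the hash contents are arbitrary and chosen only for illustration):

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->canonical;

# Keys are emitted in sorted order, whatever order Perl stores them in.
my $text = $coder->encode ({ b => 1, a => 2, c => 3 });
print "$text\n";   # {"a":2,"b":1,"c":3}
```

Because the output no longer depends on hash ordering, two such texts can be compared with C<eq>.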
264 root 1.2
265 root 1.7 =item $json = $json->allow_nonref ([$enable])
266 root 1.3
267 root 1.7 If C<$enable> is true (or missing), then the C<encode> method can convert a
268 root 1.3 non-reference into its corresponding string, number or null JSON value,
269     which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
270     values instead of croaking.
271    
272     If C<$enable> is false, then the C<encode> method will croak if it isn't
273 root 1.16 passed an arrayref or hashref, as JSON texts must either be an object
274 root 1.3 or array. Likewise, C<decode> will croak if given something that is not a
275     JSON object or array.
276    
277 root 1.12 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
278     resulting in an invalid JSON text:
279    
280     JSON::XS->new->allow_nonref->encode ("Hello, World!")
281     => "Hello, World!"
282    
283 root 1.7 =item $json = $json->shrink ([$enable])
284    
285     Perl usually over-allocates memory a bit when allocating space for
286     strings. This flag optionally resizes strings generated by either
287     C<encode> or C<decode> to their minimum size possible. This can save
288 root 1.16 memory when your JSON texts are either very very long or you have many
289 root 1.8 short strings. It will also try to downgrade any strings to octet-form
290     if possible: perl stores strings internally either in an encoding called
291     UTF-X or in octet-form. The latter cannot store everything but uses less
292     space in general.
293 root 1.7
294     If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
295     while all strings generated by C<decode> will also be shrunk-to-fit.
296    
297     If C<$enable> is false, then the normal perl allocation algorithms are used.
298     If you work with your data, then this is likely to be faster.
299    
300     In the future, this setting might control other things, such as converting
301     strings that look like integers or floats into integers or floats
302     internally (there is no difference on the Perl level), saving space.
303    
304 root 1.16 =item $json_text = $json->encode ($perl_scalar)
305 root 1.2
306     Converts the given Perl data structure (a simple scalar or a reference
307     to a hash or array) to its JSON representation. Simple scalars will be
308     converted into JSON string or number sequences, while references to arrays
309     become JSON arrays and references to hashes become JSON objects. Undefined
310     Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
311     nor C<false> values will be generated.
312 root 1.1
313 root 1.16 =item $perl_scalar = $json->decode ($json_text)
314 root 1.1
315 root 1.16 The opposite of C<encode>: expects a JSON text and tries to parse it,
316 root 1.2 returning the resulting simple scalar or reference. Croaks on error.
317 root 1.1
318 root 1.2 JSON numbers and strings become simple Perl scalars. JSON arrays become
319     Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
320     C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
321 root 1.1
322     =back
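
Taken together, C<encode> and C<decode> can be sketched as a round trip. This is an illustration under assumptions, not a test from the distribution: the data is invented, and C<canonical> is enabled only so that the two JSON texts can be compared as strings.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->canonical;

# Invented sample data: a hashref holding a string, an arrayref and undef.
my $data = { name => "example", list => [1, 2, "three"], nothing => undef };

my $text  = $coder->encode ($data);   # Perl -> JSON
my $copy  = $coder->decode ($text);   # JSON -> Perl
my $again = $coder->encode ($copy);   # and back to JSON

print "$text\n";   # {"list":[1,2,"three"],"name":"example","nothing":null}
print $text eq $again ? "round-trip ok\n" : "round-trip differs\n";
```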
323    
324 root 1.10 =head1 MAPPING
325    
326     This section describes how JSON::XS maps Perl values to JSON values and
327     vice versa. These mappings are designed to "do the right thing" in most
328     circumstances automatically, preserving round-tripping characteristics
329     (what you put in comes out as something equivalent).
330    
331     For the more enlightened: note that in the following descriptions,
332     lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
333     refers to the abstract Perl language itself.
334    
335     =head2 JSON -> PERL
336    
337     =over 4
338    
339     =item object
340    
341     A JSON object becomes a reference to a hash in Perl. No ordering of object
342 root 1.14 keys is preserved (JSON does not preserve object key ordering itself).
343 root 1.10
344     =item array
345    
346     A JSON array becomes a reference to an array in Perl.
347    
348     =item string
349    
350     A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
351     are represented by the same codepoints in the Perl string, so no manual
352     decoding is necessary.
353    
354     =item number
355    
356     A JSON number becomes either an integer or numeric (floating point)
357     scalar in perl, depending on its range and any fractional parts. On the
358     Perl level, there is no difference between those as Perl handles all the
359     conversion details, but an integer may take slightly less memory and might
360     represent more values exactly than (floating point) numbers.
361    
362     =item true, false
363    
364     These JSON atoms become C<1> and C<0>, respectively. Information is lost in
365     this process. Future versions might represent those values differently,
366     but they will be guaranteed to act like these integers would normally in
367     Perl.
368    
369     =item null
370    
371     A JSON null atom becomes C<undef> in Perl.
372    
373     =back
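
The mappings listed above can be checked with a short sketch like the following. The JSON text and variable names are invented for the illustration; C<true> and C<false> are left out because their representation may change, as noted above.

```perl
use strict;
use warnings;
use JSON::XS;

# One value of each kind: object, array, string, number, null.
my $values = JSON::XS->new->decode ('[{"k":"v"},[1,2],"text",1.5,null]');
my ($object, $array, $string, $number, $null) = @$values;

print ref ($object), "\n";                            # HASH
print ref ($array), "\n";                             # ARRAY
print "$string\n";                                    # text
print $number + 0.5, "\n";                            # 2
print defined ($null) ? "defined" : "undef", "\n";    # undef
```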
374    
375     =head2 PERL -> JSON
376    
377     The mapping from Perl to JSON is slightly more difficult, as Perl is a
378     truly typeless language, so we can only guess which JSON type is meant by
379     a Perl value.
380    
381     =over 4
382    
383     =item hash references
384    
385     Perl hash references become JSON objects. As there is no inherent ordering
386     in hash keys, they will usually be encoded in a pseudo-random order that
387     can change between runs of the same program but stays generally the same
388 root 1.14 within a single run of a program. JSON::XS can optionally sort the hash
389 root 1.10 keys (determined by the I<canonical> flag), so the same datastructure
390     will serialise to the same JSON text (given same settings and version of
391     JSON::XS), but this incurs a runtime overhead.
392    
393     =item array references
394    
395     Perl array references become JSON arrays.
396    
397     =item blessed objects
398    
399     Blessed objects are not allowed. JSON::XS currently tries to encode their
400     underlying representation (hash- or arrayref), but this behaviour might
401     change in future versions.
402    
403     =item simple scalars
404    
405     Simple Perl scalars (any scalar that is not a reference) are the most
406     difficult objects to encode: JSON::XS will encode undefined scalars as
407     JSON null value, scalars that have last been used in a string context
408     before encoding as JSON strings and anything else as number value:
409    
410     # dump as number
411     to_json [2] # yields [2]
412     to_json [-3.0e17] # yields [-3e+17]
413     my $value = 5; to_json [$value] # yields [5]
414    
415     # used as string, so dump as string
416     print $value;
417     to_json [$value] # yields ["5"]
418    
419     # undef becomes null
420     to_json [undef] # yields [null]
421    
422     You can force the type to be a string by stringifying it:
423    
424     my $x = 3.1; # some variable containing a number
425     "$x"; # stringified
426     $x .= ""; # another, more awkward way to stringify
427     print $x; # perl does it for you, too, quite often
428    
429     You can force the type to be a number by numifying it:
430    
431     my $x = "3"; # some variable containing a string
432     $x += 0; # numify it, ensuring it will be dumped as a number
433     $x *= 1; # same thing, the choice is yours.
434    
435     You cannot currently output JSON booleans or force the type in other,
436     less obscure, ways. Tell me if you need this capability.
437    
438 root 1.11 =item circular data structures
439    
440     Those will be encoded until memory or stackspace runs out.
441    
442 root 1.10 =back
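
Since circular data structures are not detected by the encoder, a caller that cannot rule them out may want to check before encoding. The following is a minimal sketch of such a check; C<has_cycle> is a hypothetical helper written for this example and is not part of JSON::XS. It only follows unblessed hash and array references.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper (not part of JSON::XS): returns true if $data
# reaches itself again through nested hash or array references.
sub has_cycle {
    my ($data, $seen) = @_;
    $seen ||= {};
    return 0 unless ref ($data) eq 'HASH' || ref ($data) eq 'ARRAY';

    my $addr = refaddr ($data);
    return 1 if $seen->{$addr};           # already on the current path

    $seen->{$addr} = 1;
    my @children = ref ($data) eq 'HASH' ? values %$data : @$data;
    for my $child (@children) {
        return 1 if has_cycle ($child, $seen);
    }
    delete $seen->{$addr};                # done with this node's subtree
    return 0;
}

my $tree = { list => [1, 2, 3] };
my $loop = { name => "loop" };
$loop->{self} = $loop;                    # points back at itself

print has_cycle ($tree) ? "cycle\n" : "no cycle\n";   # no cycle
print has_cycle ($loop) ? "cycle\n" : "no cycle\n";   # cycle
```

Entries are removed from C<$seen> when a subtree has been walked, so a reference that merely appears twice in a structure is not reported as a cycle.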
443    
444 root 1.3 =head1 COMPARISON
445    
446     As already mentioned, this module was created because none of the existing
447     JSON modules could be made to work correctly. First I will describe the
448     problems (or pleasures) I encountered with various existing JSON modules,
449 root 1.4 followed by some benchmark values. JSON::XS was designed not to suffer
450     from any of these problems or limitations.
451 root 1.3
452     =over 4
453    
454 root 1.5 =item JSON 1.07
455 root 1.3
456     Slow (but very portable, as it is written in pure Perl).
457    
458     Undocumented/buggy Unicode handling (how JSON handles unicode values is
459     undocumented. One can get far by feeding it unicode strings and doing
460     en-/decoding oneself, but unicode escapes are not working properly).
461    
462     No roundtripping (strings get clobbered if they look like numbers, e.g.
463     the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
464     decode into the number 2).
465    
466 root 1.5 =item JSON::PC 0.01
467 root 1.3
468     Very fast.
469    
470     Undocumented/buggy Unicode handling.
471    
472     No roundtripping.
473    
474 root 1.4 Has problems handling many Perl values (e.g. regex results and other magic
475     values will make it croak).
476 root 1.3
477     Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
478 root 1.16 which is not a valid JSON text).
479 root 1.3
480     Unmaintained (maintainer unresponsive for many months, bugs are not
481     getting fixed).
482    
483 root 1.5 =item JSON::Syck 0.21
484 root 1.3
485     Very buggy (often crashes).
486    
487 root 1.4 Very inflexible (no human-readable format supported, format pretty much
488     undocumented. I need at least a format for easy reading by humans and a
489     single-line compact format for use in a protocol, and preferably a way to
490 root 1.16 generate ASCII-only JSON texts).
491 root 1.3
492     Completely broken (and confusingly documented) Unicode handling (unicode
493     escapes are not working properly, you need to set ImplicitUnicode to
494     I<different> values on en- and decoding to get symmetric behaviour).
495    
496     No roundtripping (simple cases work, but this depends on whether the scalar
497     value was used in a numeric context or not).
498    
499     Dumping hashes may skip hash values depending on iterator state.
500    
501     Unmaintained (maintainer unresponsive for many months, bugs are not
502     getting fixed).
503    
504     Does not check input for validity (i.e. will accept non-JSON input and
505     return "something" instead of raising an exception. This is a security
506     issue: imagine two banks transferring money between each other using
507     JSON. One bank might parse a given non-JSON request and deduct money,
508     while the other might reject the transaction with a syntax error. While a
509     good protocol will at least recover, that is extra unnecessary work and
510     the transaction will still not succeed).
511    
512 root 1.5 =item JSON::DWIW 0.04
513 root 1.3
514     Very fast. Very natural. Very nice.
515    
516     Undocumented unicode handling (but the best of the pack. Unicode escapes
517     still don't get parsed properly).
518    
519     Very inflexible.
520    
521     No roundtripping.
522    
523 root 1.16 Does not generate valid JSON texts (key strings are often unquoted, empty keys
524 root 1.4 result in nothing being output).
525    
526 root 1.3 Does not check input for validity.
527    
528     =back
529    
530     =head2 SPEED
531    
532 root 1.4 It seems that JSON::XS is surprisingly fast, as shown in the following
533     tables. They have been generated with the help of the C<eg/bench> program
534     in the JSON::XS distribution, to make it easy to compare on your own
535     system.
536    
537 root 1.13 First comes a comparison between various modules using a very short JSON
538 root 1.18 string:
539    
540     {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
541    
542     It shows the number of encodes/decodes per second (JSON::XS uses the
543     functional interface, while JSON::XS/2 uses the OO interface with
544     pretty-printing and hashkey sorting enabled). Higher is better:
545 root 1.4
546     module | encode | decode |
547     -----------|------------|------------|
548 root 1.18 JSON | 11488.516 | 7823.035 |
549     JSON::DWIW | 94708.054 | 129094.260 |
550     JSON::PC | 63884.157 | 128528.212 |
551     JSON::Syck | 34898.677 | 42096.911 |
552     JSON::XS | 654027.064 | 396423.669 |
553     JSON::XS/2 | 371564.190 | 371725.613 |
554 root 1.4 -----------+------------+------------+
555    
556 root 1.18 That is, JSON::XS is more than six times faster than JSON::DWIW on
557     encoding, more than three times faster on decoding, and about thirty times
558     faster than JSON, even with pretty-printing and key sorting.
559 root 1.4
560 root 1.13 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
561 root 1.4 search API (http://nanoref.com/yahooapis/mgPdGg)):
562    
563     module | encode | decode |
564     -----------|------------|------------|
565 root 1.18 JSON | 273.023 | 44.674 |
566     JSON::DWIW | 1089.383 | 1145.704 |
567     JSON::PC | 3097.419 | 2393.921 |
568     JSON::Syck | 514.060 | 843.053 |
569     JSON::XS | 6479.668 | 3636.364 |
570     JSON::XS/2 | 3774.221 | 3599.124 |
571 root 1.4 -----------+------------+------------+
572    
573 root 1.18 Again, JSON::XS leads by far.
574 root 1.4
575 root 1.18 On large strings containing lots of high unicode characters, some modules
576     (such as JSON::PC) seem to decode faster than JSON::XS, but the result
577     will be broken due to missing (or wrong) unicode handling. Others refuse
578     to decode or encode properly, so it was impossible to prepare a fair
579     comparison table for that case.
580 root 1.13
581 root 1.11 =head1 RESOURCE LIMITS
582    
583     JSON::XS does not impose any limits on the size of JSON texts or Perl
584 root 1.12 values they represent - if your machine can handle it, JSON::XS will
585 root 1.11 encode or decode it. Future versions might optionally impose structure
586     depth and memory use resource limits.
587    
588 root 1.4 =head1 BUGS
589    
590     While the goal of this module is to be correct, that unfortunately does
591     not mean it's bug-free, only that I think its design is bug-free. It is
592     still very young and not well-tested. If you keep reporting bugs they will
593     be fixed swiftly, though.
594    
595 root 1.2 =cut
596    
597     1;
598    
599 root 1.1 =head1 AUTHOR
600    
601     Marc Lehmann <schmorp@schmorp.de>
602     http://home.schmorp.de/
603    
604     =cut
605