/cvs/JSON-XS/XS.pm
Revision: 1.22
Committed: Sun Mar 25 02:37:00 2007 UTC (17 years, 1 month ago) by root
Branch: MAIN
Changes since 1.21: +6 -4 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     JSON::XS - JSON serialising/deserialising, done correctly and fast
4    
5     =head1 SYNOPSIS
6    
7     use JSON::XS;
8    
9 root 1.22 # exported functions, they croak on error
10     # and expect/generate UTF-8
11 root 1.12
12     $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
13     $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
14    
15 root 1.22 # objToJson and jsonToObj aliases to to_json and from_json
16     # are exported for compatibility with the JSON module,
17     # but should not be used in new code.
18 root 1.21
19 root 1.22 # OO-interface
20 root 1.12
21     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
22     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
23     $perl_scalar = $coder->decode ($unicode_json_text);
24    
25 root 1.1 =head1 DESCRIPTION
26    
27 root 1.2 This module converts Perl data structures to JSON and vice versa. Its
28     primary goal is to be I<correct> and its secondary goal is to be
29     I<fast>. To reach the latter goal it was written in C.
30    
31     As this is the n-th-something JSON module on CPAN, what was the reason
32     to write yet another JSON module? While it seems there are many JSON
33     modules, none of them correctly handle all corner cases, and in most cases
34     their maintainers are unresponsive, gone missing, or not listening to bug
35     reports for other reasons.
36    
37     See COMPARISON, below, for a comparison to some other JSON modules.
38    
39 root 1.10 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
40     vice versa.
41    
42 root 1.2 =head2 FEATURES
43    
44 root 1.1 =over 4
45    
46 root 1.21 =item * correct unicode handling
47 root 1.2
48 root 1.10 This module knows how to handle Unicode, and even documents how and when
49     it does so.
50 root 1.2
51     =item * round-trip integrity
52    
53     When you serialise a perl data structure using only datatypes supported
54     by JSON, the deserialised data structure is identical on the Perl level.
55 root 1.21 (e.g. the string "2.0" doesn't suddenly become "2" just because it looks
56     like a number).
57 root 1.2
58     =item * strict checking of JSON correctness
59    
60 root 1.16 There is no guessing, no generating of illegal JSON texts by default,
61 root 1.10 and only JSON is accepted as input by default (the latter is a security
62     feature).
63 root 1.2
64     =item * fast
65    
66 root 1.10 Compared to other JSON modules, this module compares favourably in terms
67     of speed, too.
68 root 1.2
69     =item * simple to use
70    
71     This module has both a simple functional interface as well as an OO
72     interface.
73    
74     =item * reasonably versatile output formats
75    
76 root 1.10 You can choose between the most compact guaranteed single-line format
77 root 1.21 possible (nice for simple line-based protocols), a pure-ascii format
78     (for when your transport is not 8-bit clean, still supports the whole
79     unicode range), or a pretty-printed format (for when you want to read that
80     stuff). Or you can combine those features in whatever way you like.
81 root 1.2
82     =back
83    
84 root 1.1 =cut
85    
86     package JSON::XS;
87    
88 root 1.20 use strict;
89    
90 root 1.1 BEGIN {
91 root 1.21 our $VERSION = '0.8';
92 root 1.20 our @ISA = qw(Exporter);
93 root 1.1
94 root 1.21 our @EXPORT = qw(to_json from_json objToJson jsonToObj);
95 root 1.1 require Exporter;
96    
97     require XSLoader;
98     XSLoader::load JSON::XS::, $VERSION;
99     }
100    
101 root 1.2 =head1 FUNCTIONAL INTERFACE
102    
103     The following convenience functions are provided by this module. They are
104     exported by default:
105    
106     =over 4
107    
108 root 1.16 =item $json_text = to_json $perl_scalar
109 root 1.2
110     Converts the given Perl data structure (a simple scalar or a reference to
111     a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
112     octets only). Croaks on error.
113    
114 root 1.16 This function call is functionally identical to:
115 root 1.2
116 root 1.16 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
117    
118     except that it is faster.
119    
120     =item $perl_scalar = from_json $json_text
121 root 1.2
122     The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
123 root 1.16 parse that as a UTF-8 encoded JSON text, returning the resulting simple
124 root 1.2 scalar or reference. Croaks on error.
125    
126 root 1.16 This function call is functionally identical to:
127    
128     $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
129    
130     except that it is faster.
131 root 1.2
132     =back
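
As a rough sketch of the round trip described above, the following uses the OO calls that the text names as functionally identical to C<to_json> and C<from_json>. The variable names and the sample data are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use JSON::XS;

# OO equivalent of to_json/from_json, as stated in the text above.
my $coder = JSON::XS->new->utf8;

# Sample data (hypothetical); "\x{e9}" is a single non-ASCII character.
my $data = { name => "caf\x{e9}", list => [1, 2, 3] };

my $json_octets = $coder->encode($data);        # UTF-8 encoded octets
my $copy        = $coder->decode($json_octets); # back to a Perl structure

print "round trip ok\n" if $copy->{name} eq $data->{name};
```

The encoded text contains octets only, so the non-ASCII character occupies two bytes in C<$json_octets> but comes back as one character after decoding.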
133    
134     =head1 OBJECT-ORIENTED INTERFACE
135    
136     The object oriented interface lets you configure your own encoding or
137     decoding style, within the limits of supported formats.
138    
139     =over 4
140    
141     =item $json = new JSON::XS
142    
143     Creates a new JSON::XS object that can be used to de/encode JSON
144     strings. All boolean flags described below are by default I<disabled>.
145 root 1.1
146 root 1.2 The mutators for flags all return the JSON object again and thus calls can
147     be chained:
148    
149 root 1.16 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
150 root 1.3 => {"a": [1, 2]}
151 root 1.2
152 root 1.7 =item $json = $json->ascii ([$enable])
153 root 1.2
154 root 1.16 If C<$enable> is true (or missing), then the C<encode> method will not
155     generate characters outside the code range C<0..127> (which is ASCII). Any
156     unicode characters outside that range will be escaped using either a
157     single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
158     as per RFC4627.
159 root 1.2
160     If C<$enable> is false, then the C<encode> method will not escape Unicode
161 root 1.16 characters unless required by the JSON syntax. This results in a faster
162     and more compact format.
163 root 1.2
164 root 1.16 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
165     => ["\ud801\udc01"]
166 root 1.3
167 root 1.7 =item $json = $json->utf8 ([$enable])
168 root 1.2
169 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will encode
170 root 1.16 the JSON result into UTF-8, as required by many protocols, while the
171 root 1.7 C<decode> method expects to be handed a UTF-8-encoded string. Please
172     note that UTF-8-encoded strings do not contain any characters outside the
173 root 1.16 range C<0..255>, they are thus useful for bytewise/binary I/O. In future
174     versions, enabling this option might enable autodetection of the UTF-16
175     and UTF-32 encoding families, as described in RFC4627.
176 root 1.2
177     If C<$enable> is false, then the C<encode> method will return the JSON
178     string as a (non-encoded) unicode string, while C<decode> thus expects a
179     unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
180     to be done yourself, e.g. using the Encode module.
181    
182 root 1.16 Example, output UTF-16BE-encoded JSON:
183    
184     use Encode;
185     $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
186    
187     Example, decode UTF-32LE-encoded JSON:
188    
189     use Encode;
190     $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
191 root 1.12
192 root 1.7 =item $json = $json->pretty ([$enable])
193 root 1.2
194     This enables (or disables) all of the C<indent>, C<space_before> and
195 root 1.3 C<space_after> (and in the future possibly more) flags in one call to
196 root 1.2 generate the most readable (or most compact) form possible.
197    
198 root 1.12 Example, pretty-print some simple structure:
199    
200 root 1.3 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
201     =>
202     {
203     "a" : [
204     1,
205     2
206     ]
207     }
208    
209 root 1.7 =item $json = $json->indent ([$enable])
210 root 1.2
211 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will use a multiline
212 root 1.2 format as output, putting every array member or object/hash key-value pair
213     into its own line, indenting them properly.
214    
215     If C<$enable> is false, no newlines or indenting will be produced, and the
216 root 1.16 resulting JSON text is guaranteed not to contain any newlines.
217 root 1.2
218 root 1.16 This setting has no effect when decoding JSON texts.
219 root 1.2
220 root 1.7 =item $json = $json->space_before ([$enable])
221 root 1.2
222 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
223 root 1.2 optional space before the C<:> separating keys from values in JSON objects.
224    
225     If C<$enable> is false, then the C<encode> method will not add any extra
226     space at those places.
227    
228 root 1.16 This setting has no effect when decoding JSON texts. You will also
229     most likely combine this setting with C<space_after>.
230 root 1.2
231 root 1.12 Example, space_before enabled, space_after and indent disabled:
232    
233     {"key" :"value"}
234    
235 root 1.7 =item $json = $json->space_after ([$enable])
236 root 1.2
237 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will add an extra
238 root 1.2 optional space after the C<:> separating keys from values in JSON objects
239     and extra whitespace after the C<,> separating key-value pairs and array
240     members.
241    
242     If C<$enable> is false, then the C<encode> method will not add any extra
243     space at those places.
244    
245 root 1.16 This setting has no effect when decoding JSON texts.
246 root 1.2
247 root 1.12 Example, space_before and indent disabled, space_after enabled:
248    
249     {"key": "value"}
250    
251 root 1.7 =item $json = $json->canonical ([$enable])
252 root 1.2
253 root 1.7 If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
254 root 1.2 by sorting their keys. This adds a comparatively high overhead.
255    
256     If C<$enable> is false, then the C<encode> method will output key-value
257     pairs in the order Perl stores them (which will likely change between runs
258     of the same script).
259    
260     This option is useful if you want the same data structure to be encoded as
261 root 1.16 the same JSON text (given the same overall settings). If it is disabled,
262 root 1.2 the same hash might be encoded differently even if it contains the same data,
263     as key-value pairs have no inherent ordering in Perl.
264    
265 root 1.16 This setting has no effect when decoding JSON texts.
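
A minimal sketch of the effect of C<canonical>, assuming two hashes with the same contents built in a different order (the data and variable names are made up for illustration):

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->canonical;

# Keys are sorted on output, so both hashes yield the same JSON text.
my $first  = $coder->encode({ b => 2, a => 1, c => 3 });
my $second = $coder->encode({ c => 3, a => 1, b => 2 });

print "$first\n";    # {"a":1,"b":2,"c":3}
print "identical\n" if $first eq $second;
```

This is mainly useful when JSON texts are compared or hashed as strings.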
266 root 1.2
267 root 1.7 =item $json = $json->allow_nonref ([$enable])
268 root 1.3
269 root 1.7 If C<$enable> is true (or missing), then the C<encode> method can convert a
270 root 1.3 non-reference into its corresponding string, number or null JSON value,
271     which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
272     values instead of croaking.
273    
274     If C<$enable> is false, then the C<encode> method will croak if it isn't
275 root 1.16 passed an arrayref or hashref, as JSON texts must either be an object
276 root 1.3 or array. Likewise, C<decode> will croak if given something that is not a
277     JSON object or array.
278    
279 root 1.12 Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
280     resulting in an invalid JSON text:
281    
282     JSON::XS->new->allow_nonref->encode ("Hello, World!")
283     => "Hello, World!"
284    
285 root 1.7 =item $json = $json->shrink ([$enable])
286    
287     Perl usually over-allocates memory a bit when allocating space for
288     strings. This flag optionally resizes strings generated by either
289     C<encode> or C<decode> to their minimum size possible. This can save
290 root 1.16 memory when your JSON texts are either very very long or you have many
291 root 1.8 short strings. It will also try to downgrade any strings to octet-form
292     if possible: perl stores strings internally either in an encoding called
293     UTF-X or in octet-form. The latter cannot store everything but uses less
294     space in general.
295 root 1.7
296     If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
297     while all strings generated by C<decode> will also be shrunk-to-fit.
298    
299     If C<$enable> is false, then the normal perl allocation algorithms are used.
300     If you work with your data, then this is likely to be faster.
301    
302     In the future, this setting might control other things, such as converting
303     strings that look like integers or floats into integers or floats
304     internally (there is no difference on the Perl level), saving space.
305    
306 root 1.16 =item $json_text = $json->encode ($perl_scalar)
307 root 1.2
308     Converts the given Perl data structure (a simple scalar or a reference
309     to a hash or array) to its JSON representation. Simple scalars will be
310     converted into JSON string or number sequences, while references to arrays
311     become JSON arrays and references to hashes become JSON objects. Undefined
312     Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
313     nor C<false> values will be generated.
314 root 1.1
315 root 1.16 =item $perl_scalar = $json->decode ($json_text)
316 root 1.1
317 root 1.16 The opposite of C<encode>: expects a JSON text and tries to parse it,
318 root 1.2 returning the resulting simple scalar or reference. Croaks on error.
319 root 1.1
320 root 1.2 JSON numbers and strings become simple Perl scalars. JSON arrays become
321     Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
322     C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
323 root 1.1
324     =back
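
Since C<decode> croaks on error, callers that receive untrusted input will usually want to trap the exception. The following is one possible sketch using a plain C<eval> block; the malformed input string and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# The bare word "value" makes this input invalid JSON (hypothetical sample).
my $bad_input = '{"key": value}';

# eval traps the croak; $@ then holds the error message.
my $result = eval { $coder->decode($bad_input) };
my $error  = $@;

if (defined $result) {
    print "decoded unexpectedly\n";
} else {
    print "decode failed: $error";
}
```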
325    
326 root 1.10 =head1 MAPPING
327    
328     This section describes how JSON::XS maps Perl values to JSON values and
329     vice versa. These mappings are designed to "do the right thing" in most
330     circumstances automatically, preserving round-tripping characteristics
331     (what you put in comes out as something equivalent).
332    
333     For the more enlightened: note that in the following descriptions,
334     lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
335     refers to the abstract Perl language itself.
336    
337     =head2 JSON -> PERL
338    
339     =over 4
340    
341     =item object
342    
343     A JSON object becomes a reference to a hash in Perl. No ordering of object
344 root 1.14 keys is preserved (JSON does not preserve object key ordering itself).
345 root 1.10
346     =item array
347    
348     A JSON array becomes a reference to an array in Perl.
349    
350     =item string
351    
352     A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
353     are represented by the same codepoints in the Perl string, so no manual
354     decoding is necessary.
355    
356     =item number
357    
358     A JSON number becomes either an integer or numeric (floating point)
359     scalar in perl, depending on its range and any fractional parts. On the
360     Perl level, there is no difference between those as Perl handles all the
361     conversion details, but an integer may take slightly less memory and might
362     represent more values exactly than (floating point) numbers.
363    
364     =item true, false
365    
366     These JSON atoms become C<1> and C<0>, respectively. Information is lost in
367     this process. Future versions might represent those values differently,
368     but they will be guaranteed to act like these integers would normally in
369     Perl.
370    
371     =item null
372    
373     A JSON null atom becomes C<undef> in Perl.
374    
375     =back
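
The JSON to Perl mapping above can be checked with a short sketch such as the following. The sample JSON text and key names are made up for illustration, and the boolean checks only rely on the values acting like C<1> and C<0>, not on how they are represented internally.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# One member per JSON type (hypothetical sample text).
my $value = $coder->decode(
    '{"obj":{"k":"v"},"arr":[1,2],"str":"hi","num":1.5,"t":true,"f":false,"nil":null}'
);

print ref($value->{obj}), "\n";   # HASH
print ref($value->{arr}), "\n";   # ARRAY
print "string ok\n"         if $value->{str} eq "hi";
print "number ok\n"         if $value->{num} == 1.5;
print "true acts like 1\n"  if $value->{t} == 1;
print "false acts like 0\n" if $value->{f} == 0;
print "null is undef\n"     unless defined $value->{nil};
```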
376    
377     =head2 PERL -> JSON
378    
379     The mapping from Perl to JSON is slightly more difficult, as Perl is a
380     truly typeless language, so we can only guess which JSON type is meant by
381     a Perl value.
382    
383     =over 4
384    
385     =item hash references
386    
387     Perl hash references become JSON objects. As there is no inherent ordering
388     in hash keys, they will usually be encoded in a pseudo-random order that
389     can change between runs of the same program but stays generally the same
390 root 1.14 within a single run of a program. JSON::XS can optionally sort the hash
391 root 1.10 keys (determined by the I<canonical> flag), so the same datastructure
392     will serialise to the same JSON text (given same settings and version of
393     JSON::XS), but this incurs a runtime overhead.
394    
395     =item array references
396    
397     Perl array references become JSON arrays.
398    
399     =item blessed objects
400    
401     Blessed objects are not allowed. JSON::XS currently tries to encode their
402     underlying representation (hash- or arrayref), but this behaviour might
403     change in future versions.
404    
405     =item simple scalars
406    
407     Simple Perl scalars (any scalar that is not a reference) are the most
408     difficult objects to encode: JSON::XS will encode undefined scalars as
409     JSON null value, scalars that have last been used in a string context
410     before encoding as JSON strings and anything else as number value:
411    
412     # dump as number
413     to_json [2] # yields [2]
414     to_json [-3.0e17] # yields [-3e+17]
415     my $value = 5; to_json [$value] # yields [5]
416    
417     # used as string, so dump as string
418     print $value;
419     to_json [$value] # yields ["5"]
420    
421     # undef becomes null
422     to_json [undef] # yields [null]
423    
424     You can force the type to be a string by stringifying it:
425    
426     my $x = 3.1; # some variable containing a number
427     "$x"; # stringified
428     $x .= ""; # another, more awkward way to stringify
429     print $x; # perl does it for you, too, quite often
430    
431     You can force the type to be a number by numifying it:
432    
433     my $x = "3"; # some variable containing a string
434     $x += 0; # numify it, ensuring it will be dumped as a number
435     $x *= 1; # same thing, the choice is yours.
436    
437     You cannot currently output JSON booleans or force the type in other,
438     less obscure, ways. Tell me if you need this capability.
439    
440 root 1.11 =item circular data structures
441    
442     Those will be encoded until memory or stackspace runs out.
443    
444 root 1.10 =back
445    
446 root 1.3 =head1 COMPARISON
447    
448     As already mentioned, this module was created because none of the existing
449     JSON modules could be made to work correctly. First I will describe the
450     problems (or pleasures) I encountered with various existing JSON modules,
451 root 1.4 followed by some benchmark values. JSON::XS was designed not to suffer
452     from any of these problems or limitations.
453 root 1.3
454     =over 4
455    
456 root 1.5 =item JSON 1.07
457 root 1.3
458     Slow (but very portable, as it is written in pure Perl).
459    
460     Undocumented/buggy Unicode handling (how JSON handles unicode values is
461     undocumented. One can get far by feeding it unicode strings and doing
462     en-/decoding oneself, but unicode escapes are not working properly).
463    
464     No roundtripping (strings get clobbered if they look like numbers, e.g.
465     the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
466     decode into the number 2).
467    
468 root 1.5 =item JSON::PC 0.01
469 root 1.3
470     Very fast.
471    
472     Undocumented/buggy Unicode handling.
473    
474     No roundtripping.
475    
476 root 1.4 Has problems handling many Perl values (e.g. regex results and other magic
477     values will make it croak).
478 root 1.3
479     Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
480 root 1.16 which is not a valid JSON text).
481 root 1.3
482     Unmaintained (maintainer unresponsive for many months, bugs are not
483     getting fixed).
484    
485 root 1.5 =item JSON::Syck 0.21
486 root 1.3
487     Very buggy (often crashes).
488    
489 root 1.4 Very inflexible (no human-readable format supported, format pretty much
490     undocumented. I need at least a format for easy reading by humans and a
491     single-line compact format for use in a protocol, and preferably a way to
492 root 1.16 generate ASCII-only JSON texts).
493 root 1.3
494     Completely broken (and confusingly documented) Unicode handling (unicode
495     escapes are not working properly, you need to set ImplicitUnicode to
496     I<different> values on en- and decoding to get symmetric behaviour).
497    
498     No roundtripping (simple cases work, but this depends on whether the scalar
499     value was used in a numeric context or not).
500    
501     Dumping hashes may skip hash values depending on iterator state.
502    
503     Unmaintained (maintainer unresponsive for many months, bugs are not
504     getting fixed).
505    
506     Does not check input for validity (i.e. will accept non-JSON input and
507     return "something" instead of raising an exception. This is a security
508     issue: imagine two banks transferring money between each other using
509     JSON. One bank might parse a given non-JSON request and deduct money,
510     while the other might reject the transaction with a syntax error. While a
511     good protocol will at least recover, that is extra unnecessary work and
512     the transaction will still not succeed).
513    
514 root 1.5 =item JSON::DWIW 0.04
515 root 1.3
516     Very fast. Very natural. Very nice.
517    
518     Undocumented unicode handling (but the best of the pack. Unicode escapes
519     still don't get parsed properly).
520    
521     Very inflexible.
522    
523     No roundtripping.
524    
525 root 1.16 Does not generate valid JSON texts (key strings are often unquoted, empty keys
526 root 1.4 result in nothing being output).
527    
528 root 1.3 Does not check input for validity.
529    
530     =back
531    
532     =head2 SPEED
533    
534 root 1.4 It seems that JSON::XS is surprisingly fast, as shown in the following
535     tables. They have been generated with the help of the C<eg/bench> program
536     in the JSON::XS distribution, to make it easy to compare on your own
537     system.
538    
539 root 1.13 First comes a comparison between various modules using a very short JSON
540 root 1.18 string:
541    
542     {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
543    
544     It shows the number of encodes/decodes per second (JSON::XS uses the
545     functional interface, while JSON::XS/2 uses the OO interface with
546     pretty-printing and hashkey sorting enabled). Higher is better:
547 root 1.4
548     module | encode | decode |
549     -----------|------------|------------|
550 root 1.18 JSON | 11488.516 | 7823.035 |
551     JSON::DWIW | 94708.054 | 129094.260 |
552     JSON::PC | 63884.157 | 128528.212 |
553     JSON::Syck | 34898.677 | 42096.911 |
554     JSON::XS | 654027.064 | 396423.669 |
555     JSON::XS/2 | 371564.190 | 371725.613 |
556 root 1.4 -----------+------------+------------+
557    
558 root 1.18 That is, JSON::XS is more than six times faster than JSON::DWIW on
559     encoding, more than three times faster on decoding, and about thirty times
560     faster than JSON, even with pretty-printing and key sorting.
561 root 1.4
562 root 1.13 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
563 root 1.4 search API, http://nanoref.com/yahooapis/mgPdGg):
564    
565     module | encode | decode |
566     -----------|------------|------------|
567 root 1.18 JSON | 273.023 | 44.674 |
568     JSON::DWIW | 1089.383 | 1145.704 |
569     JSON::PC | 3097.419 | 2393.921 |
570     JSON::Syck | 514.060 | 843.053 |
571     JSON::XS | 6479.668 | 3636.364 |
572     JSON::XS/2 | 3774.221 | 3599.124 |
573 root 1.4 -----------+------------+------------+
574    
575 root 1.18 Again, JSON::XS leads by far.
576 root 1.4
577 root 1.18 On large strings containing lots of high unicode characters, some modules
578     (such as JSON::PC) seem to decode faster than JSON::XS, but the result
579     will be broken due to missing (or wrong) unicode handling. Others refuse
580     to decode or encode properly, so it was impossible to prepare a fair
581     comparison table for that case.
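
For a quick local measurement, something along the following lines may be enough. This is not the C<eg/bench> program from the distribution, only a rough sketch that uses the core C<Benchmark> module on the short test string quoted above; the iteration count is an arbitrary assumption.

```perl
use strict;
use warnings;
use JSON::XS;
use Benchmark qw(timeit timestr);

my $coder = JSON::XS->new->utf8;

# The short test string quoted in the SPEED section.
my $json = '{"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}';
my $data = $coder->decode($json);

# Iteration count is an arbitrary assumption; raise it for steadier timings.
my $count = 10_000;

my $encode_time = timeit($count, sub { $coder->encode($data) });
my $decode_time = timeit($count, sub { $coder->decode($json) });

print "encode: ", timestr($encode_time), "\n";
print "decode: ", timestr($decode_time), "\n";
```

Absolute numbers will differ from the tables above, as they depend on the machine, the perl build and the module version.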
582 root 1.13
583 root 1.11 =head1 RESOURCE LIMITS
584    
585     JSON::XS does not impose any limits on the size of JSON texts or Perl
586 root 1.12 values they represent - if your machine can handle it, JSON::XS will
587 root 1.11 encode or decode it. Future versions might optionally impose structure
588     depth and memory use resource limits.
589    
590 root 1.4 =head1 BUGS
591    
592     While the goal of this module is to be correct, that unfortunately does
593     not mean it's bug-free, only that I think its design is bug-free. It is
594     still very young and not well-tested. If you keep reporting bugs they will
595     be fixed swiftly, though.
596    
597 root 1.2 =cut
598    
599     1;
600    
601 root 1.1 =head1 AUTHOR
602    
603     Marc Lehmann <schmorp@schmorp.de>
604     http://home.schmorp.de/
605    
606     =cut
607