Comparing JSON-XS/XS.pm (file contents):
Revision 1.1 by root, Thu Mar 22 16:40:16 2007 UTC vs.
Revision 1.22 by root, Sun Mar 25 02:37:00 2007 UTC

4 4
5=head1 SYNOPSIS 5=head1 SYNOPSIS
6 6
7 use JSON::XS; 7 use JSON::XS;
8 8
9 # exported functions, they croak on error
10 # and expect/generate UTF-8
11
12 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
13 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
14
15 # objToJson and jsonToObj aliases to to_json and from_json
16 # are exported for compatibility to the JSON module,
17 # but should not be used in new code.
18
19 # OO-interface
20
21 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
22 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
23 $perl_scalar = $coder->decode ($unicode_json_text);
24
9=head1 DESCRIPTION 25=head1 DESCRIPTION
10 26
27This module converts Perl data structures to JSON and vice versa. Its
28primary goal is to be I<correct> and its secondary goal is to be
29I<fast>. To reach the latter goal it was written in C.
30
31As this is the n-th-something JSON module on CPAN, what was the reason
32to write yet another JSON module? While it seems there are many JSON
33modules, none of them correctly handle all corner cases, and in most cases
34their maintainers are unresponsive, gone missing, or not listening to bug
35reports for other reasons.
36
37See COMPARISON, below, for a comparison to some other JSON modules.
38
39See MAPPING, below, on how JSON::XS maps perl values to JSON values and
40vice versa.
41
42=head2 FEATURES
43
11=over 4 44=over 4
12 45
46=item * correct unicode handling
47
48This module knows how to handle Unicode, and even documents how and when
49it does so.
50
51=item * round-trip integrity
52
53When you serialise a perl data structure using only datatypes supported
54by JSON, the deserialised data structure is identical on the Perl level.
55(e.g. the string "2.0" doesn't suddenly become "2" just because it looks
56like a number).
57
58=item * strict checking of JSON correctness
59
60There is no guessing, no generating of illegal JSON texts by default,
61and only JSON is accepted as input by default (the latter is a security
62feature).
63
64=item * fast
65
66Compared to other JSON modules, this module compares favourably in terms
67of speed, too.
68
69=item * simple to use
70
71This module has both a simple functional interface as well as an OO
72interface.
73
74=item * reasonably versatile output formats
75
76You can choose between the most compact guaranteed single-line format
77possible (nice for simple line-based protocols), a pure-ascii format
78(for when your transport is not 8-bit clean, still supports the whole
79unicode range), or a pretty-printed format (for when you want to read that
80stuff). Or you can combine those features in whatever way you like.
81
82=back
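
The round-trip claim in the feature list can be checked with a short
script. This is a minimal sketch that uses only the C<new>, C<utf8>,
C<encode> and C<decode> methods documented below; the variable names are
illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# The string "2.0" is encoded as a JSON string, not as the number 2.
my $json_text = $coder->encode (["2.0"]);

# Decoding hands back a string that is still "2.0" on the Perl level.
my $roundtrip = $coder->decode ($json_text);

print "$json_text\n";        # prints ["2.0"]
print "$roundtrip->[0]\n";   # prints 2.0
```
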
83
13=cut 84=cut
14 85
15package JSON::XS; 86package JSON::XS;
16 87
88use strict;
89
17BEGIN { 90BEGIN {
18 $VERSION = '0.1'; 91 our $VERSION = '0.8';
19 @ISA = qw(Exporter); 92 our @ISA = qw(Exporter);
20 93
94 our @EXPORT = qw(to_json from_json objToJson jsonToObj);
21 require Exporter; 95 require Exporter;
22 96
23 require XSLoader; 97 require XSLoader;
24 XSLoader::load JSON::XS::, $VERSION; 98 XSLoader::load JSON::XS::, $VERSION;
25} 99}
26 100
27=item 101=head1 FUNCTIONAL INTERFACE
102
103The following convenience functions are provided by this module. They are
104exported by default:
105
106=over 4
107
108=item $json_text = to_json $perl_scalar
109
110Converts the given Perl data structure (a simple scalar or a reference to
111a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
112octets only). Croaks on error.
113
114This function call is functionally identical to:
115
116 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
117
118except being faster.
119
120=item $perl_scalar = from_json $json_text
121
122The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
123parse that as a UTF-8 encoded JSON text, returning the resulting simple
124scalar or reference. Croaks on error.
125
126This function call is functionally identical to:
127
128 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
129
130except being faster.
131
132=back
133
134=head1 OBJECT-ORIENTED INTERFACE
135
136The object oriented interface lets you configure your own encoding or
137decoding style, within the limits of supported formats.
138
139=over 4
140
141=item $json = new JSON::XS
142
143Creates a new JSON::XS object that can be used to de/encode JSON
144strings. All boolean flags described below are by default I<disabled>.
145
146The mutators for flags all return the JSON object again and thus calls can
147be chained:
148
149 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
150 => {"a": [1, 2]}
151
152=item $json = $json->ascii ([$enable])
153
154If C<$enable> is true (or missing), then the C<encode> method will not
155generate characters outside the code range C<0..127> (which is ASCII). Any
156unicode characters outside that range will be escaped using either a
157single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
158as per RFC4627.
159
160If C<$enable> is false, then the C<encode> method will not escape Unicode
161characters unless required by the JSON syntax. This results in a faster
162and more compact format.
163
164 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
165 => ["\ud801\udc01"]
166
167=item $json = $json->utf8 ([$enable])
168
169If C<$enable> is true (or missing), then the C<encode> method will encode
170the JSON result into UTF-8, as required by many protocols, while the
171C<decode> method expects to be handed a UTF-8-encoded string. Please
172note that UTF-8-encoded strings do not contain any characters outside the
173range C<0..255>, they are thus useful for bytewise/binary I/O. In future
174versions, enabling this option might enable autodetection of the UTF-16
175and UTF-32 encoding families, as described in RFC4627.
176
177If C<$enable> is false, then the C<encode> method will return the JSON
178string as a (non-encoded) unicode string, while C<decode> thus expects a
179unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
180to be done yourself, e.g. using the Encode module.
181
182Example, output UTF-16BE-encoded JSON:
183
184 use Encode;
185 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
186
187Example, decode UTF-32LE-encoded JSON:
188
189 use Encode;
190 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
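
The difference between the two C<utf8> settings shows up in the length of
the result. The following is a minimal sketch, assuming a one-element array
holding the character U+00FC; the variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::XS;

my $data = ["\x{fc}"];   # one-element array holding the character U+00FC

my $octets = JSON::XS->new->utf8->encode ($data);   # UTF-8 encoded octets
my $chars  = JSON::XS->new->encode ($data);         # unencoded unicode string

# U+00FC takes two octets in UTF-8 but counts as one character.
printf "utf8 enabled:  %d octets\n",     length $octets;   # 6
printf "utf8 disabled: %d characters\n", length $chars;    # 5
```

The octet form is what you would write to a socket or a file opened in
binary mode; the character form still needs an explicit encoding step.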
191
192=item $json = $json->pretty ([$enable])
193
194This enables (or disables) all of the C<indent>, C<space_before> and
195C<space_after> (and in the future possibly more) flags in one call to
196generate the most readable (or most compact) form possible.
197
198Example, pretty-print some simple structure:
199
200 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
201 =>
202 {
203 "a" : [
204 1,
205 2
206 ]
207 }
208
209=item $json = $json->indent ([$enable])
210
211If C<$enable> is true (or missing), then the C<encode> method will use a multiline
212format as output, putting every array member or object/hash key-value pair
213into its own line, indenting them properly.
214
215If C<$enable> is false, no newlines or indenting will be produced, and the
216resulting JSON text is guaranteed not to contain any C<newlines>.
217
218This setting has no effect when decoding JSON texts.
219
220=item $json = $json->space_before ([$enable])
221
222If C<$enable> is true (or missing), then the C<encode> method will add an extra
223optional space before the C<:> separating keys from values in JSON objects.
224
225If C<$enable> is false, then the C<encode> method will not add any extra
226space at those places.
227
228This setting has no effect when decoding JSON texts. You will also
229most likely combine this setting with C<space_after>.
230
231Example, space_before enabled, space_after and indent disabled:
232
233 {"key" :"value"}
234
235=item $json = $json->space_after ([$enable])
236
237If C<$enable> is true (or missing), then the C<encode> method will add an extra
238optional space after the C<:> separating keys from values in JSON objects
239and extra whitespace after the C<,> separating key-value pairs and array
240members.
241
242If C<$enable> is false, then the C<encode> method will not add any extra
243space at those places.
244
245This setting has no effect when decoding JSON texts.
246
247Example, space_before and indent disabled, space_after enabled:
248
249 {"key": "value"}
250
251=item $json = $json->canonical ([$enable])
252
253If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
254by sorting their keys. This adds a comparatively high overhead.
255
256If C<$enable> is false, then the C<encode> method will output key-value
257pairs in the order Perl stores them (which will likely change between runs
258of the same script).
259
260This option is useful if you want the same data structure to be encoded as
261the same JSON text (given the same overall settings). If it is disabled,
262the same hash might be encoded differently even if it contains the same data,
263as key-value pairs have no inherent ordering in Perl.
264
265This setting has no effect when decoding JSON texts.
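
Example, a minimal sketch of C<canonical> output (the hash contents and
variable names are illustrative assumptions): the keys come out sorted
regardless of the order in which Perl stores them.

```perl
use strict;
use warnings;
use JSON::XS;

my %h = (b => 1, c => 3, a => 2);

# With canonical enabled, object keys are emitted in sorted order.
my $text = JSON::XS->new->canonical->encode (\%h);

print "$text\n";   # prints {"a":2,"b":1,"c":3}
```
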
266
267=item $json = $json->allow_nonref ([$enable])
268
269If C<$enable> is true (or missing), then the C<encode> method can convert a
270non-reference into its corresponding string, number or null JSON value,
271which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
272values instead of croaking.
273
274If C<$enable> is false, then the C<encode> method will croak if it isn't
275passed an arrayref or hashref, as JSON texts must either be an object
276or array. Likewise, C<decode> will croak if given something that is not a
277JSON object or array.
278
279Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
280resulting in an invalid JSON text:
281
282 JSON::XS->new->allow_nonref->encode ("Hello, World!")
283 => "Hello, World!"
284
285=item $json = $json->shrink ([$enable])
286
287Perl usually over-allocates memory a bit when allocating space for
288strings. This flag optionally resizes strings generated by either
289C<encode> or C<decode> to their minimum size possible. This can save
290memory when your JSON texts are either very very long or you have many
291short strings. It will also try to downgrade any strings to octet-form
292if possible: perl stores strings internally either in an encoding called
293UTF-X or in octet-form. The latter cannot store everything but uses less
294space in general.
295
296If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
297while all strings generated by C<decode> will also be shrunk-to-fit.
298
299If C<$enable> is false, then the normal perl allocation algorithms are used.
300If you work with your data, then this is likely to be faster.
301
302In the future, this setting might control other things, such as converting
303strings that look like integers or floats into integers or floats
304internally (there is no difference on the Perl level), saving space.
305
306=item $json_text = $json->encode ($perl_scalar)
307
308Converts the given Perl data structure (a simple scalar or a reference
309to a hash or array) to its JSON representation. Simple scalars will be
310converted into JSON string or number sequences, while references to arrays
311become JSON arrays and references to hashes become JSON objects. Undefined
312Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
313nor C<false> values will be generated.
314
315=item $perl_scalar = $json->decode ($json_text)
316
317The opposite of C<encode>: expects a JSON text and tries to parse it,
318returning the resulting simple scalar or reference. Croaks on error.
319
320JSON numbers and strings become simple Perl scalars. JSON arrays become
321Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
322C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
323
324=back
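
Because C<decode> croaks on malformed input, callers that handle untrusted
data usually wrap it in C<eval>. The following is a minimal sketch of that
pattern; the input strings and variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8;

# Well-formed input decodes to a hashref; JSON null becomes undef.
my $ok = $coder->decode ('{"a":[1,2,null]}');

# Truncated input makes decode croak; eval traps the exception in $@.
my $bad   = eval { $coder->decode ('{"a":[1,2') };
my $error = $@;

print "decoded ", scalar @{ $ok->{a} }, " array elements\n";
print "decode failed: $error" unless defined $bad;
```
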
325
326=head1 MAPPING
327
328This section describes how JSON::XS maps Perl values to JSON values and
329vice versa. These mappings are designed to "do the right thing" in most
330circumstances automatically, preserving round-tripping characteristics
331(what you put in comes out as something equivalent).
332
333For the more enlightened: note that in the following descriptions,
334lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
335refers to the abstract Perl language itself.
336
337=head2 JSON -> PERL
338
339=over 4
340
341=item object
342
343A JSON object becomes a reference to a hash in Perl. No ordering of object
344keys is preserved (JSON does not preserve object key ordering itself).
345
346=item array
347
348A JSON array becomes a reference to an array in Perl.
349
350=item string
351
352A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
353are represented by the same codepoints in the Perl string, so no manual
354decoding is necessary.
355
356=item number
357
358A JSON number becomes either an integer or numeric (floating point)
359scalar in perl, depending on its range and any fractional parts. On the
360Perl level, there is no difference between those as Perl handles all the
361conversion details, but an integer may take slightly less memory and might
362represent more values exactly than (floating point) numbers.
363
364=item true, false
365
366These JSON atoms become C<1> and C<0>, respectively. Information is lost in
367this process. Future versions might represent those values differently,
368but they will be guaranteed to act like these integers would normally in
369Perl.
370
371=item null
372
373A JSON null atom becomes C<undef> in Perl.
374
375=back
376
377=head2 PERL -> JSON
378
379The mapping from Perl to JSON is slightly more difficult, as Perl is a
380truly typeless language, so we can only guess which JSON type is meant by
381a Perl value.
382
383=over 4
384
385=item hash references
386
387Perl hash references become JSON objects. As there is no inherent ordering
388in hash keys, they will usually be encoded in a pseudo-random order that
389can change between runs of the same program but stays generally the same
390within a single run of a program. JSON::XS can optionally sort the hash
391keys (determined by the I<canonical> flag), so the same datastructure
392will serialise to the same JSON text (given same settings and version of
393JSON::XS), but this incurs a runtime overhead.
394
395=item array references
396
397Perl array references become JSON arrays.
398
399=item blessed objects
400
401Blessed objects are not allowed. JSON::XS currently tries to encode their
402underlying representation (hash- or arrayref), but this behaviour might
403change in future versions.
404
405=item simple scalars
406
407Simple Perl scalars (any scalar that is not a reference) are the most
408difficult objects to encode: JSON::XS will encode undefined scalars as
409JSON null value, scalars that have last been used in a string context
410before encoding as JSON strings and anything else as number value:
411
412 # dump as number
413 to_json [2] # yields [2]
414 to_json [-3.0e17] # yields [-3e+17]
415 my $value = 5; to_json [$value] # yields [5]
416
417 # used as string, so dump as string
418 print $value;
419 to_json [$value] # yields ["5"]
420
421 # undef becomes null
422 to_json [undef] # yields [null]
423
424You can force the type to be a string by stringifying it:
425
426 my $x = 3.1; # some variable containing a number
427 "$x"; # stringified
428 $x .= ""; # another, more awkward way to stringify
429 print $x; # perl does it for you, too, quite often
430
431You can force the type to be a number by numifying it:
432
433 my $x = "3"; # some variable containing a string
434 $x += 0; # numify it, ensuring it will be dumped as a number
435 $x *= 1; # same thing, the choice is yours.
436
437You cannot currently output JSON booleans or force the type in other,
438less obscure, ways. Tell me if you need this capability.
439
440=item circular data structures
441
442Those will be encoded until memory or stackspace runs out.
443
444=back
445
446=head1 COMPARISON
447
448As already mentioned, this module was created because none of the existing
449JSON modules could be made to work correctly. First I will describe the
450problems (or pleasures) I encountered with various existing JSON modules,
451followed by some benchmark values. JSON::XS was designed not to suffer
452from any of these problems or limitations.
453
454=over 4
455
456=item JSON 1.07
457
458Slow (but very portable, as it is written in pure Perl).
459
460Undocumented/buggy Unicode handling (how JSON handles unicode values is
461undocumented. One can get far by feeding it unicode strings and doing
462en-/decoding oneself, but unicode escapes are not working properly).
463
464No roundtripping (strings get clobbered if they look like numbers, e.g.
465the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
466decode into the number 2).
467
468=item JSON::PC 0.01
469
470Very fast.
471
472Undocumented/buggy Unicode handling.
473
474No roundtripping.
475
476Has problems handling many Perl values (e.g. regex results and other magic
477values will make it croak).
478
479Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
480which is not a valid JSON text).
481
482Unmaintained (maintainer unresponsive for many months, bugs are not
483getting fixed).
484
485=item JSON::Syck 0.21
486
487Very buggy (often crashes).
488
489Very inflexible (no human-readable format supported, format pretty much
490undocumented. I need at least a format for easy reading by humans and a
491single-line compact format for use in a protocol, and preferably a way to
492generate ASCII-only JSON texts).
493
494Completely broken (and confusingly documented) Unicode handling (unicode
495escapes are not working properly, you need to set ImplicitUnicode to
496I<different> values on en- and decoding to get symmetric behaviour).
497
498No roundtripping (simple cases work, but this depends on whether the scalar
499value was used in a numeric context or not).
500
501Dumping hashes may skip hash values depending on iterator state.
502
503Unmaintained (maintainer unresponsive for many months, bugs are not
504getting fixed).
505
506Does not check input for validity (i.e. will accept non-JSON input and
507return "something" instead of raising an exception. This is a security
508issue: imagine two banks transferring money between each other using
509JSON. One bank might parse a given non-JSON request and deduct money,
510while the other might reject the transaction with a syntax error. While a
511good protocol will at least recover, that is extra unnecessary work and
512the transaction will still not succeed).
513
514=item JSON::DWIW 0.04
515
516Very fast. Very natural. Very nice.
517
518Undocumented unicode handling (but the best of the pack. Unicode escapes
519still don't get parsed properly).
520
521Very inflexible.
522
523No roundtripping.
524
525Does not generate valid JSON texts (key strings are often unquoted, empty keys
526result in nothing being output).
527
528Does not check input for validity.
529
530=back
531
532=head2 SPEED
533
534It seems that JSON::XS is surprisingly fast, as shown in the following
535tables. They have been generated with the help of the C<eg/bench> program
536in the JSON::XS distribution, to make it easy to compare on your own
537system.
538
539First comes a comparison between various modules using a very short JSON
540string:
541
542 {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
543
544It shows the number of encodes/decodes per second (JSON::XS uses the
545functional interface, while JSON::XS/2 uses the OO interface with
546pretty-printing and hashkey sorting enabled). Higher is better:
547
548 module | encode | decode |
549 -----------|------------|------------|
550 JSON | 11488.516 | 7823.035 |
551 JSON::DWIW | 94708.054 | 129094.260 |
552 JSON::PC | 63884.157 | 128528.212 |
553 JSON::Syck | 34898.677 | 42096.911 |
554 JSON::XS | 654027.064 | 396423.669 |
555 JSON::XS/2 | 371564.190 | 371725.613 |
556 -----------+------------+------------+
557
558That is, JSON::XS is more than six times faster than JSON::DWIW on
559encoding, more than three times faster on decoding, and about thirty times
560faster than JSON, even with pretty-printing and key sorting.
561
562Using a longer test string (roughly 18KB, generated from Yahoo! Locals
563search API (http://nanoref.com/yahooapis/mgPdGg)):
564
565 module | encode | decode |
566 -----------|------------|------------|
567 JSON | 273.023 | 44.674 |
568 JSON::DWIW | 1089.383 | 1145.704 |
569 JSON::PC | 3097.419 | 2393.921 |
570 JSON::Syck | 514.060 | 843.053 |
571 JSON::XS | 6479.668 | 3636.364 |
572 JSON::XS/2 | 3774.221 | 3599.124 |
573 -----------+------------+------------+
574
575Again, JSON::XS leads by far.
576
577On large strings containing lots of high unicode characters, some modules
578(such as JSON::PC) seem to decode faster than JSON::XS, but the result
579will be broken due to missing (or wrong) unicode handling. Others refuse
580to decode or encode properly, so it was impossible to prepare a fair
581comparison table for that case.
582
583=head1 RESOURCE LIMITS
584
585JSON::XS does not impose any limits on the size of JSON texts or Perl
586values they represent - if your machine can handle it, JSON::XS will
587encode or decode it. Future versions might optionally impose structure
588depth and memory use resource limits.
589
590=head1 BUGS
591
592While the goal of this module is to be correct, that unfortunately does
593not mean it's bug-free, only that I think its design is bug-free. It is
594still very young and not well-tested. If you keep reporting bugs they will
595be fixed swiftly, though.
28 596
29=cut 597=cut
30 598
31use JSON::DWIW;
32use Benchmark;
33
34use utf8;
35#my $json = '{"ü":1,"a":[1,{"3":4},2],"b":5,"üü":2}';
36my $json = '{"test":9555555555555555555,"hu" : -1e+5, "arr" : [ 1,2,3,4,5]}';
37
38my $js = JSON::XS->new;
39warn $js->indent (0);
40warn $js->canonical (0);
41warn $js->ascii (0);
42warn $js->space_after (0);
43use Data::Dumper;
44warn Dumper $js->decode ($json);
45warn Dumper $js->encode ($js->decode ($json));
46#my $x = {"üü" => 2, "ü" => 1, "a" => [1,{3,4},2], b => 5};
47
48#my $js2 = JSON::DWIW->new;
49#
50#timethese 200000, {
51# a => sub { $js->encode ($x) },
52# b => sub { $js2->to_json ($x) },
53#};
54
551; 5991;
56
57=back
58 600
59=head1 AUTHOR 601=head1 AUTHOR
60 602
61 Marc Lehmann <schmorp@schmorp.de> 603 Marc Lehmann <schmorp@schmorp.de>
62 http://home.schmorp.de/ 604 http://home.schmorp.de/
