Comparing JSON-XS/XS.pm (file contents):
Revision 1.1 by root, Thu Mar 22 16:40:16 2007 UTC vs.
Revision 1.21 by root, Sun Mar 25 02:32:40 2007 UTC

4 4
5=head1 SYNOPSIS 5=head1 SYNOPSIS
6 6
7 use JSON::XS; 7 use JSON::XS;
8 8
9 # exported functions, croak on error
10
11 $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
12 $perl_hash_or_arrayref = from_json $utf8_encoded_json_text;
13
14 # objToJson and jsonToObj are exported for JSON
15 # compatibility, but should not be used in new code.
16
17 # oo-interface
18
19 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
20 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
21 $perl_scalar = $coder->decode ($unicode_json_text);
22
9=head1 DESCRIPTION 23=head1 DESCRIPTION
10 24
25This module converts Perl data structures to JSON and vice versa. Its
26primary goal is to be I<correct> and its secondary goal is to be
27I<fast>. To reach the latter goal it was written in C.
28
29As this is the n-th-something JSON module on CPAN, what was the reason
30to write yet another JSON module? While it seems there are many JSON
31modules, none of them correctly handle all corner cases, and in most cases
32their maintainers are unresponsive, gone missing, or not listening to bug
33reports for other reasons.
34
35See COMPARISON, below, for a comparison to some other JSON modules.
36
37See MAPPING, below, on how JSON::XS maps perl values to JSON values and
38vice versa.
39
40=head2 FEATURES
41
11=over 4 42=over 4
12 43
44=item * correct unicode handling
45
46This module knows how to handle Unicode, and even documents how and when
47it does so.
48
49=item * round-trip integrity
50
51When you serialise a perl data structure using only datatypes supported
52by JSON, the deserialised data structure is identical on the Perl level.
53(e.g. the string "2.0" doesn't suddenly become "2" just because it looks
54like a number).
55
56=item * strict checking of JSON correctness
57
58There is no guessing, no generating of illegal JSON texts by default,
59and only JSON is accepted as input by default (the latter is a security
60feature).
61
62=item * fast
63
64Compared to other JSON modules, this module compares favourably in terms
65of speed, too.
66
67=item * simple to use
68
69This module has both a simple functional interface as well as an OO
70interface.
71
72=item * reasonably versatile output formats
73
74You can choose between the most compact guaranteed single-line format
75possible (nice for simple line-based protocols), a pure-ascii format
76(for when your transport is not 8-bit clean, still supports the whole
77unicode range), or a pretty-printed format (for when you want to read that
78stuff). Or you can combine those features in whatever way you like.
79
80=back
81
13=cut 82=cut
14 83
15package JSON::XS; 84package JSON::XS;
16 85
86use strict;
87
17BEGIN { 88BEGIN {
18 $VERSION = '0.1'; 89 our $VERSION = '0.8';
19 @ISA = qw(Exporter); 90 our @ISA = qw(Exporter);
20 91
92 our @EXPORT = qw(to_json from_json objToJson jsonToObj);
21 require Exporter; 93 require Exporter;
22 94
23 require XSLoader; 95 require XSLoader;
24 XSLoader::load JSON::XS::, $VERSION; 96 XSLoader::load JSON::XS::, $VERSION;
25} 97}
26 98
27=item 99=head1 FUNCTIONAL INTERFACE
100
101The following convenience functions are provided by this module. They are
102exported by default:
103
104=over 4
105
106=item $json_text = to_json $perl_scalar
107
108Converts the given Perl data structure (a simple scalar or a reference to
109a hash or array) to a UTF-8 encoded, binary string (that is, the string contains
110octets only). Croaks on error.
111
112This function call is functionally identical to:
113
114 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
115
116except being faster.
117
118=item $perl_scalar = from_json $json_text
119
120The opposite of C<to_json>: expects a UTF-8 (binary) string and tries to
121parse that as a UTF-8 encoded JSON text, returning the resulting simple
122scalar or reference. Croaks on error.
123
124This function call is functionally identical to:
125
126 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
127
128except being faster.
129
130=back
131
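Because both functions croak on error, callers that handle untrusted input usually want to trap the exception. The following is a minimal sketch, not part of the module: it uses the OO form that the text above describes as functionally identical to C<from_json>, and it assumes a fallback to the API-compatible JSON::PP module (not mentioned in this document) when JSON::XS is not installed. The variable names are illustrative.

```perl
use strict;
use warnings;

# Assumption: prefer JSON::XS, fall back to the API-compatible JSON::PP.
my $class = eval { require JSON::XS; 1 }
    ? 'JSON::XS'
    : do { require JSON::PP; 'JSON::PP' };

# Same configuration that from_json is documented to use.
my $coder = $class->new->utf8;

# A JSON text that was truncated on purpose.
my $broken = '{"method": "handleMessage", "params": [';

# decode croaks on invalid input; eval turns that into a catchable error.
my ($result, $error);
my $ok = eval { $result = $coder->decode ($broken); 1 };
$error = $@ unless $ok;

print $ok ? "parsed\n" : "rejected: $error";
```

Setting a flag inside the C<eval> block, rather than testing the decoded value, keeps the check correct even when the decoded value itself is false.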
132=head1 OBJECT-ORIENTED INTERFACE
133
134The object oriented interface lets you configure your own encoding or
135decoding style, within the limits of supported formats.
136
137=over 4
138
139=item $json = new JSON::XS
140
141Creates a new JSON::XS object that can be used to de/encode JSON
142strings. All boolean flags described below are by default I<disabled>.
143
144The mutators for flags all return the JSON object again and thus calls can
145be chained:
146
147 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
148 => {"a": [1, 2]}
149
150=item $json = $json->ascii ([$enable])
151
152If C<$enable> is true (or missing), then the C<encode> method will not
153generate characters outside the code range C<0..127> (which is ASCII). Any
154unicode characters outside that range will be escaped using either a
155single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
156as per RFC4627.
157
158If C<$enable> is false, then the C<encode> method will not escape Unicode
159characters unless required by the JSON syntax. This results in a faster
160and more compact format.
161
162 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
163 => ["\ud801\udc01"]
164
165=item $json = $json->utf8 ([$enable])
166
167If C<$enable> is true (or missing), then the C<encode> method will encode
168the JSON result into UTF-8, as required by many protocols, while the
169C<decode> method expects to be handed a UTF-8-encoded string. Please
170note that UTF-8-encoded strings do not contain any characters outside the
171range C<0..255>, they are thus useful for bytewise/binary I/O. In future
172versions, enabling this option might enable autodetection of the UTF-16
173and UTF-32 encoding families, as described in RFC4627.
174
175If C<$enable> is false, then the C<encode> method will return the JSON
176string as a (non-encoded) unicode string, while C<decode> thus expects a
177unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
178to be done yourself, e.g. using the Encode module.
179
180Example, output UTF-16BE-encoded JSON:
181
182 use Encode;
183 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
184
185Example, decode UTF-32LE-encoded JSON:
186
187 use Encode;
188 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
189
190=item $json = $json->pretty ([$enable])
191
192This enables (or disables) all of the C<indent>, C<space_before> and
193C<space_after> (and in the future possibly more) flags in one call to
194generate the most readable (or most compact) form possible.
195
196Example, pretty-print some simple structure:
197
198 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
199 =>
200 {
201 "a" : [
202 1,
203 2
204 ]
205 }
206
207=item $json = $json->indent ([$enable])
208
209If C<$enable> is true (or missing), then the C<encode> method will use a multiline
210format as output, putting every array member or object/hash key-value pair
211into its own line, indenting them properly.
212
213If C<$enable> is false, no newlines or indenting will be produced, and the
214resulting JSON text is guaranteed not to contain any C<newlines>.
215
216This setting has no effect when decoding JSON texts.
217
218=item $json = $json->space_before ([$enable])
219
220If C<$enable> is true (or missing), then the C<encode> method will add an extra
221optional space before the C<:> separating keys from values in JSON objects.
222
223If C<$enable> is false, then the C<encode> method will not add any extra
224space at those places.
225
226This setting has no effect when decoding JSON texts. You will also
227most likely combine this setting with C<space_after>.
228
229Example, space_before enabled, space_after and indent disabled:
230
231 {"key" :"value"}
232
233=item $json = $json->space_after ([$enable])
234
235If C<$enable> is true (or missing), then the C<encode> method will add an extra
236optional space after the C<:> separating keys from values in JSON objects
237and extra whitespace after the C<,> separating key-value pairs and array
238members.
239
240If C<$enable> is false, then the C<encode> method will not add any extra
241space at those places.
242
243This setting has no effect when decoding JSON texts.
244
245Example, space_before and indent disabled, space_after enabled:
246
247 {"key": "value"}
248
249=item $json = $json->canonical ([$enable])
250
251If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
252by sorting their keys. This adds a comparatively high overhead.
253
254If C<$enable> is false, then the C<encode> method will output key-value
255pairs in the order Perl stores them (which will likely change between runs
256of the same script).
257
258This option is useful if you want the same data structure to be encoded as
259the same JSON text (given the same overall settings). If it is disabled,
260the same hash might be encoded differently even if it contains the same data,
261as key-value pairs have no inherent ordering in Perl.
262
263This setting has no effect when decoding JSON texts.
264
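As an illustration of the C<canonical> flag, here is a small sketch that encodes a hash with sorted keys. It assumes a fallback to the API-compatible JSON::PP module (not mentioned in this document) when JSON::XS is not installed; the hash contents are made up for the example.

```perl
use strict;
use warnings;

# Assumption: prefer JSON::XS, fall back to the API-compatible JSON::PP.
my $class = eval { require JSON::XS; 1 }
    ? 'JSON::XS'
    : do { require JSON::PP; 'JSON::PP' };

my %data = (b => 1, a => 2, c => [3, 4]);

# With canonical enabled the keys are sorted, so the same data
# always serialises to the same JSON text.
my $canonical = $class->new->canonical->encode (\%data);

print "$canonical\n";   # {"a":2,"b":1,"c":[3,4]}
```

Without C<canonical>, the key order of the output may differ from run to run, which matters when JSON texts are compared, hashed or cached.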
265=item $json = $json->allow_nonref ([$enable])
266
267If C<$enable> is true (or missing), then the C<encode> method can convert a
268non-reference into its corresponding string, number or null JSON value,
269which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
270values instead of croaking.
271
272If C<$enable> is false, then the C<encode> method will croak if it isn't
273passed an arrayref or hashref, as JSON texts must either be an object
274or array. Likewise, C<decode> will croak if given something that is not a
275JSON object or array.
276
277Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
278resulting in an invalid JSON text:
279
280 JSON::XS->new->allow_nonref->encode ("Hello, World!")
281 => "Hello, World!"
282
283=item $json = $json->shrink ([$enable])
284
285Perl usually over-allocates memory a bit when allocating space for
286strings. This flag optionally resizes strings generated by either
287C<encode> or C<decode> to their minimum size possible. This can save
288memory when your JSON texts are either very very long or you have many
289short strings. It will also try to downgrade any strings to octet-form
290if possible: perl stores strings internally either in an encoding called
291UTF-X or in octet-form. The latter cannot store everything but uses less
292space in general.
293
294If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
295while all strings generated by C<decode> will also be shrunk-to-fit.
296
297If C<$enable> is false, then the normal perl allocation algorithms are used.
298If you work with your data, then this is likely to be faster.
299
300In the future, this setting might control other things, such as converting
301strings that look like integers or floats into integers or floats
302internally (there is no difference on the Perl level), saving space.
303
304=item $json_text = $json->encode ($perl_scalar)
305
306Converts the given Perl data structure (a simple scalar or a reference
307to a hash or array) to its JSON representation. Simple scalars will be
308converted into JSON string or number sequences, while references to arrays
309become JSON arrays and references to hashes become JSON objects. Undefined
310Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
311nor C<false> values will be generated.
312
313=item $perl_scalar = $json->decode ($json_text)
314
315The opposite of C<encode>: expects a JSON text and tries to parse it,
316returning the resulting simple scalar or reference. Croaks on error.
317
318JSON numbers and strings become simple Perl scalars. JSON arrays become
319Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
320C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
321
322=back
323
324=head1 MAPPING
325
326This section describes how JSON::XS maps Perl values to JSON values and
327vice versa. These mappings are designed to "do the right thing" in most
328circumstances automatically, preserving round-tripping characteristics
329(what you put in comes out as something equivalent).
330
331For the more enlightened: note that in the following descriptions,
332lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
333refers to the abstract Perl language itself.
334
335=head2 JSON -> PERL
336
337=over 4
338
339=item object
340
341A JSON object becomes a reference to a hash in Perl. No ordering of object
342keys is preserved (JSON does not preserve object key ordering itself).
343
344=item array
345
346A JSON array becomes a reference to an array in Perl.
347
348=item string
349
350A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
351are represented by the same codepoints in the Perl string, so no manual
352decoding is necessary.
353
354=item number
355
356A JSON number becomes either an integer or numeric (floating point)
357scalar in perl, depending on its range and any fractional parts. On the
358Perl level, there is no difference between those as Perl handles all the
359conversion details, but an integer may take slightly less memory and might
360represent more values exactly than (floating point) numbers.
361
362=item true, false
363
364These JSON atoms become C<1> and C<0>, respectively. Information is lost in
365this process. Future versions might represent those values differently,
366but they will be guaranteed to act like these integers would normally in
367Perl.
368
369=item null
370
371A JSON null atom becomes C<undef> in Perl.
372
373=back
374
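The JSON to Perl mappings listed above can be checked with a short sketch such as the following. It assumes a fallback to the API-compatible JSON::PP module (not mentioned in this document) when JSON::XS is not installed, and the input text is invented for the example. Booleans are left out on purpose, since the text above notes that their representation might change.

```perl
use strict;
use warnings;

# Assumption: prefer JSON::XS, fall back to the API-compatible JSON::PP.
my $class = eval { require JSON::XS; 1 }
    ? 'JSON::XS'
    : do { require JSON::PP; 'JSON::PP' };

my $data = $class->new->decode ('{"list": [1, 2.5, "x"], "none": null}');

print ref ($data), "\n";                                   # object -> hash reference
print ref ($data->{list}), "\n";                           # array  -> array reference
print $data->{list}[2], "\n";                              # string -> plain scalar
print defined $data->{none} ? "defined" : "undef", "\n";   # null   -> undef
```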
375=head2 PERL -> JSON
376
377The mapping from Perl to JSON is slightly more difficult, as Perl is a
378truly typeless language, so we can only guess which JSON type is meant by
379a Perl value.
380
381=over 4
382
383=item hash references
384
385Perl hash references become JSON objects. As there is no inherent ordering
386in hash keys, they will usually be encoded in a pseudo-random order that
387can change between runs of the same program but stays generally the same
388within a single run of a program. JSON::XS can optionally sort the hash
389keys (determined by the I<canonical> flag), so the same datastructure
390will serialise to the same JSON text (given same settings and version of
391JSON::XS), but this incurs a runtime overhead.
392
393=item array references
394
395Perl array references become JSON arrays.
396
397=item blessed objects
398
399Blessed objects are not allowed. JSON::XS currently tries to encode their
400underlying representation (hash- or arrayref), but this behaviour might
401change in future versions.
402
403=item simple scalars
404
405Simple Perl scalars (any scalar that is not a reference) are the most
406difficult objects to encode: JSON::XS will encode undefined scalars as
407JSON null value, scalars that have last been used in a string context
408before encoding as JSON strings and anything else as number value:
409
410 # dump as number
411 to_json [2] # yields [2]
412 to_json [-3.0e17] # yields [-3e+17]
413 my $value = 5; to_json [$value] # yields [5]
414
415 # used as string, so dump as string
416 print $value;
417 to_json [$value] # yields ["5"]
418
419 # undef becomes null
420 to_json [undef] # yields [null]
421
422You can force the type to be a string by stringifying it:
423
424 my $x = 3.1; # some variable containing a number
425 "$x"; # stringified
426 $x .= ""; # another, more awkward way to stringify
427 print $x; # perl does it for you, too, quite often
428
429You can force the type to be a number by numifying it:
430
431 my $x = "3"; # some variable containing a string
432 $x += 0; # numify it, ensuring it will be dumped as a number
433 $x *= 1; # same thing, the choice is yours.
434
435You cannot currently output JSON booleans or force the type in other,
436less obscure, ways. Tell me if you need this capability.
437
438=item circular data structures
439
440Those will be encoded until memory or stackspace runs out.
441
442=back
443
444=head1 COMPARISON
445
446As already mentioned, this module was created because none of the existing
447JSON modules could be made to work correctly. First I will describe the
448problems (or pleasures) I encountered with various existing JSON modules,
449followed by some benchmark values. JSON::XS was designed not to suffer
450from any of these problems or limitations.
451
452=over 4
453
454=item JSON 1.07
455
456Slow (but very portable, as it is written in pure Perl).
457
458Undocumented/buggy Unicode handling (how JSON handles unicode values is
459undocumented. One can get far by feeding it unicode strings and doing
460en-/decoding oneself, but unicode escapes are not working properly).
461
462No roundtripping (strings get clobbered if they look like numbers, e.g.
463the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
464decode into the number 2.
465
466=item JSON::PC 0.01
467
468Very fast.
469
470Undocumented/buggy Unicode handling.
471
472No roundtripping.
473
474Has problems handling many Perl values (e.g. regex results and other magic
475values will make it croak).
476
477Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
478which is not a valid JSON text.
479
480Unmaintained (maintainer unresponsive for many months, bugs are not
481getting fixed).
482
483=item JSON::Syck 0.21
484
485Very buggy (often crashes).
486
487Very inflexible (no human-readable format supported, format pretty much
488undocumented. I need at least a format for easy reading by humans and a
489single-line compact format for use in a protocol, and preferably a way to
490generate ASCII-only JSON texts).
491
492Completely broken (and confusingly documented) Unicode handling (unicode
493escapes are not working properly, you need to set ImplicitUnicode to
494I<different> values on en- and decoding to get symmetric behaviour).
495
496No roundtripping (simple cases work, but this depends on whether the scalar
497value was used in a numeric context or not).
498
499Dumping hashes may skip hash values depending on iterator state.
500
501Unmaintained (maintainer unresponsive for many months, bugs are not
502getting fixed).
503
504Does not check input for validity (i.e. will accept non-JSON input and
505return "something" instead of raising an exception. This is a security
506issue: imagine two banks transferring money between each other using
507JSON. One bank might parse a given non-JSON request and deduct money,
508while the other might reject the transaction with a syntax error. While a
509good protocol will at least recover, that is extra unnecessary work and
510the transaction will still not succeed).
511
512=item JSON::DWIW 0.04
513
514Very fast. Very natural. Very nice.
515
516Undocumented unicode handling (but the best of the pack. Unicode escapes
517still don't get parsed properly).
518
519Very inflexible.
520
521No roundtripping.
522
523Does not generate valid JSON texts (key strings are often unquoted, empty keys
524result in nothing being output).
525
526Does not check input for validity.
527
528=back
529
530=head2 SPEED
531
532It seems that JSON::XS is surprisingly fast, as shown in the following
533tables. They have been generated with the help of the C<eg/bench> program
534in the JSON::XS distribution, to make it easy to compare on your own
535system.
536
537First comes a comparison between various modules using a very short JSON
538string:
539
540 {"method": "handleMessage", "params": ["user1", "we were just talking"], "id": null}
541
542It shows the number of encodes/decodes per second (JSON::XS uses the
543functional interface, while JSON::XS/2 uses the OO interface with
544pretty-printing and hashkey sorting enabled). Higher is better:
545
546 module | encode | decode |
547 -----------|------------|------------|
548 JSON | 11488.516 | 7823.035 |
549 JSON::DWIW | 94708.054 | 129094.260 |
550 JSON::PC | 63884.157 | 128528.212 |
551 JSON::Syck | 34898.677 | 42096.911 |
552 JSON::XS | 654027.064 | 396423.669 |
553 JSON::XS/2 | 371564.190 | 371725.613 |
554 -----------+------------+------------+
555
556That is, JSON::XS is more than six times faster than JSON::DWIW on
557encoding, more than three times faster on decoding, and about thirty times
558faster than JSON, even with pretty-printing and key sorting.
559
560Using a longer test string (roughly 18KB, generated from Yahoo! Locals
561search API (http://nanoref.com/yahooapis/mgPdGg):
562
563 module | encode | decode |
564 -----------|------------|------------|
565 JSON | 273.023 | 44.674 |
566 JSON::DWIW | 1089.383 | 1145.704 |
567 JSON::PC | 3097.419 | 2393.921 |
568 JSON::Syck | 514.060 | 843.053 |
569 JSON::XS | 6479.668 | 3636.364 |
570 JSON::XS/2 | 3774.221 | 3599.124 |
571 -----------+------------+------------+
572
573Again, JSON::XS leads by far.
574
575On large strings containing lots of high unicode characters, some modules
576(such as JSON::PC) seem to decode faster than JSON::XS, but the result
577will be broken due to missing (or wrong) unicode handling. Others refuse
578to decode or encode properly, so it was impossible to prepare a fair
579comparison table for that case.
580
581=head1 RESOURCE LIMITS
582
583JSON::XS does not impose any limits on the size of JSON texts or Perl
584values they represent - if your machine can handle it, JSON::XS will
585encode or decode it. Future versions might optionally impose structure
586depth and memory use resource limits.
587
588=head1 BUGS
589
590While the goal of this module is to be correct, that unfortunately does
591not mean it's bug-free, only that I think its design is bug-free. It is
592still very young and not well-tested. If you keep reporting bugs they will
593be fixed swiftly, though.
28 594
29=cut 595=cut
30 596
31use JSON::DWIW;
32use Benchmark;
33
34use utf8;
35#my $json = '{"ü":1,"a":[1,{"3":4},2],"b":5,"üü":2}';
36my $json = '{"test":9555555555555555555,"hu" : -1e+5, "arr" : [ 1,2,3,4,5]}';
37
38my $js = JSON::XS->new;
39warn $js->indent (0);
40warn $js->canonical (0);
41warn $js->ascii (0);
42warn $js->space_after (0);
43use Data::Dumper;
44warn Dumper $js->decode ($json);
45warn Dumper $js->encode ($js->decode ($json));
46#my $x = {"üü" => 2, "ü" => 1, "a" => [1,{3,4},2], b => 5};
47
48#my $js2 = JSON::DWIW->new;
49#
50#timethese 200000, {
51# a => sub { $js->encode ($x) },
52# b => sub { $js2->to_json ($x) },
53#};
54
551; 5971;
56
57=back
58 598
59=head1 AUTHOR 599=head1 AUTHOR
60 600
61 Marc Lehmann <schmorp@schmorp.de> 601 Marc Lehmann <schmorp@schmorp.de>
62 http://home.schmorp.de/ 602 http://home.schmorp.de/
