
Comparing JSON-XS/XS.pm (file contents):
Revision 1.1 by root, Thu Mar 22 16:40:16 2007 UTC vs.
Revision 1.78 by root, Wed Dec 5 10:59:28 2007 UTC

1=head1 NAME 1=head1 NAME
2 2
3JSON::XS - JSON serialising/deserialising, done correctly and fast 3JSON::XS - JSON serialising/deserialising, done correctly and fast
4 4
5JSON::XS - a correct and fast JSON serialiser/deserialiser
6 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
7
5=head1 SYNOPSIS 8=head1 SYNOPSIS
6 9
7 use JSON::XS; 10 use JSON::XS;
8 11
12 # exported functions, they croak on error
13 # and expect/generate UTF-8
14
15 $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
16 $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;
17
18 # OO-interface
19
20 $coder = JSON::XS->new->ascii->pretty->allow_nonref;
21 $pretty_printed_unencoded = $coder->encode ($perl_scalar);
22 $perl_scalar = $coder->decode ($unicode_json_text);
23
24 # Note that JSON version 2.0 and above will automatically use JSON::XS
25 # if available, at virtually no speed overhead either, so you should
26 # be able to just:
27
28 use JSON;
29
30 # and do the same things, except that you have a pure-perl fallback now.
31
9=head1 DESCRIPTION 32=head1 DESCRIPTION
10 33
34This module converts Perl data structures to JSON and vice versa. Its
35primary goal is to be I<correct> and its secondary goal is to be
36I<fast>. To reach the latter goal it was written in C.
37
38Beginning with version 2.0 of the JSON module, when both JSON and
39JSON::XS are installed, then JSON will fall back on JSON::XS (this can be
40overridden) with no overhead due to emulation (by inheriting constructor
41and methods). If JSON::XS is not available, it will fall back to the
42compatible JSON::PP module as backend, so using JSON instead of JSON::XS
43gives you a portable JSON API that can be fast when you need it and doesn't
44require a C compiler when that is a problem.
45
46As this is the n-th-something JSON module on CPAN, what was the reason
47to write yet another JSON module? While it seems there are many JSON
48modules, none of them correctly handle all corner cases, and in most cases
49their maintainers are unresponsive, gone missing, or not listening to bug
50reports for other reasons.
51
52See COMPARISON, below, for a comparison to some other JSON modules.
53
54See MAPPING, below, on how JSON::XS maps perl values to JSON values and
55vice versa.
56
57=head2 FEATURES
58
11=over 4 59=over 4
12 60
61=item * correct Unicode handling
62
63This module knows how to handle Unicode, and even documents how and when
64it does so.
65
66=item * round-trip integrity
67
68When you serialise a perl data structure using only datatypes supported
69by JSON, the deserialised data structure is identical on the Perl level.
70(e.g. the string "2.0" doesn't suddenly become "2" just because it looks
71like a number).
72
73=item * strict checking of JSON correctness
74
75There is no guessing, no generating of illegal JSON texts by default,
76and only JSON is accepted as input by default (the latter is a security
77feature).
78
79=item * fast
80
81This module also compares favourably to other JSON modules in terms of
82speed.
83
84=item * simple to use
85
86This module has both a simple functional interface as well as an OO
87interface.
88
89=item * reasonably versatile output formats
90
91You can choose between the most compact guaranteed single-line format
92possible (nice for simple line-based protocols), a pure-ascii format
93(for when your transport is not 8-bit clean, still supports the whole
94Unicode range), or a pretty-printed format (for when you want to read that
95stuff). Or you can combine those features in whatever way you like.
96
97=back
98
13=cut 99=cut
14 100
15package JSON::XS; 101package JSON::XS;
16 102
17BEGIN { 103use strict;
104
18 $VERSION = '0.1'; 105our $VERSION = '2.01';
19 @ISA = qw(Exporter); 106our @ISA = qw(Exporter);
20 107
21 require Exporter; 108our @EXPORT = qw(encode_json decode_json to_json from_json);
22 109
110sub to_json($) {
23 require XSLoader; 111 require Carp;
24 XSLoader::load JSON::XS::, $VERSION; 112 Carp::croak ("JSON::XS::to_json has been renamed to encode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
25} 113}
26 114
27=item 115sub from_json($) {
116 require Carp;
117 Carp::croak ("JSON::XS::from_json has been renamed to decode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
118}
119
120use Exporter;
121use XSLoader;
122
123=head1 FUNCTIONAL INTERFACE
124
125The following convenience methods are provided by this module. They are
126exported by default:
127
128=over 4
129
130=item $json_text = encode_json $perl_scalar
131
132Converts the given Perl data structure to a UTF-8 encoded, binary string
133(that is, the string contains octets only). Croaks on error.
134
135This function call is functionally identical to:
136
137 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
138
139except being faster.
140
141=item $perl_scalar = decode_json $json_text
142
143The opposite of C<encode_json>: expects a UTF-8 (binary) string and tries
144to parse that as a UTF-8 encoded JSON text, returning the resulting
145reference. Croaks on error.
146
147This function call is functionally identical to:
148
149 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
150
151except being faster.
152
153=item $is_boolean = JSON::XS::is_bool $scalar
154
155Returns true if the passed scalar represents either JSON::XS::true or
156JSON::XS::false, two constants that act like C<1> and C<0>, respectively
157and are used to represent JSON C<true> and C<false> values in Perl.
158
159See MAPPING, below, for more information on how JSON values are mapped to
160Perl.
161
162=back
163
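As a minimal sketch of the two exported functions (assuming JSON::XS is
installed; the variable names and the sample data are purely illustrative),
a round trip through UTF-8 encoded JSON text might look like this:

```perl
use strict;
use warnings;
use JSON::XS;

# Sample data; "\x{e4}" is the single character U+00E4.
my $data = { name => "K\x{e4}se", tags => [ "a", "b" ] };

my $json_text = encode_json $data;        # UTF-8 encoded octets
my $copy      = decode_json $json_text;   # back to a Perl data structure

# JSON true/false decode to special values that is_bool recognises.
my $flag = decode_json ('[true]')->[0];

print "round trip ok\n" if $copy->{name} eq $data->{name};
print "is_bool ok\n"    if JSON::XS::is_bool ($flag);
```

Both functions croak on error, so wrap them in C<eval> if the input is not
trusted.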
164
165=head1 A FEW NOTES ON UNICODE AND PERL
166
167Since this often leads to confusion, here are a few very clear words on
168how Unicode works in Perl, modulo bugs.
169
170=over 4
171
172=item 1. Perl strings can store characters with ordinal values > 255.
173
174This enables you to store Unicode characters as single characters in a
175Perl string - very natural.
176
177=item 2. Perl does I<not> associate an encoding with your strings.
178
179Unless you force it to, e.g. when matching it against a regex, or printing
180the scalar to a file, in which case Perl either interprets your string as
181locale-encoded text, octets/binary, or as Unicode, depending on various
182settings. In no case is an encoding stored together with your data, it is
183I<use> that decides encoding, not any magical metadata.
184
185=item 3. The internal utf-8 flag has no meaning with regards to the
186encoding of your string.
187
188Just ignore that flag unless you debug a Perl bug, a module written in
189XS or want to dive into the internals of perl. Otherwise it will only
190confuse you, as, despite the name, it says nothing about how your string
191is encoded. You can have Unicode strings with that flag set, with that
192flag clear, and you can have binary data with that flag set and that flag
193clear. Other possibilities exist, too.
194
195If you didn't know about that flag, all the better: pretend it doesn't
196exist.
197
198=item 4. A "Unicode String" is simply a string where each character can be
199validly interpreted as a Unicode codepoint.
200
201If you have UTF-8 encoded data, it is no longer a Unicode string, but a
202Unicode string encoded in UTF-8, giving you a binary string.
203
204=item 5. A string containing "high" (> 255) character values is I<not> a UTF-8 string.
205
206It's a fact. Learn to live with it.
207
208=back
209
210I hope this helps :)
211
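The difference between a Unicode string and its UTF-8 encoded form (items 4
and 5 above) can be illustrated with the core Encode module alone. This is
only a sketch; the sample string is arbitrary:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $unicode = "\x{20ac}uro";                # 4 characters, the first is U+20AC
my $octets  = encode ("UTF-8", $unicode);   # 6 octets: E2 82 AC 75 72 6F

printf "%d characters, %d octets\n", length $unicode, length $octets;

# Decoding the octets gives back the original Unicode string.
my $again = decode ("UTF-8", $octets);
print "same string\n" if $again eq $unicode;
```

The first value is what C<encode> without the C<utf8> flag works with, the
second is what C<encode_json> produces and C<decode_json> expects.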
212
213=head1 OBJECT-ORIENTED INTERFACE
214
215The object oriented interface lets you configure your own encoding or
216decoding style, within the limits of supported formats.
217
218=over 4
219
220=item $json = new JSON::XS
221
222Creates a new JSON::XS object that can be used to de/encode JSON
223strings. All boolean flags described below are by default I<disabled>.
224
225The mutators for flags all return the JSON object again and thus calls can
226be chained:
227
228 my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
229 => {"a": [1, 2]}
230
231=item $json = $json->ascii ([$enable])
232
233=item $enabled = $json->get_ascii
234
235If C<$enable> is true (or missing), then the C<encode> method will not
236generate characters outside the code range C<0..127> (which is ASCII). Any
237Unicode characters outside that range will be escaped using either a
238single \uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence,
239as per RFC4627. The resulting encoded JSON text can be treated as a native
240Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string,
241or any other superset of ASCII.
242
243If C<$enable> is false, then the C<encode> method will not escape Unicode
244characters unless required by the JSON syntax or other flags. This results
245in a faster and more compact format.
246
247The main use for this flag is to produce JSON texts that can be
248transmitted over a 7-bit channel, as the encoded JSON texts will not
249contain any 8 bit characters.
250
251 JSON::XS->new->ascii (1)->encode ([chr 0x10401])
252 => ["\ud801\udc01"]
253
254=item $json = $json->latin1 ([$enable])
255
256=item $enabled = $json->get_latin1
257
258If C<$enable> is true (or missing), then the C<encode> method will encode
259the resulting JSON text as latin1 (or iso-8859-1), escaping any characters
260outside the code range C<0..255>. The resulting string can be treated as a
261latin1-encoded JSON text or a native Unicode string. The C<decode> method
262will not be affected in any way by this flag, as C<decode> by default
263expects Unicode, which is a strict superset of latin1.
264
265If C<$enable> is false, then the C<encode> method will not escape Unicode
266characters unless required by the JSON syntax or other flags.
267
268The main use for this flag is efficiently encoding binary data as JSON
269text, as most octets will not be escaped, resulting in a smaller encoded
270size. The disadvantage is that the resulting JSON text is encoded
271in latin1 (and must correctly be treated as such when storing and
272transferring), a rare encoding for JSON. It is therefore most useful when
273you want to store data structures known to contain binary data efficiently
274in files or databases, not when talking to other JSON encoders/decoders.
275
276 JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
277 => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not)
278
279=item $json = $json->utf8 ([$enable])
280
281=item $enabled = $json->get_utf8
282
283If C<$enable> is true (or missing), then the C<encode> method will encode
284the JSON result into UTF-8, as required by many protocols, while the
285C<decode> method expects to be handed a UTF-8-encoded string. Please
286note that UTF-8-encoded strings do not contain any characters outside the
287range C<0..255>, they are thus useful for bytewise/binary I/O. In future
288versions, enabling this option might enable autodetection of the UTF-16
289and UTF-32 encoding families, as described in RFC4627.
290
291If C<$enable> is false, then the C<encode> method will return the JSON
292string as a (non-encoded) Unicode string, while C<decode> thus expects a
293Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
294to be done yourself, e.g. using the Encode module.
295
296Example, output UTF-16BE-encoded JSON:
297
298 use Encode;
299 $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);
300
301Example, decode UTF-32LE-encoded JSON:
302
303 use Encode;
304 $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);
305
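To compare the C<ascii>, C<latin1> and C<utf8> flags side by side, the
following sketch encodes the same one-element array four ways (assuming
JSON::XS is installed; the sample string is arbitrary):

```perl
use strict;
use warnings;
use JSON::XS;

# One string holding U+00E9 and U+20AC.
my $data = [ "\x{e9}\x{20ac}" ];

my $plain  = JSON::XS->new->encode ($data);          # Unicode string, nothing escaped
my $ascii  = JSON::XS->new->ascii->encode ($data);   # both characters escaped
my $latin1 = JSON::XS->new->latin1->encode ($data);  # only U+20AC escaped
my $utf8   = JSON::XS->new->utf8->encode ($data);    # both as UTF-8 octets

# Print the length of each result to show how the sizes differ.
printf "%-6s %2d\n", $_->[0], length $_->[1]
   for [plain => $plain], [ascii => $ascii], [latin1 => $latin1], [utf8 => $utf8];
```

Only the C<utf8> and C<ascii> results are safe to write to a byte stream
as-is; the other two still need an explicit encoding step or an agreement on
latin1.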
306=item $json = $json->pretty ([$enable])
307
308This enables (or disables) all of the C<indent>, C<space_before> and
309C<space_after> (and in the future possibly more) flags in one call to
310generate the most readable (or most compact) form possible.
311
312Example, pretty-print some simple structure:
313
314 my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
315 =>
316 {
317    "a" : [
318       1,
319       2
320    ]
321 }
322
323=item $json = $json->indent ([$enable])
324
325=item $enabled = $json->get_indent
326
327If C<$enable> is true (or missing), then the C<encode> method will use a multiline
328format as output, putting every array member or object/hash key-value pair
329into its own line, indenting them properly.
330
331If C<$enable> is false, no newlines or indenting will be produced, and the
332resulting JSON text is guaranteed not to contain any C<newlines>.
333
334This setting has no effect when decoding JSON texts.
335
336=item $json = $json->space_before ([$enable])
337
338=item $enabled = $json->get_space_before
339
340If C<$enable> is true (or missing), then the C<encode> method will add an extra
341optional space before the C<:> separating keys from values in JSON objects.
342
343If C<$enable> is false, then the C<encode> method will not add any extra
344space at those places.
345
346This setting has no effect when decoding JSON texts. You will also
347most likely combine this setting with C<space_after>.
348
349Example, space_before enabled, space_after and indent disabled:
350
351 {"key" :"value"}
352
353=item $json = $json->space_after ([$enable])
354
355=item $enabled = $json->get_space_after
356
357If C<$enable> is true (or missing), then the C<encode> method will add an extra
358optional space after the C<:> separating keys from values in JSON objects
359and extra whitespace after the C<,> separating key-value pairs and array
360members.
361
362If C<$enable> is false, then the C<encode> method will not add any extra
363space at those places.
364
365This setting has no effect when decoding JSON texts.
366
367Example, space_before and indent disabled, space_after enabled:
368
369 {"key": "value"}
370
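The whitespace flags described above can be compared directly. A small
sketch, assuming JSON::XS is installed (the hash is sample data):

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { key => "value" };

my $compact = JSON::XS->new->encode ($data);                 # {"key":"value"}
my $before  = JSON::XS->new->space_before->encode ($data);   # {"key" :"value"}
my $after   = JSON::XS->new->space_after->encode ($data);    # {"key": "value"}
my $pretty  = JSON::XS->new->pretty->encode ($data);         # multiline form

print "$_\n" for $compact, $before, $after;
print $pretty;
```

All four texts decode to the same Perl data structure, since these flags
only affect C<encode>.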
371=item $json = $json->relaxed ([$enable])
372
373=item $enabled = $json->get_relaxed
374
375If C<$enable> is true (or missing), then C<decode> will accept some
376extensions to normal JSON syntax (see below). C<encode> will not be
377affected in any way. I<Be aware that this option makes you accept invalid
378JSON texts as if they were valid!> I suggest using this option only to
379parse application-specific files written by humans (configuration files,
380resource files etc.).
381
382If C<$enable> is false (the default), then C<decode> will only accept
383valid JSON texts.
384
385Currently accepted extensions are:
386
387=over 4
388
389=item * list items can have an end-comma
390
391JSON I<separates> array elements and key-value pairs with commas. This
392can be annoying if you write JSON texts manually and want to be able to
393quickly append elements, so this extension accepts comma at the end of
394such items not just between them:
395
396 [
397    1,
398    2, <- this comma not normally allowed
399 ]
400 {
401    "k1": "v1",
402    "k2": "v2", <- this comma not normally allowed
403 }
404
405=item * shell-style '#'-comments
406
407Wherever JSON allows whitespace, shell-style comments are additionally
408allowed. They are terminated by the first carriage-return or line-feed
409character, after which more white-space and comments are allowed.
410
411 [
412    1, # this comment not allowed in JSON
413       # neither this one...
414 ]
415
416=back
417
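Both extensions can be tried on a small hand-written configuration text.
This is a sketch only, assuming JSON::XS is installed; the keys and values
are made up:

```perl
use strict;
use warnings;
use JSON::XS;

my $text = <<'EOT';
{
   # shell-style comment on its own line
   "name": "demo",
   "ports": [8080, 8081,],   # trailing comma in an array
}
EOT

# With relaxed enabled, the comments and end-commas are accepted.
my $config = JSON::XS->new->relaxed->decode ($text);

# Without it, the same text is rejected.
my $strict_ok = eval { JSON::XS->new->decode ($text); 1 };

print "relaxed: $config->{name}, ", scalar @{ $config->{ports} }, " ports\n";
print "strict decode ", ($strict_ok ? "succeeded" : "failed"), "\n";
```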
418=item $json = $json->canonical ([$enable])
419
420=item $enabled = $json->get_canonical
421
422If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
423by sorting their keys. This adds a comparatively high overhead.
424
425If C<$enable> is false, then the C<encode> method will output key-value
426pairs in the order Perl stores them (which will likely change between runs
427of the same script).
428
429This option is useful if you want the same data structure to be encoded as
430the same JSON text (given the same overall settings). If it is disabled,
431the same hash might be encoded differently even if it contains the same data,
432as key-value pairs have no inherent ordering in Perl.
433
434This setting has no effect when decoding JSON texts.
435
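A short sketch of the effect, assuming JSON::XS is installed (the hashes are
sample data built in different insertion orders):

```perl
use strict;
use warnings;
use JSON::XS;

my %first  = (b => 2, a => 1, c => 3);
my %second = (c => 3, b => 2, a => 1);

my $json = JSON::XS->new->canonical;

# Keys are sorted, so both hashes yield the same text.
my $text1 = $json->encode (\%first);    # {"a":1,"b":2,"c":3}
my $text2 = $json->encode (\%second);

print "identical\n" if $text1 eq $text2;
```

This makes the encoded text usable for comparisons or as input to a
checksum, at the cost of the sorting overhead mentioned above.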
436=item $json = $json->allow_nonref ([$enable])
437
438=item $enabled = $json->get_allow_nonref
439
440If C<$enable> is true (or missing), then the C<encode> method can convert a
441non-reference into its corresponding string, number or null JSON value,
442which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
443values instead of croaking.
444
445If C<$enable> is false, then the C<encode> method will croak if it isn't
446passed an arrayref or hashref, as JSON texts must either be an object
447or array. Likewise, C<decode> will croak if given something that is not a
448JSON object or array.
449
450Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
451resulting in an invalid JSON text:
452
453 JSON::XS->new->allow_nonref->encode ("Hello, World!")
454 => "Hello, World!"
455
456=item $json = $json->allow_blessed ([$enable])
457
458=item $enabled = $json->get_allow_blessed
459
460If C<$enable> is true (or missing), then the C<encode> method will not
461barf when it encounters a blessed reference. Instead, the value of the
462B<convert_blessed> option will decide whether C<null> (C<convert_blessed>
463disabled or no C<TO_JSON> method found) or a representation of the
464object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
465encoded. Has no effect on C<decode>.
466
467If C<$enable> is false (the default), then C<encode> will throw an
468exception when it encounters a blessed object.
469
470=item $json = $json->convert_blessed ([$enable])
471
472=item $enabled = $json->get_convert_blessed
473
474If C<$enable> is true (or missing), then C<encode>, upon encountering a
475blessed object, will check for the availability of the C<TO_JSON> method
476on the object's class. If found, it will be called in scalar context
477and the resulting scalar will be encoded instead of the object. If no
478C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
479to do.
480
481The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
482returns other blessed objects, those will be handled in the same
483way. C<TO_JSON> must take care of not causing an endless recursion cycle
484(== crash) in this case. The name of C<TO_JSON> was chosen because other
485methods called by the Perl core (== not by the user of the object) are
486usually in upper case letters and to avoid collisions with any C<to_json>
487function or method.
488
489This setting does not yet influence C<decode> in any way, but in the
490future, global hooks might get installed that influence C<decode> and are
491enabled by this setting.
492
493If C<$enable> is false, then the C<allow_blessed> setting will decide what
494to do when a blessed object is found.
495
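The interplay of C<allow_blessed> and C<convert_blessed> can be sketched as
follows. The C<Point> class is hypothetical and exists only for this
example; JSON::XS is assumed to be installed:

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical class with a TO_JSON method.
package Point;
sub new     { my ($class, %args) = @_; bless {%args}, $class }
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

package main;

my $point = Point->new (x => 1, y => 2);

# convert_blessed: TO_JSON is called and its result is encoded.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

# allow_blessed alone: the object is encoded as null.
my $nulled = JSON::XS->new->allow_blessed->encode ([$point]);

# Neither flag: encode throws an exception.
my $croaked = !eval { JSON::XS->new->encode ([$point]); 1 };

print "$converted\n$nulled\n";
print "default settings croak\n" if $croaked;
```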
496=item $json = $json->filter_json_object ([$coderef->($hashref)])
497
498When C<$coderef> is specified, it will be called from C<decode> each
499time it decodes a JSON object. The only argument is a reference to the
500newly-created hash. If the code reference returns a single scalar (which
501need not be a reference), this value (i.e. a copy of that scalar to avoid
502aliasing) is inserted into the deserialised data structure. If it returns
503an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the
504original deserialised hash will be inserted. This setting can slow down
505decoding considerably.
506
507When C<$coderef> is omitted or undefined, any existing callback will
508be removed and C<decode> will not change the deserialised hash in any
509way.
510
511Example, convert all JSON objects into the integer 5:
512
513 my $js = JSON::XS->new->filter_json_object (sub { 5 });
514 # returns [5]
515 $js->decode ('[{}]');
516 # throws an exception because allow_nonref is not enabled
517 # so a lone 5 is not allowed.
518 $js->decode ('{"a":1, "b":2}');
519
520=item $json = $json->filter_json_single_key_object ($key [=> $coderef->($value)])
521
522Works somewhat similarly to C<filter_json_object>, but is only called for
523JSON objects having a single key named C<$key>.
524
525This C<$coderef> is called before the one specified via
526C<filter_json_object>, if any. It gets passed the single value in the JSON
527object. If it returns a single value, it will be inserted into the data
528structure. If it returns nothing (not even C<undef> but the empty list),
529the callback from C<filter_json_object> will be called next, as if no
530single-key callback were specified.
531
532If C<$coderef> is omitted or undefined, the corresponding callback will be
533disabled. There can only ever be one callback for a given key.
534
535As this callback gets called less often than the C<filter_json_object>
536one, decoding speed will not usually suffer as much. Therefore, single-key
537objects make excellent targets to serialise Perl objects into, especially
538as single-key JSON objects are as close to the type-tagged value concept
539as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not
540support this in any way, so you need to make sure your data never looks
541like a serialised Perl hash.
542
543Typical names for the single object key are C<__class_whatever__>, or
544C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even
545things like C<__class_md5sum(classname)__>, to reduce the risk of clashing
546with real hashes.
547
548Example, decode JSON objects of the form C<< { "__widget__" => <id> } >>
549into the corresponding C<< $WIDGET{<id>} >> object:
550
551 # return whatever is in $WIDGET{5}:
552 JSON::XS
553    ->new
554    ->filter_json_single_key_object (__widget__ => sub {
555       $WIDGET{ $_[0] }
556    })
557    ->decode ('{"__widget__": 5}')
558 
559 # this can be used with a TO_JSON method in some "widget" class
560 # for serialisation to json:
561 sub WidgetBase::TO_JSON {
562    my ($self) = @_;
563 
564    unless ($self->{id}) {
565       $self->{id} = ..get..some..id..;
566       $WIDGET{$self->{id}} = $self;
567    }
568 
569    { __widget__ => $self->{id} }
570 }
571
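Put together, a complete round trip could look like the sketch below. The
C<Widget> class and the C<%WIDGET> registry are hypothetical stand-ins for
whatever your application uses, and ids are simply passed to the
constructor; JSON::XS is assumed to be installed:

```perl
use strict;
use warnings;
use JSON::XS;

my %WIDGET;   # hypothetical registry: id => object

package Widget;
sub new {
   my ($class, $id) = @_;
   my $self = bless { id => $id }, $class;
   $WIDGET{$id} = $self;   # remember the object under its id
   $self
}
sub TO_JSON { my ($self) = @_; return { __widget__ => $self->{id} } }

package main;

my $json = JSON::XS->new
   ->convert_blessed
   ->filter_json_single_key_object (__widget__ => sub { $WIDGET{ $_[0] } });

my $widget = Widget->new (5);
my $text   = $json->encode ([$widget]);   # the object becomes a tagged id
my $back   = $json->decode ($text);       # the id becomes the object again

print "$text\n";
print "same object\n" if $back->[0] == $widget;
```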
572=item $json = $json->shrink ([$enable])
573
574=item $enabled = $json->get_shrink
575
576Perl usually over-allocates memory a bit when allocating space for
577strings. This flag optionally resizes strings generated by either
578C<encode> or C<decode> to their minimum size possible. This can save
579memory when your JSON texts are either very very long or you have many
580short strings. It will also try to downgrade any strings to octet-form
581if possible: perl stores strings internally either in an encoding called
582UTF-X or in octet-form. The latter cannot store everything but uses less
583space in general (and some buggy Perl or C code might even rely on that
584internal representation being used).
585
586The actual definition of what shrink does might change in future versions,
587but it will always try to save space at the expense of time.
588
589If C<$enable> is true (or missing), the string returned by C<encode> will
590be shrunk-to-fit, while all strings generated by C<decode> will also be
591shrunk-to-fit.
592
593If C<$enable> is false, then the normal perl allocation algorithms are used.
594If you work with your data, then this is likely to be faster.
595
596In the future, this setting might control other things, such as converting
597strings that look like integers or floats into integers or floats
598internally (there is no difference on the Perl level), saving space.
599
600=item $json = $json->max_depth ([$maximum_nesting_depth])
601
602=item $max_depth = $json->get_max_depth
603
604Sets the maximum nesting level (default C<512>) accepted while encoding
605or decoding. If the JSON text or Perl data structure has an equal or
606higher nesting level than this limit, then the encoder and decoder will
607stop and croak at that point.
608
609Nesting level is defined by number of hash- or arrayrefs that the encoder
610needs to traverse to reach a given point or the number of C<{> or C<[>
611characters without their matching closing parenthesis crossed to reach a
612given character in a string.
613
614Setting the maximum depth to one disallows any nesting, so that ensures
615that the object is only a single hash/object or array.
616
617The argument to C<max_depth> will be rounded up to the next highest power
618of two. If no argument is given, the highest possible setting will be
619used, which is rarely useful.
620
621See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
622
623=item $json = $json->max_size ([$maximum_string_size])
624
625=item $max_size = $json->get_max_size
626
627Set the maximum length a JSON text may have (in bytes) where decoding is
628being attempted. The default is C<0>, meaning no limit. When C<decode>
629is called on a string longer than this number of characters it will not
630attempt to decode the string but throw an exception. This setting has no
631effect on C<encode> (yet).
632
633The argument to C<max_size> will be rounded up to the next B<highest>
634power of two (so may be more than requested). If no argument is given, the
635limit check will be deactivated (same as when C<0> is specified).
636
637See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
638
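A sketch of both limits in action, assuming JSON::XS is installed. The
values C<4> and C<64> are arbitrary powers of two chosen for the example,
and the inputs are well inside or well outside the limits:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_depth (4)->max_size (64);

# Small and shallow: accepted.
my $ok = $json->decode ('[[1,2],[3]]');

# Eight levels of nesting: rejected by max_depth.
my $too_deep = !eval { $json->decode ('[[[[[[[[1]]]]]]]]'); 1 };

# A 201-character text: rejected by max_size.
my $long_text = '[' . join (',', (1) x 100) . ']';
my $too_long  = !eval { $json->decode ($long_text); 1 };

print "nested text rejected\n" if $too_deep;
print "long text rejected\n"   if $too_long;
```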
639=item $json_text = $json->encode ($perl_scalar)
640
641Converts the given Perl data structure (a simple scalar or a reference
642to a hash or array) to its JSON representation. Simple scalars will be
643converted into JSON string or number sequences, while references to arrays
644become JSON arrays and references to hashes become JSON objects. Undefined
645Perl values (e.g. C<undef>) become JSON C<null> values. See MAPPING, below,
646for how JSON C<true> and C<false> values are generated.
647
648=item $perl_scalar = $json->decode ($json_text)
649
650The opposite of C<encode>: expects a JSON text and tries to parse it,
651returning the resulting simple scalar or reference. Croaks on error.
652
653JSON numbers and strings become simple Perl scalars. JSON arrays become
654Perl arrayrefs and JSON objects become Perl hashrefs. C<true> and C<false>
655become C<JSON::XS::true> and C<JSON::XS::false> (which act like C<1> and C<0>) and C<null> becomes C<undef>.
656
657=item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
658
659This works like the C<decode> method, but instead of raising an exception
660when there is trailing garbage after the first JSON object, it will
661silently stop parsing there and return the number of characters consumed
662so far.
663
664This is useful if your JSON texts are not delimited by an outer protocol
665(which is not the brightest thing to do in the first place) and you need
666to know where the JSON text ends.
667
668 JSON::XS->new->decode_prefix ("[1] the tail")
669 => ([1], 3)
670
671=back
672
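As a sketch of how C<decode_prefix> might be used to pull several
concatenated JSON texts out of one buffer (assuming JSON::XS is installed;
the buffer contents and the loop condition are illustrative, not a complete
protocol parser):

```perl
use strict;
use warnings;
use JSON::XS;

my $json   = JSON::XS->new;
my $buffer = '[1,2]{"a":3} trailing';
my @values;

# Keep decoding as long as the buffer starts with an array or object.
while ($buffer =~ /^\s*[\[{]/) {
   my ($value, $consumed) = $json->decode_prefix ($buffer);
   push @values, $value;
   substr $buffer, 0, $consumed, "";   # drop the part that was parsed
}

printf "%d values decoded, rest is '%s'\n", scalar @values, $buffer;
```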
673
674=head1 MAPPING
675
676This section describes how JSON::XS maps Perl values to JSON values and
677vice versa. These mappings are designed to "do the right thing" in most
678circumstances automatically, preserving round-tripping characteristics
679(what you put in comes out as something equivalent).
680
681For the more enlightened: note that in the following descriptions,
682lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
683refers to the abstract Perl language itself.
684
685
686=head2 JSON -> PERL
687
688=over 4
689
690=item object
691
692A JSON object becomes a reference to a hash in Perl. No ordering of object
693keys is preserved (JSON does not preserve object key ordering itself).
694
695=item array
696
697A JSON array becomes a reference to an array in Perl.
698
699=item string
700
701A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
702are represented by the same codepoints in the Perl string, so no manual
703decoding is necessary.
704
705=item number
706
707A JSON number becomes either an integer, numeric (floating point) or
708string scalar in perl, depending on its range and any fractional parts. On
709the Perl level, there is no difference between those as Perl handles all
710the conversion details, but an integer may take slightly less memory and
711might represent more values exactly than (floating point) numbers.
712
713If the number consists of digits only, JSON::XS will try to represent
714it as an integer value. If that fails, it will try to represent it as
715a numeric (floating point) value if that is possible without loss of
716precision. Otherwise it will preserve the number as a string value.
717
718Numbers containing a fractional or exponential part will always be
719represented as numeric (floating point) values, possibly at a loss of
720precision.
721
722This might create round-tripping problems as numbers might become strings,
723but as Perl is typeless there is no other way to do it.
724
725=item true, false
726
727These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,
728respectively. They are overloaded to act almost exactly like the numbers
729C<1> and C<0>. You can check whether a scalar is a JSON boolean by using
730the C<JSON::XS::is_bool> function.
731
732=item null
733
734A JSON null atom becomes C<undef> in Perl.
735
736=back
737
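The mapping above can be checked with a short sketch (assuming JSON::XS is
installed; the JSON text is sample input):

```perl
use strict;
use warnings;
use JSON::XS;

my $data = decode_json
   '{"o":{},"a":[1.5],"s":"text","i":42,"t":true,"f":false,"n":null}';

print "object  -> ", ref $data->{o}, "\n";   # hash reference
print "array   -> ", ref $data->{a}, "\n";   # array reference
print "string  -> $data->{s}\n";
print "number  -> ", $data->{i} + 1, "\n";   # usable in arithmetic
print "true    -> boolean\n" if JSON::XS::is_bool ($data->{t}) && $data->{t};
print "false   -> boolean\n" if JSON::XS::is_bool ($data->{f}) && !$data->{f};
print "null    -> undef\n"   if !defined $data->{n};
```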
738
739=head2 PERL -> JSON
740
741The mapping from Perl to JSON is slightly more difficult, as Perl is a
742truly typeless language, so we can only guess which JSON type is meant by
743a Perl value.
744
745=over 4
746
747=item hash references
748
749Perl hash references become JSON objects. As there is no inherent ordering
750in hash keys (or JSON objects), they will usually be encoded in a
751pseudo-random order that can change between runs of the same program but
752stays generally the same within a single run of a program. JSON::XS can
753optionally sort the hash keys (determined by the I<canonical> flag), so
754the same datastructure will serialise to the same JSON text (given same
755settings and version of JSON::XS), but this incurs a runtime overhead
756and is only rarely useful, e.g. when you want to compare some JSON text
757against another for equality.
758
759=item array references
760
761Perl array references become JSON arrays.
762
763=item other references
764
765Other unblessed references are generally not allowed and will cause an
766exception to be thrown, except for references to the integers C<0> and
767C<1>, which get turned into C<false> and C<true> atoms in JSON. You can
768also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
769
770 encode_json [\0,JSON::XS::true] # yields [false,true]
771
772=item JSON::XS::true, JSON::XS::false
773
774These special values become JSON true and JSON false values,
775respectively. You can also use C<\1> and C<\0> directly if you want.
776
777=item blessed objects
778
779Blessed objects cannot be represented directly in JSON. By default, C<encode>
780throws an exception when it encounters one; see the C<allow_blessed> and
781C<convert_blessed> options, above, for ways to change that.
782
783=item simple scalars
784
785Simple Perl scalars (any scalar that is not a reference) are the most
786difficult objects to encode: JSON::XS will encode undefined scalars as
787JSON null value, scalars that have last been used in a string context
788before encoding as JSON strings and anything else as number value:
789
790 # dump as number
791 encode_json [2] # yields [2]
792 encode_json [-3.0e17] # yields [-3e+17]
793 my $value = 5; encode_json [$value] # yields [5]
794
795 # used as string, so dump as string
796 print $value;
797 encode_json [$value] # yields ["5"]
798
799 # undef becomes null
800 encode_json [undef] # yields [null]
801
802You can force the type to be a JSON string by stringifying it:
803
804 my $x = 3.1; # some variable containing a number
805 "$x"; # stringified
806 $x .= ""; # another, more awkward way to stringify
807 print $x; # perl does it for you, too, quite often
808
809You can force the type to be a JSON number by numifying it:
810
811 my $x = "3"; # some variable containing a string
812 $x += 0; # numify it, ensuring it will be dumped as a number
813 $x *= 1; # same thing, the choice is yours.
814
815You can not currently force the type in other, less obscure, ways. Tell me
816if you need this capability.
817
818=back
819
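The mapping rules above can be tried out with a short script. This is only a sketch, assuming JSON::XS is installed; the variable names are made up for illustration. It uses C<+= 0> and C<.= ""> to force the type, as described above, and C<JSON::XS::is_bool> to tell decoded booleans from plain numbers.

```perl
use strict;
use warnings;
use JSON::XS;

# A value that starts life as a string is dumped as a JSON string.
my $id = "3";
my $as_string = encode_json [$id];          # ["3"]

# Numifying it in place makes later encodes emit a JSON number.
$id += 0;
my $as_number = encode_json [$id];          # [3]

# Appending the empty string turns a number into a string.
my $ratio = 3.1;
$ratio .= "";
my $forced = encode_json [$ratio];          # ["3.1"]

# undef, \0 and \1 map to null, false and true.
my $atoms = encode_json [undef, \0, \1];    # [null,false,true]

# Decoded booleans can be told apart from plain 0/1 with is_bool.
my $decoded = decode_json '[true,1]';
my @is_bool = map { JSON::XS::is_bool ($_) ? 1 : 0 } @$decoded;   # (1, 0)

print "$as_string $as_number $forced $atoms @is_bool\n";
```

Forcing the type right before encoding is the safest approach, since any intermediate use of the scalar in string context can flip it back to a string.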
820
821=head1 COMPARISON
822
823As already mentioned, this module was created because none of the existing
824JSON modules could be made to work correctly. First I will describe the
825problems (or pleasures) I encountered with various existing JSON modules,
826followed by some benchmark values. JSON::XS was designed not to suffer
827from any of these problems or limitations.
828
829=over 4
830
831=item JSON 1.07
832
833Slow (but very portable, as it is written in pure Perl).
834
835Undocumented/buggy Unicode handling (how JSON handles Unicode values is
836undocumented. One can get far by feeding it Unicode strings and doing
837en-/decoding oneself, but Unicode escapes are not working properly).
838
839No round-tripping (strings get clobbered if they look like numbers, e.g.
840the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
841decode into the number 2).
842
843=item JSON::PC 0.01
844
845Very fast.
846
847Undocumented/buggy Unicode handling.
848
849No round-tripping.
850
851Has problems handling many Perl values (e.g. regex results and other magic
852values will make it croak).
853
854Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
855which is not a valid JSON text).
856
857Unmaintained (maintainer unresponsive for many months, bugs are not
858getting fixed).
859
860=item JSON::Syck 0.21
861
862Very buggy (often crashes).
863
864Very inflexible (no human-readable format supported, format pretty much
865undocumented. I need at least a format for easy reading by humans and a
866single-line compact format for use in a protocol, and preferably a way to
867generate ASCII-only JSON texts).
868
869Completely broken (and confusingly documented) Unicode handling (Unicode
870escapes are not working properly, you need to set ImplicitUnicode to
871I<different> values on en- and decoding to get symmetric behaviour).
872
873No round-tripping (simple cases work, but this depends on whether the scalar
874value was used in a numeric context or not).
875
876Dumping hashes may skip hash values depending on iterator state.
877
878Unmaintained (maintainer unresponsive for many months, bugs are not
879getting fixed).
880
881Does not check input for validity (i.e. will accept non-JSON input and
882return "something" instead of raising an exception. This is a security
883issue: imagine two banks transferring money between each other using
884JSON. One bank might parse a given non-JSON request and deduct money,
885while the other might reject the transaction with a syntax error. While a
886good protocol will at least recover, that is extra unnecessary work and
887the transaction will still not succeed).
888
889=item JSON::DWIW 0.04
890
891Very fast. Very natural. Very nice.
892
893Undocumented Unicode handling (but the best of the pack. Unicode escapes
894still don't get parsed properly).
895
896Very inflexible.
897
898No round-tripping.
899
900Does not generate valid JSON texts (key strings are often unquoted, empty keys
901result in nothing being output).
902
903Does not check input for validity.
904
905=back
906
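Two of the recurring complaints above, missing round-tripping and missing input validation, are easy to check for any module. The following is a minimal sketch of such a check against JSON::XS itself, assuming it is installed; the test values are chosen only for illustration.

```perl
use strict;
use warnings;
use JSON::XS;

# Round-tripping: a string that looks like a number must stay a string.
my $text  = encode_json ["2.0", 2];         # ["2.0",2]
my $back  = decode_json $text;
my $again = encode_json $back;              # ["2.0",2] again

# Validation: {1:2} is not JSON (object keys must be strings),
# so decoding it must croak instead of returning "something".
my $accepted = eval { decode_json '{1:2}'; 1 } ? 1 : 0;   # 0

print "$text $again accepted=$accepted\n";
```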
907
908=head2 JSON and YAML
909
910You often hear that JSON is a subset (or a close subset) of YAML. This is,
911however, a mass hysteria and very far from the truth. In general, there is
912no way to configure JSON::XS to output a data structure as valid YAML.
913
914If you really must use JSON::XS to generate YAML, you should use this
915algorithm (subject to change in future versions):
916
917 my $to_yaml = JSON::XS->new->utf8->space_after (1);
918 my $yaml = $to_yaml->encode ($ref) . "\n";
919
920This will usually generate JSON texts that also parse as valid
921YAML. Please note that YAML has hardcoded limits on (simple) object key
922lengths that JSON doesn't have, so you should make sure that your hash
923keys are noticeably shorter than the 1024 characters YAML allows.
924
925There might be other incompatibilities that I am not aware of. In general
926you should not try to generate YAML with a JSON generator or vice versa,
927or try to parse JSON with a YAML parser or vice versa: chances are high
928that you will run into severe interoperability problems.
929
930
931=head2 SPEED
932
933It seems that JSON::XS is surprisingly fast, as shown in the following
934tables. They have been generated with the help of the C<eg/bench> program
935in the JSON::XS distribution, to make it easy to compare on your own
936system.
937
938First comes a comparison between various modules using a very short
939single-line JSON string:
940
941 {"method": "handleMessage", "params": ["user1", "we were just talking"], \
942 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}
943
944It shows the number of encodes/decodes per second (JSON::XS uses
945the functional interface, while JSON::XS/2 uses the OO interface
946with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
947shrink). Higher is better:
948
949 module | encode | decode |
950 -----------|------------|------------|
951 JSON 1.x | 4990.842 | 4088.813 |
952 JSON::DWIW | 51653.990 | 71575.154 |
953 JSON::PC | 65948.176 | 74631.744 |
954 JSON::PP | 8931.652 | 3817.168 |
955 JSON::Syck | 24877.248 | 27776.848 |
956 JSON::XS | 388361.481 | 227951.304 |
957 JSON::XS/2 | 227951.304 | 218453.333 |
958 JSON::XS/3 | 338250.323 | 218453.333 |
959 Storable | 16500.016 | 135300.129 |
960 -----------+------------+------------+
961
962That is, JSON::XS is about five times faster than JSON::DWIW on encoding,
963about three times faster on decoding, and over forty times faster
964than JSON, even with pretty-printing and key sorting. It also compares
965favourably to Storable for small amounts of data.
966
967Using a longer test string (roughly 18KB, generated from Yahoo! Locals
968search API (http://nanoref.com/yahooapis/mgPdGg)):
969
970 module | encode | decode |
971 -----------|------------|------------|
972 JSON 1.x | 55.260 | 34.971 |
973 JSON::DWIW | 825.228 | 1082.513 |
974 JSON::PC | 3571.444 | 2394.829 |
975 JSON::PP | 210.987 | 32.574 |
976 JSON::Syck | 552.551 | 787.544 |
977 JSON::XS | 5780.463 | 4854.519 |
978 JSON::XS/2 | 3869.998 | 4798.975 |
979 JSON::XS/3 | 5862.880 | 4798.975 |
980 Storable | 4445.002 | 5235.027 |
981 -----------+------------+------------+
982
983Again, JSON::XS leads by far (except for Storable which, unsurprisingly,
984decodes faster).
985
986On large strings containing lots of high Unicode characters, some modules
987(such as JSON::PC) seem to decode faster than JSON::XS, but the result
988will be broken due to missing (or wrong) Unicode handling. Others refuse
989to decode or encode properly, so it was impossible to prepare a fair
990comparison table for that case.
991
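To get comparable numbers on your own machine without the C<eg/bench> program, something along the following lines should do. It is not the benchmark script from the distribution, only a rough sketch that assumes JSON::XS is installed and uses the core C<Benchmark> module with the short test string from above. The absolute rates will differ from the tables.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS;

# The short single-line test string from the first table.
my $json = '{"method": "handleMessage", "params": ["user1", "we were just talking"], '
         . '"id": null, "array":[1,11,234,-5,1e5,1e7, true, false]}';
my $data = decode_json $json;

# OO interface with pretty-printing and hash key sorting (the "/2" variant).
my $pretty = JSON::XS->new->utf8->pretty->canonical;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese (-1, {
    "encode"        => sub { encode_json $data },
    "encode pretty" => sub { $pretty->encode ($data) },
    "decode"        => sub { decode_json $json },
});
```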
992
993=head1 SECURITY CONSIDERATIONS
994
995When you are using JSON in a protocol, talking to untrusted potentially
996hostile creatures requires relatively few measures.
997
998First of all, your JSON decoder should be secure, that is, should not have
999any buffer overflows. Obviously, this module should ensure that, and I am
1000trying hard to make that true, but you never know.
1001
1002Second, you need to avoid resource-starving attacks. That means you should
1003limit the size of JSON texts you accept, or make sure that when your
1004resources run out, that's just fine (e.g. by using a separate process that
1005can crash safely). The size of a JSON text in octets or characters is
1006usually a good indication of the size of the resources required to decode
1007it into a Perl structure. While JSON::XS can check the size of the JSON
1008text, it might be too late when you already have it in memory, so you
1009might want to check the size before you accept the string.
1010
1011Third, JSON::XS recurses using the C stack when decoding objects and
1012arrays. The C stack is a limited resource: for instance, on my amd64
1013machine with 8MB of stack size I can decode around 180k nested arrays but
1014only 14k nested JSON objects (due to perl itself recursing deeply on croak
1015to free the temporary). If that is exceeded, the program crashes. To be
1016conservative, the default nesting limit is set to 512. If your process
1017has a smaller stack, you should adjust this setting accordingly with the
1018C<max_depth> method.
1019
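The second and third points can be combined into a small wrapper around the decoder. This is a sketch only: the helper name C<decode_untrusted> and both limits are hypothetical and should be tuned to your protocol. It assumes JSON::XS is installed and relies on the C<max_depth> method mentioned above.

```perl
use strict;
use warnings;
use JSON::XS;

my $MAX_BYTES = 64 * 1024;      # hypothetical size limit, pick your own
my $decoder   = JSON::XS->new->utf8->max_depth (32);

# Hypothetical helper: reject oversized input before decoding it.
sub decode_untrusted {
    my ($octets) = @_;
    die "JSON text too large\n" if length ($octets) > $MAX_BYTES;
    return $decoder->decode ($octets);   # croaks on invalid or too deeply nested text
}

my $ok = decode_untrusted ('{"a":[1,2,3]}');

# 100 nested arrays exceed the nesting limit of 32, so this croaks.
my $deep = ("[" x 100) . ("]" x 100);
my $err  = eval { decode_untrusted ($deep); 1 } ? "" : $@;

print "decoded ok, deep input rejected: ", ($err ne "" ? "yes" : "no"), "\n";
```

Ideally the size check happens even earlier, while reading from the socket, so that an oversized text never ends up in memory in the first place.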
1020And last but least, something else could bomb you that I forgot to think
1021of. In that case, you get to keep the pieces. I am always open for hints,
1022though...
1023
1024If you are using JSON::XS to return packets for consumption
1025by JavaScript scripts in a browser you should have a look at
1026L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
1027you are vulnerable to some common attack vectors (which really are browser
1028design bugs, but it is still you who will have to deal with it, as major
1029browser developers care only for features, not about doing security
1030right).
1031
1032
1033=head1 THREADS
1034
1035This module is I<not> guaranteed to be thread safe and there are no
1036plans to change this until Perl gets thread support (as opposed to the
1037horribly slow so-called "threads" which are simply slow and bloated
1038process simulations - use fork, it's I<much> faster, cheaper, better).
1039
1040(It might actually work, but you have been warned).
1041
1042
1043=head1 BUGS
1044
1045While the goal of this module is to be correct, that unfortunately does
1046not mean it's bug-free, only that I think its design is bug-free. It is
1047still relatively early in its development. If you keep reporting bugs they
1048will be fixed swiftly, though.
1049
1050Please refrain from using rt.cpan.org or any other bug reporting
1051service. I put the contact address into my modules for a reason.
28 1052
29=cut 1053=cut
30 1054
31use JSON::DWIW; 1055our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };
32use Benchmark; 1056our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };
33 1057
34use utf8; 1058sub true() { $true }
35#my $json = '{"ü":1,"a":[1,{"3":4},2],"b":5,"üü":2}'; 1059sub false() { $false }
36my $json = '{"test":9555555555555555555,"hu" : -1e+5, "arr" : [ 1,2,3,4,5]}';
37 1060
38my $js = JSON::XS->new; 1061sub is_bool($) {
39warn $js->indent (0); 1062 UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
40warn $js->canonical (0); 1063# or UNIVERSAL::isa $_[0], "JSON::Literal"
41warn $js->ascii (0); 1064}
42warn $js->space_after (0);
43use Data::Dumper;
44warn Dumper $js->decode ($json);
45warn Dumper $js->encode ($js->decode ($json));
46#my $x = {"üü" => 2, "ü" => 1, "a" => [1,{3,4},2], b => 5};
47 1065
48#my $js2 = JSON::DWIW->new; 1066XSLoader::load "JSON::XS", $VERSION;
49# 1067
50#timethese 200000, { 1068package JSON::XS::Boolean;
51# a => sub { $js->encode ($x) }, 1069
52# b => sub { $js2->to_json ($x) }, 1070use overload
53#}; 1071 "0+" => sub { ${$_[0]} },
1072 "++" => sub { $_[0] = ${$_[0]} + 1 },
1073 "--" => sub { $_[0] = ${$_[0]} - 1 },
1074 fallback => 1;
54 1075
551; 10761;
56
57=back
58 1077
59=head1 AUTHOR 1078=head1 AUTHOR
60 1079
61 Marc Lehmann <schmorp@schmorp.de> 1080 Marc Lehmann <schmorp@schmorp.de>
62 http://home.schmorp.de/ 1081 http://home.schmorp.de/
