/cvs/JSON-XS/XS.pm

Comparing JSON-XS/XS.pm (file contents):
Revision 1.90 by root, Wed Mar 19 22:28:43 2008 UTC vs.
Revision 1.103 by root, Tue Apr 29 16:07:56 2008 UTC

1=head1 NAME
2
3JSON::XS - JSON serialising/deserialising, done correctly and fast
4
1=encoding utf-8 5=encoding utf-8
2
3=head1 NAME
4
5JSON::XS - JSON serialising/deserialising, done correctly and fast
6 6
7JSON::XS - 正しくて高速な JSON シリアライザ/デシリアライザ 7JSON::XS - 正しくて高速な JSON シリアライザ/デシリアライザ
8 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html) 8 (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)
9 9
10=head1 SYNOPSIS 10=head1 SYNOPSIS
103 103
104package JSON::XS; 104package JSON::XS;
105 105
106use strict; 106use strict;
107 107
108our $VERSION = '2.1'; 108our $VERSION = '2.2';
109our @ISA = qw(Exporter); 109our @ISA = qw(Exporter);
110 110
111our @EXPORT = qw(encode_json decode_json to_json from_json); 111our @EXPORT = qw(encode_json decode_json to_json from_json);
112 112
113sub to_json($) { 113sub to_json($) {
462Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>, 462Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,
463resulting in an invalid JSON text: 463resulting in an invalid JSON text:
464 464
465 JSON::XS->new->allow_nonref->encode ("Hello, World!") 465 JSON::XS->new->allow_nonref->encode ("Hello, World!")
466 => "Hello, World!" 466 => "Hello, World!"
467
468=item $json = $json->allow_unknown ([$enable])
469
470=item $enabled = $json->get_allow_unknown
471
472If C<$enable> is true (or missing), then C<encode> will I<not> throw an
473exception when it encounters values it cannot represent in JSON (for
474example, filehandles) but instead will encode a JSON C<null> value. Note
475that blessed objects are not included here and are handled separately by
476C<allow_blessed>.
477
478If C<$enable> is false (the default), then C<encode> will throw an
479exception when it encounters anything it cannot encode as JSON.
480
481This option does not affect C<decode> in any way, and it is recommended to
482leave it off unless you know your communications partner.
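
A minimal sketch of the behaviour described above. The code reference is only a convenient stand-in for "a value JSON cannot represent", and the variable names are made up for this illustration:

```perl
use strict;
use warnings;
use JSON::XS;

# a structure containing something JSON has no representation for
my $data = [1, sub { "not representable in JSON" }];

# default setting: encode throws an exception for the code reference
my $strict = eval { JSON::XS->new->encode ($data) };
print "default: ", (defined $strict ? $strict : "croaked"), "\n";

# with allow_unknown, the code reference is encoded as null instead
my $lenient = JSON::XS->new->allow_unknown->encode ($data);
print "allow_unknown: $lenient\n";
```

Note that the receiving side cannot distinguish such a C<null> from a real C<undef> in the original data, which is one reason to leave the option off by default.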
467 483
468=item $json = $json->allow_blessed ([$enable]) 484=item $json = $json->allow_blessed ([$enable])
469 485
470=item $enabled = $json->get_allow_blessed 486=item $enabled = $json->get_allow_blessed
471 487
612=item $json = $json->max_depth ([$maximum_nesting_depth]) 628=item $json = $json->max_depth ([$maximum_nesting_depth])
613 629
614=item $max_depth = $json->get_max_depth 630=item $max_depth = $json->get_max_depth
615 631
616Sets the maximum nesting level (default C<512>) accepted while encoding 632Sets the maximum nesting level (default C<512>) accepted while encoding
617or decoding. If the JSON text or Perl data structure has an equal or 633or decoding. If a higher nesting level is detected in JSON text or a Perl
618higher nesting level then this limit, then the encoder and decoder will 634data structure, then the encoder and decoder will stop and croak at that
619stop and croak at that point. 635point.
620 636
621Nesting level is defined by number of hash- or arrayrefs that the encoder 637Nesting level is defined by number of hash- or arrayrefs that the encoder
622needs to traverse to reach a given point or the number of C<{> or C<[> 638needs to traverse to reach a given point or the number of C<{> or C<[>
623characters without their matching closing parenthesis crossed to reach a 639characters without their matching closing parenthesis crossed to reach a
624given character in a string. 640given character in a string.
625 641
626Setting the maximum depth to one disallows any nesting, so that ensures 642Setting the maximum depth to one disallows any nesting, so that ensures
627that the object is only a single hash/object or array. 643that the object is only a single hash/object or array.
628 644
629The argument to C<max_depth> will be rounded up to the next highest power
630of two. If no argument is given, the highest possible setting will be 645If no argument is given, the highest possible setting will be used, which
631used, which is rarely useful. 646is rarely useful.
647
648Note that nesting is implemented by recursion in C. The default value has
649been chosen to be as large as typical operating systems allow without
650crashing.
632 651
633See SECURITY CONSIDERATIONS, below, for more info on why this is useful. 652See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
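
As a small illustration of the limit, here is a sketch that assumes the newer wording above (only a nesting level I<higher> than the limit is rejected). The limit of 2 is chosen purely for demonstration:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_depth (2);

# two levels of nesting are within the limit
my $ok = eval { $json->decode ('[[1]]') };

# three levels exceed it, so decode croaks
my $too_deep = eval { $json->decode ('[[[1]]]') };
my $error    = $@;

print "nested twice:  ", (defined $ok       ? "decoded" : "rejected"), "\n";
print "nested thrice: ", (defined $too_deep ? "decoded" : "rejected"), "\n";
```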
634 653
635=item $json = $json->max_size ([$maximum_string_size]) 654=item $json = $json->max_size ([$maximum_string_size])
636 655
637=item $max_size = $json->get_max_size 656=item $max_size = $json->get_max_size
638 657
639Set the maximum length a JSON text may have (in bytes) where decoding is 658Set the maximum length a JSON text may have (in bytes) where decoding is
640being attempted. The default is C<0>, meaning no limit. When C<decode> 659being attempted. The default is C<0>, meaning no limit. When C<decode>
641is called on a string longer than this number of characters it will not 660is called on a string that is longer than this many bytes, it will not
642attempt to decode the string but throw an exception. This setting has no 661attempt to decode the string but throw an exception. This setting has no
643effect on C<encode> (yet). 662effect on C<encode> (yet).
644 663
645The argument to C<max_size> will be rounded up to the next B<highest> 664If no argument is given, the limit check will be deactivated (same as when
646power of two (so may be more than requested). If no argument is given, the 665C<0> is specified).
647limit check will be deactivated (same as when C<0> is specified).
648 666
649See SECURITY CONSIDERATIONS, below, for more info on why this is useful. 667See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
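
A short sketch of the size check. The limit of 8 bytes is an arbitrary value picked for this illustration:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new->max_size (8);

# 7 bytes: within the limit, decoded normally
my $small = eval { $json->decode ('[1,2,3]') };

# 11 bytes: rejected with an exception before any decoding is attempted
my $large = eval { $json->decode ('[1,2,3,4,5]') };
my $error = $@;

print "7 bytes:  ", (defined $small ? "decoded" : "rejected"), "\n";
print "11 bytes: ", (defined $large ? "decoded" : "rejected"), "\n";
```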
650 668
651=item $json_text = $json->encode ($perl_scalar) 669=item $json_text = $json->encode ($perl_scalar)
652 670
679 697
680 JSON::XS->new->decode_prefix ("[1] the tail") 698 JSON::XS->new->decode_prefix ("[1] the tail")
681 => ([1], 3) 699 => ([1], 3)
682 700
683=back 701=back
702
703
704=head1 INCREMENTAL PARSING
705
706[This section and the API it details is still EXPERIMENTAL]
707
708In some cases, there is the need for incremental parsing of JSON
709texts. While this module always has to keep both JSON text and resulting
710Perl data structure in memory at one time, it does allow you to parse a
711JSON stream incrementally. It does so by accumulating text until it has
712a full JSON object, which it then can decode. This process is similar to
713using C<decode_prefix> to see if a full JSON object is available, but is
714much more efficient (JSON::XS will only attempt to parse the JSON text
715once it is sure it has enough text to get a decisive result, using a very
716simple but truly incremental parser).
717
718The following three methods deal with this.
719
720=over 4
721
722=item [void, scalar or list context] = $json->incr_parse ([$string])
723
724This is the central parsing function. It can both append new text and
725extract objects from the stream accumulated so far (both of these
726functions are optional).
727
728If C<$string> is given, then this string is appended to the already
729existing JSON fragment stored in the C<$json> object.
730
731After that, if the function is called in void context, it will simply
732return without doing anything further. This can be used to add more text
733in as many chunks as you want.
734
735If the method is called in scalar context, then it will try to extract
736exactly I<one> JSON object. If that is successful, it will return this
737object, otherwise it will return C<undef>. If there is a parse error,
738this method will croak just as C<decode> would do (one can then use
739C<incr_skip> to skip the erroneous part). This is the most common way of
740using the method.
741
742And finally, in list context, it will try to extract as many objects
743from the stream as it can find and return them, or the empty list
744otherwise. For this to work, there must be no separators between the JSON
745objects or arrays, instead they must be concatenated back-to-back. If
746an error occurs, an exception will be raised as in the scalar context
747case. Note that in this case, any previously-parsed JSON texts will be
748lost.
749
750=item $lvalue_string = $json->incr_text
751
752This method returns the currently stored JSON fragment as an lvalue, that
753is, you can manipulate it. This I<only> works when a preceding call to
754C<incr_parse> in I<scalar context> successfully returned an object. Under
755all other circumstances you must not call this function (I mean it:
756although in simple tests it might actually work, it I<will> fail under
757real world conditions). As a special exception, you can also call this
758method before having parsed anything.
759
760This function is useful in two cases: a) finding the trailing text after a
761JSON object or b) parsing multiple JSON objects separated by non-JSON text
762(such as commas).
763
764=item $json->incr_skip
765
766This will reset the state of the incremental parser and will remove the
767parsed text from the input buffer. This is useful after C<incr_parse>
768died, in which case the input buffer and incremental parser state is left
769unchanged, to skip the text parsed so far and to reset the parse state.
770
771=back
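
To make the interplay of C<incr_parse> and C<incr_skip> more concrete, here is a hedged sketch of error recovery. The input string is made up for this illustration, and the sketch assumes the parser behaves exactly as described above:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# a malformed object, directly followed by a valid array
$json->incr_parse ('{"a":}[2]');        # void context, text is only appended

# scalar context: tries to extract one object and croaks on the bad one
my $first = eval { $json->incr_parse };
my $error = $@;

# drop the text parsed so far and reset the parser state
$json->incr_skip if length $error;

# the next attempt sees only the remaining "[2]"
my $second = $json->incr_parse;

print "first:  ", (defined $first ? "decoded" : "parse error"), "\n";
print "second: ", (ref $second eq 'ARRAY' ? "array with @$second" : "nothing"), "\n";
```

Without the call to C<incr_skip>, the broken text would stay in the buffer and every further C<incr_parse> would die on it again.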
772
773=head2 LIMITATIONS
774
775All options that affect decoding are supported, except
776C<allow_nonref>. The reason for this is that it cannot be made to
777work sensibly: JSON objects and arrays are self-delimited, i.e. you can concatenate
778them back to back and still decode them perfectly. This does not hold true
779for JSON numbers, however.
780
781For example, is the string C<1> a single JSON number, or is it simply the
782start of C<12>? Or is C<12> a single JSON number, or the concatenation
783of C<1> and C<2>? In neither case can you tell, and this is why JSON::XS
784takes the conservative route and disallows this case.
785
786=head2 EXAMPLES
787
788Some examples will make all this clearer. First, a simple example that
789works similarly to C<decode_prefix>: We want to decode the JSON object at
790the start of a string and identify the portion after the JSON object:
791
792 my $text = "[1,2,3] hello";
793
794 my $json = new JSON::XS;
795
796 my $obj = $json->incr_parse ($text)
797 or die "expected JSON object or array at beginning of string";
798
799 my $tail = $json->incr_text;
800 # $tail now contains " hello"
801
802Easy, isn't it?
803
804Now for a more complicated example: Imagine a hypothetical protocol where
805you read some requests from a TCP stream, and each request is a JSON
806array, without any separation between them (in fact, it is often useful to
807use newlines as "separators", as these get interpreted as whitespace at
808the start of the JSON text, which makes it possible to test said protocol
809with C<telnet>...).
810
811Here is how you'd do it (it is trivial to write this in an event-based
812manner):
813
814 my $json = new JSON::XS;
815
816 # read some data from the socket
817 while (sysread $socket, my $buf, 4096) {
818
819 # split and decode as many requests as possible
820 for my $request ($json->incr_parse ($buf)) {
821 # act on the $request
822 }
823 }
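
The same loop can be tried without a socket. The following sketch feeds hand-made chunks (an assumption standing in for what C<sysread> might deliver) to show that requests split across chunk boundaries are still returned whole:

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# three "network reads", deliberately cut in the middle of JSON texts
my @chunks = ('[1,2] [3,', '4]{"a":', '5}');

my @requests;
for my $chunk (@chunks) {
   # list context: returns every JSON text completed so far
   push @requests, $json->incr_parse ($chunk);
}

print scalar @requests, " requests decoded\n";
```

Each call returns only the texts that became complete with that chunk. The unfinished remainder stays buffered inside the C<$json> object until more data arrives.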
824
825Another complicated example: Assume you have a string with JSON objects
826or arrays, all separated by (optional) comma characters (e.g. C<[1],[2],
827[3]>). To parse them, we have to skip the commas between the JSON texts,
828and here is where the lvalue-ness of C<incr_text> comes in useful:
829
830 my $text = "[1],[2], [3]";
831 my $json = new JSON::XS;
832
833 # void context, so no parsing done
834 $json->incr_parse ($text);
835
836 # now extract as many objects as possible. note the
837 # use of scalar context so incr_text can be called.
838 while (my $obj = $json->incr_parse) {
839 # do something with $obj
840
841 # now skip the optional comma
842 $json->incr_text =~ s/^ \s* , //x;
843 }
844
845Now let's go for a very complex example: Assume that you have a gigantic
846JSON array-of-objects, many gigabytes in size, and you want to parse it,
847but you cannot load it into memory fully (this has actually happened in
848the real world :).
849
850Well, you lost, you have to implement your own JSON parser. But JSON::XS
851can still help you: You implement a (very simple) array parser and let
852JSON decode the array elements, which are all full JSON objects on their
853own (this wouldn't work if the array elements could be JSON numbers, for
854example):
855
856 my $json = new JSON::XS;
857
858 # open the monster
859 open my $fh, "<", "bigfile.json"
860 or die "bigfile: $!";
861
862 # first parse the initial "["
863 for (;;) {
864 sysread $fh, my $buf, 65536
865 or die "read error: $!";
866 $json->incr_parse ($buf); # void context, so no parsing
867
868 # Exit the loop once we found and removed(!) the initial "[".
869 # In essence, we are (ab-)using the $json object as a simple scalar
870 # we append data to.
871 last if $json->incr_text =~ s/^ \s* \[ //x;
872 }
873
874 # now we have skipped the initial "[", so continue
875 # parsing all the elements.
876 for (;;) {
877 # in this loop we read data until we got a single JSON object
878 for (;;) {
879 if (my $obj = $json->incr_parse) {
880 # do something with $obj
881 last;
882 }
883
884 # add more data
885 sysread $fh, my $buf, 65536
886 or die "read error: $!";
887 $json->incr_parse ($buf); # void context, so no parsing
888 }
889
890 # in this loop we read data until we either found and parsed the
891 # separating "," between elements, or the final "]"
892 for (;;) {
893 # first skip whitespace
894 $json->incr_text =~ s/^\s*//;
895
896 # if we find "]", we are done
897 if ($json->incr_text =~ s/^\]//) {
898 print "finished.\n";
899 exit;
900 }
901
902 # if we find ",", we can continue with the next element
903 if ($json->incr_text =~ s/^,//) {
904 last;
905 }
906
907 # if we find anything else, we have a parse error!
908 if (length $json->incr_text) {
909 die "parse error near ", $json->incr_text;
910 }
911
912 # else add more data
913 sysread $fh, my $buf, 65536
914 or die "read error: $!";
915 $json->incr_parse ($buf); # void context, so no parsing
916 }
917
918This is a complex example, but most of the complexity comes from the fact
919that we are trying to be correct (bear with me if I am wrong, I never ran
920the above example :).
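
For experimenting, here is a compact variant of the same approach that can be run as-is. It is only a sketch under a few assumptions: the "monster" is replaced by a short string read through an in-memory filehandle, the chunk size is tiny so that buffering is actually exercised, and the helper C<more> is a name invented for this illustration. Unlike the listing above, the outer loop is closed.

```perl
use strict;
use warnings;
use JSON::XS;

# stand-in for the gigantic file: a short string behind a filehandle
my $data = '[ {"id":1}, {"id":2} ,{"id":3} ]';
open my $fh, "<", \$data
   or die "open: $!";

my $json = JSON::XS->new;
my @seen;

# hypothetical helper: append the next chunk to the incremental buffer
sub more {
   my $n = read $fh, my $buf, 5;
   defined $n or die "read error: $!";
   $n or die "unexpected end of input";
   $json->incr_parse ($buf);            # void context, so no parsing
}

# find and remove the initial "["
more () until $json->incr_text =~ s/^ \s* \[ //x;

ELEMENT: for (;;) {
   # read data until we have one complete array element
   my $obj;
   more () until $obj = $json->incr_parse;
   push @seen, $obj->{id};

   # read data until we find the separating "," or the final "]"
   for (;;) {
      $json->incr_text =~ s/^\s*//;
      last ELEMENT if $json->incr_text =~ s/^\]//;
      next ELEMENT if $json->incr_text =~ s/^,//;
      die "parse error near ", $json->incr_text
         if length $json->incr_text;
      more ();
   }
}

print "ids: @seen\n";
```

Only one array element needs to be in memory at any time, which is the whole point of the exercise.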
921
684 922
685 923
686=head1 MAPPING 924=head1 MAPPING
687 925
688This section describes how JSON::XS maps Perl values to JSON values and 926This section describes how JSON::XS maps Perl values to JSON values and
825 my $x = "3"; # some variable containing a string 1063 my $x = "3"; # some variable containing a string
826 $x += 0; # numify it, ensuring it will be dumped as a number 1064 $x += 0; # numify it, ensuring it will be dumped as a number
827 $x *= 1; # same thing, the choice is yours. 1065 $x *= 1; # same thing, the choice is yours.
828 1066
829You can not currently force the type in other, less obscure, ways. Tell me 1067You can not currently force the type in other, less obscure, ways. Tell me
830if you need this capability (but don't forget to explain why its needed 1068if you need this capability (but don't forget to explain why it's needed
831:). 1069:).
832 1070
833=back 1071=back
834 1072
835 1073
837 1075
838The interested reader might have seen a number of flags that signify 1076The interested reader might have seen a number of flags that signify
839encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be 1077encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be
840some confusion on what these do, so here is a short comparison: 1078some confusion on what these do, so here is a short comparison:
841 1079
842C<utf8> controls wether the JSON text created by C<encode> (and expected 1080C<utf8> controls whether the JSON text created by C<encode> (and expected
843by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only 1081by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only
844control wether C<encode> escapes character values outside their respective 1082control whether C<encode> escapes character values outside their respective
845codeset range. Neither of these flags conflict with each other, although 1083codeset range. Neither of these flags conflict with each other, although
846some combinations make less sense than others. 1084some combinations make less sense than others.
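
The difference is perhaps easiest to see side by side. This sketch encodes the same two-character string (one character inside Latin-1, one outside) with each flag. The sample string and variable names are chosen only for this illustration:

```perl
use strict;
use warnings;
use JSON::XS;

# e-acute (part of Latin-1) followed by the euro sign (not part of Latin-1)
my $data = ["\x{e9}\x{20ac}"];

my $ascii  = JSON::XS->new->ascii->encode  ($data);  # both characters escaped
my $latin1 = JSON::XS->new->latin1->encode ($data);  # only the euro sign escaped
my $utf8   = JSON::XS->new->utf8->encode   ($data);  # nothing escaped, UTF-8 bytes

printf "%-6s %2d characters/bytes\n", "ascii",  length $ascii;
printf "%-6s %2d characters/bytes\n", "latin1", length $latin1;
printf "%-6s %2d characters/bytes\n", "utf8",   length $utf8;

# decoding with the same flag gives back the original string in each case
my $back = JSON::XS->new->latin1->decode ($latin1);
print "round trip ok\n" if $back->[0] eq $data->[0];
```

Only C<utf8> changes the encoding of the result. C<ascii> and C<latin1> merely decide which characters get written as C<\uXXXX> escapes.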
847 1085
848Care has been taken to make all flags symmetrical with respect to 1086Care has been taken to make all flags symmetrical with respect to
849C<encode> and C<decode>, that is, texts encoded with any combination of 1087C<encode> and C<decode>, that is, texts encoded with any combination of
925as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about any character set and 1163as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about any character set and
9268-bit-encoding, and still get the same data structure back. This is useful 11648-bit-encoding, and still get the same data structure back. This is useful
927when your channel for JSON transfer is not 8-bit clean or the encoding 1165when your channel for JSON transfer is not 8-bit clean or the encoding
928might be mangled in between (e.g. in mail), and works because ASCII is a 1166might be mangled in between (e.g. in mail), and works because ASCII is a
929proper subset of most 8-bit and multibyte encodings in use in the world. 1167proper subset of most 8-bit and multibyte encodings in use in the world.
930
931=back
932
933
934=head1 COMPARISON
935
936As already mentioned, this module was created because none of the existing
937JSON modules could be made to work correctly. First I will describe the
938problems (or pleasures) I encountered with various existing JSON modules,
939followed by some benchmark values. JSON::XS was designed not to suffer
940from any of these problems or limitations.
941
942=over 4
943
944=item JSON 2.xx
945
946A marvellous piece of engineering, this module either uses JSON::XS
947directly when available (so will be 100% compatible with it, including
948speed), or it uses JSON::PP, which is basically JSON::XS translated to
949Pure Perl, which should be 100% compatible with JSON::XS, just a bit
950slower.
951
952You cannot really lose by using this module, especially as it tries very
953hard to work even with ancient Perl versions, while JSON::XS does not.
954
955=item JSON 1.07
956
957Slow (but very portable, as it is written in pure Perl).
958
959Undocumented/buggy Unicode handling (how JSON handles Unicode values is
960undocumented. One can get far by feeding it Unicode strings and doing
961en-/decoding oneself, but Unicode escapes are not working properly).
962
963No round-tripping (strings get clobbered if they look like numbers, e.g.
964the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
965decode into the number 2.
966
967=item JSON::PC 0.01
968
969Very fast.
970
971Undocumented/buggy Unicode handling.
972
973No round-tripping.
974
975Has problems handling many Perl values (e.g. regex results and other magic
976values will make it croak).
977
978Does not even generate valid JSON (C<{1,2}> gets converted to C<{1:2}>
979which is not a valid JSON text.
980
981Unmaintained (maintainer unresponsive for many months, bugs are not
982getting fixed).
983
984=item JSON::Syck 0.21
985
986Very buggy (often crashes).
987
988Very inflexible (no human-readable format supported, format pretty much
989undocumented. I need at least a format for easy reading by humans and a
990single-line compact format for use in a protocol, and preferably a way to
991generate ASCII-only JSON texts).
992
993Completely broken (and confusingly documented) Unicode handling (Unicode
994escapes are not working properly, you need to set ImplicitUnicode to
995I<different> values on en- and decoding to get symmetric behaviour).
996
997No round-tripping (simple cases work, but this depends on whether the scalar
998value was used in a numeric context or not).
999
1000Dumping hashes may skip hash values depending on iterator state.
1001
1002Unmaintained (maintainer unresponsive for many months, bugs are not
1003getting fixed).
1004
1005Does not check input for validity (i.e. will accept non-JSON input and
1006return "something" instead of raising an exception. This is a security
1007issue: imagine two banks transferring money between each other using
1008JSON. One bank might parse a given non-JSON request and deduct money,
1009while the other might reject the transaction with a syntax error. While a
1010good protocol will at least recover, that is extra unnecessary work and
1011the transaction will still not succeed).
1012
1013=item JSON::DWIW 0.04
1014
1015Very fast. Very natural. Very nice.
1016
1017Undocumented Unicode handling (but the best of the pack. Unicode escapes
1018still don't get parsed properly).
1019
1020Very inflexible.
1021
1022No round-tripping.
1023
1024Does not generate valid JSON texts (key strings are often unquoted, empty keys
1025result in nothing being output)
1026
1027Does not check input for validity.
1028 1168
1029=back 1169=back
1030 1170
1031 1171
1032=head2 JSON and YAML 1172=head2 JSON and YAML
1092 1232
1093First comes a comparison between various modules using 1233First comes a comparison between various modules using
1094a very short single-line JSON string (also available at 1234a very short single-line JSON string (also available at
1095L<http://dist.schmorp.de/misc/json/short.json>). 1235L<http://dist.schmorp.de/misc/json/short.json>).
1096 1236
1097 {"method": "handleMessage", "params": ["user1", "we were just talking"], \ 1237 {"method": "handleMessage", "params": ["user1",
1098 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} 1238 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1239 true, false]}
1099 1240
1100It shows the number of encodes/decodes per second (JSON::XS uses 1241It shows the number of encodes/decodes per second (JSON::XS uses
1101the functional interface, while JSON::XS/2 uses the OO interface 1242the functional interface, while JSON::XS/2 uses the OO interface
1102with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables 1243with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
1103shrink). Higher is better: 1244shrink). Higher is better:
1193=head1 THREADS 1334=head1 THREADS
1194 1335
1195This module is I<not> guaranteed to be thread safe and there are no 1336This module is I<not> guaranteed to be thread safe and there are no
1196plans to change this until Perl gets thread support (as opposed to the 1337plans to change this until Perl gets thread support (as opposed to the
1197horribly slow so-called "threads" which are simply slow and bloated 1338horribly slow so-called "threads" which are simply slow and bloated
1198process simulations - use fork, its I<much> faster, cheaper, better). 1339process simulations - use fork, it's I<much> faster, cheaper, better).
1199 1340
1200(It might actually work, but you have been warned). 1341(It might actually work, but you have been warned).
1201 1342
1202 1343
1203=head1 BUGS 1344=head1 BUGS
1204 1345
1205While the goal of this module is to be correct, that unfortunately does 1346While the goal of this module is to be correct, that unfortunately does
1206not mean its bug-free, only that I think its design is bug-free. It is 1347not mean it's bug-free, only that I think its design is bug-free. If you
1207still relatively early in its development. If you keep reporting bugs they 1348keep reporting bugs they will be fixed swiftly, though.
1208will be fixed swiftly, though.
1209 1349
1210Please refrain from using rt.cpan.org or any other bug reporting 1350Please refrain from using rt.cpan.org or any other bug reporting
1211service. I put the contact address into my modules for a reason. 1351service. I put the contact address into my modules for a reason.
1212 1352
1213=cut 1353=cut
1233 "--" => sub { $_[0] = ${$_[0]} - 1 }, 1373 "--" => sub { $_[0] = ${$_[0]} - 1 },
1234 fallback => 1; 1374 fallback => 1;
1235 1375
12361; 13761;
1237 1377
1378=head1 SEE ALSO
1379
1380The F<json_xs> command line utility for quick experiments.
1381
1238=head1 AUTHOR 1382=head1 AUTHOR
1239 1383
1240 Marc Lehmann <schmorp@schmorp.de> 1384 Marc Lehmann <schmorp@schmorp.de>
1241 http://home.schmorp.de/ 1385 http://home.schmorp.de/
1242 1386
