Comparing JSON-XS/README (file contents):
Revision 1.23 by root, Wed Mar 19 22:31:00 2008 UTC vs.
Revision 1.29 by root, Thu Feb 19 01:13:46 2009 UTC

32 primary goal is to be *correct* and its secondary goal is to be *fast*. 32 primary goal is to be *correct* and its secondary goal is to be *fast*.
33 To reach the latter goal it was written in C. 33 To reach the latter goal it was written in C.
34 34
35 Beginning with version 2.0 of the JSON module, when both JSON and 35 Beginning with version 2.0 of the JSON module, when both JSON and
36 JSON::XS are installed, then JSON will fall back on JSON::XS (this can 36 JSON::XS are installed, then JSON will fall back on JSON::XS (this can
37 be overriden) with no overhead due to emulation (by inheritign 37 be overridden) with no overhead due to emulation (by inheriting
38 constructor and methods). If JSON::XS is not available, it will fall 38 constructor and methods). If JSON::XS is not available, it will fall
39 back to the compatible JSON::PP module as backend, so using JSON instead 39 back to the compatible JSON::PP module as backend, so using JSON instead
40 of JSON::XS gives you a portable JSON API that can be fast when you need 40 of JSON::XS gives you a portable JSON API that can be fast when you need
41 and doesn't require a C compiler when that is a problem. 41 and doesn't require a C compiler when that is a problem.
42 42
44 to write yet another JSON module? While it seems there are many JSON 44 to write yet another JSON module? While it seems there are many JSON
45 modules, none of them correctly handle all corner cases, and in most 45 modules, none of them correctly handle all corner cases, and in most
46 cases their maintainers are unresponsive, gone missing, or not listening 46 cases their maintainers are unresponsive, gone missing, or not listening
47 to bug reports for other reasons. 47 to bug reports for other reasons.
48 48
49 See COMPARISON, below, for a comparison to some other JSON modules.
50
51 See MAPPING, below, on how JSON::XS maps perl values to JSON values and 49 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
52 vice versa. 50 vice versa.
53 51
54 FEATURES 52 FEATURES
55 * correct Unicode handling 53 * correct Unicode handling
57 This module knows how to handle Unicode, documents how and when it 55 This module knows how to handle Unicode, documents how and when it
58 does so, and even documents what "correct" means. 56 does so, and even documents what "correct" means.
59 57
60 * round-trip integrity 58 * round-trip integrity
61 59
62 When you serialise a perl data structure using only datatypes 60 When you serialise a perl data structure using only data types
63 supported by JSON, the deserialised data structure is identical on 61 supported by JSON, the deserialised data structure is identical on
64 the Perl level. (e.g. the string "2.0" doesn't suddenly become "2" 62 the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"
65 just because it looks like a number). There *are* minor exceptions 63 just because it looks like a number). There *are* minor exceptions
66 to this, read the MAPPING section below to learn about those. 64 to this, read the MAPPING section below to learn about those.
67 65
78 too. 76 too.
79 77
80 * simple to use 78 * simple to use
81 79
82 This module has both a simple functional interface as well as an 80 This module has both a simple functional interface as well as an
83 objetc oriented interface interface. 81 object oriented interface interface.
84 82
85 * reasonably versatile output formats 83 * reasonably versatile output formats
86 84
87 You can choose between the most compact guaranteed-single-line 85 You can choose between the most compact guaranteed-single-line
88 format possible (nice for simple line-based protocols), a pure-ascii 86 format possible (nice for simple line-based protocols), a pure-ASCII
89 format (for when your transport is not 8-bit clean, still supports 87 format (for when your transport is not 8-bit clean, still supports
90 the whole Unicode range), or a pretty-printed format (for when you 88 the whole Unicode range), or a pretty-printed format (for when you
91 want to read that stuff). Or you can combine those features in 89 want to read that stuff). Or you can combine those features in
92 whatever way you like. 90 whatever way you like.
93 91
101 99
102 This function call is functionally identical to: 100 This function call is functionally identical to:
103 101
104 $json_text = JSON::XS->new->utf8->encode ($perl_scalar) 102 $json_text = JSON::XS->new->utf8->encode ($perl_scalar)
105 103
106 except being faster. 104 Except being faster.
107 105
108 $perl_scalar = decode_json $json_text 106 $perl_scalar = decode_json $json_text
109 The opposite of "encode_json": expects an UTF-8 (binary) string and 107 The opposite of "encode_json": expects an UTF-8 (binary) string and
110 tries to parse that as an UTF-8 encoded JSON text, returning the 108 tries to parse that as an UTF-8 encoded JSON text, returning the
111 resulting reference. Croaks on error. 109 resulting reference. Croaks on error.
112 110
113 This function call is functionally identical to: 111 This function call is functionally identical to:
114 112
115 $perl_scalar = JSON::XS->new->utf8->decode ($json_text) 113 $perl_scalar = JSON::XS->new->utf8->decode ($json_text)
116 114
117 except being faster. 115 Except being faster.
118 116
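To make the equivalence above concrete, here is a small sketch that runs the same value through both interfaces and compares the results. The sample data is made up for illustration; an array is used at the top level so the output order is stable.

```perl
use strict;
use warnings;
use JSON::XS;

# sample data (illustrative only)
my $data = [1, "two", { three => 3 }];

# functional interface
my $text_func = encode_json $data;

# the equivalent object-oriented call
my $text_oo = JSON::XS->new->utf8->encode ($data);

print "same text: $text_func\n" if $text_func eq $text_oo;

# decoding works the same way in both interfaces
my $back_func = decode_json $text_func;
my $back_oo   = JSON::XS->new->utf8->decode ($text_oo);

print "round trip ok\n"
   if $back_func->[1] eq "two" && $back_oo->[2]{three} == 3;
```

The functional form is the convenient choice when the default settings are enough; the object form is what you need once you want any of the options described further down.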
119 $is_boolean = JSON::XS::is_bool $scalar 117 $is_boolean = JSON::XS::is_bool $scalar
120 Returns true if the passed scalar represents either JSON::XS::true 118 Returns true if the passed scalar represents either JSON::XS::true
121 or JSON::XS::false, two constants that act like 1 and 0, 119 or JSON::XS::false, two constants that act like 1 and 0,
122 respectively and are used to represent JSON "true" and "false" 120 respectively and are used to represent JSON "true" and "false"
152 150
153 If you didn't know about that flag, just the better, pretend it 151 If you didn't know about that flag, just the better, pretend it
154 doesn't exist. 152 doesn't exist.
155 153
156 4. A "Unicode String" is simply a string where each character can be 154 4. A "Unicode String" is simply a string where each character can be
157 validly interpreted as a Unicode codepoint. 155 validly interpreted as a Unicode code point.
158 If you have UTF-8 encoded data, it is no longer a Unicode string, 156 If you have UTF-8 encoded data, it is no longer a Unicode string,
159 but a Unicode string encoded in UTF-8, giving you a binary string. 157 but a Unicode string encoded in UTF-8, giving you a binary string.
160 158
161 5. A string containing "high" (> 255) character values is *not* a UTF-8 159 5. A string containing "high" (> 255) character values is *not* a UTF-8
162 string. 160 string.
397 Example, encode a Perl scalar as JSON value with enabled 395 Example, encode a Perl scalar as JSON value with enabled
398 "allow_nonref", resulting in an invalid JSON text: 396 "allow_nonref", resulting in an invalid JSON text:
399 397
400 JSON::XS->new->allow_nonref->encode ("Hello, World!") 398 JSON::XS->new->allow_nonref->encode ("Hello, World!")
401 => "Hello, World!" 399 => "Hello, World!"
400
401 $json = $json->allow_unknown ([$enable])
402 $enabled = $json->get_allow_unknown
403 If $enable is true (or missing), then "encode" will *not* throw an
404 exception when it encounters values it cannot represent in JSON (for
405 example, filehandles) but instead will encode a JSON "null" value.
406 Note that blessed objects are not included here and are handled
407 separately by "allow_blessed".
408
409 If $enable is false (the default), then "encode" will throw an
410 exception when it encounters anything it cannot encode as JSON.
411
412 This option does not affect "decode" in any way, and it is
413 recommended to leave it off unless you know your communications
414 partner.
402 415
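As a rough illustration of "allow_unknown", the sketch below encodes a structure containing a code reference, which has no JSON representation. The data is invented for the example, and a code reference is used instead of a filehandle only because it is easy to create inline.

```perl
use strict;
use warnings;
use JSON::XS;

# a code reference cannot be represented in JSON
my $data = [1, sub { 42 }, "three"];

# default behaviour: encode croaks on the unrepresentable value
my $strict = eval { JSON::XS->new->encode ($data) };
print "default: croaked\n" unless defined $strict;

# with allow_unknown, the value is replaced by a JSON null
my $lenient = JSON::XS->new->allow_unknown->encode ($data);
print "allow_unknown: $lenient\n";
```

Keep in mind that the receiver cannot distinguish such a null from a real undef in your data, which is one reason to leave the option off by default.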
403 $json = $json->allow_blessed ([$enable]) 416 $json = $json->allow_blessed ([$enable])
404 $enabled = $json->get_allow_blessed 417 $enabled = $json->get_allow_blessed
405 If $enable is true (or missing), then the "encode" method will not 418 If $enable is true (or missing), then the "encode" method will not
406 barf when it encounters a blessed reference. Instead, the value of 419 barf when it encounters a blessed reference. Instead, the value of
541 saving space. 554 saving space.
542 555
543 $json = $json->max_depth ([$maximum_nesting_depth]) 556 $json = $json->max_depth ([$maximum_nesting_depth])
544 $max_depth = $json->get_max_depth 557 $max_depth = $json->get_max_depth
545 Sets the maximum nesting level (default 512) accepted while encoding 558 Sets the maximum nesting level (default 512) accepted while encoding
546 or decoding. If the JSON text or Perl data structure has an equal or 559 or decoding. If a higher nesting level is detected in JSON text or a
547 higher nesting level then this limit, then the encoder and decoder 560 Perl data structure, then the encoder and decoder will stop and
548 will stop and croak at that point. 561 croak at that point.
549 562
550 Nesting level is defined by number of hash- or arrayrefs that the 563 Nesting level is defined by number of hash- or arrayrefs that the
551 encoder needs to traverse to reach a given point or the number of 564 encoder needs to traverse to reach a given point or the number of
552 "{" or "[" characters without their matching closing parenthesis 565 "{" or "[" characters without their matching closing parenthesis
553 crossed to reach a given character in a string. 566 crossed to reach a given character in a string.
554 567
555 Setting the maximum depth to one disallows any nesting, so that 568 Setting the maximum depth to one disallows any nesting, so that
556 ensures that the object is only a single hash/object or array. 569 ensures that the object is only a single hash/object or array.
557 570
558 The argument to "max_depth" will be rounded up to the next highest
559 power of two. If no argument is given, the highest possible setting 571 If no argument is given, the highest possible setting will be used,
560 will be used, which is rarely useful. 572 which is rarely useful.
573
574 Note that nesting is implemented by recursion in C. The default
575 value has been chosen to be as large as typical operating systems
576 allow without crashing.
561 577
562 See SECURITY CONSIDERATIONS, below, for more info on why this is 578 See SECURITY CONSIDERATIONS, below, for more info on why this is
563 useful. 579 useful.
564 580
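A minimal sketch of the nesting limit in action follows. The limit of 2 and the sample texts are chosen purely for illustration, and deliberately stay well clear of the exact boundary.

```perl
use strict;
use warnings;
use JSON::XS;

# limit chosen for illustration only
my $json = JSON::XS->new->max_depth (2);

# a flat array is well within the limit
my $ok = eval { $json->decode ('[1]') };
print "shallow: decoded\n" if $ok;

# five levels of nesting exceed it, so decode croaks
my $deep = eval { $json->decode ('[[[[[1]]]]]') };
print "deep: rejected\n" unless defined $deep;
```

Wrapping "decode" in "eval" as above is the usual way to turn the croak into something your program can handle when the input comes from an untrusted peer.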
565 $json = $json->max_size ([$maximum_string_size]) 581 $json = $json->max_size ([$maximum_string_size])
566 $max_size = $json->get_max_size 582 $max_size = $json->get_max_size
567 Set the maximum length a JSON text may have (in bytes) where 583 Set the maximum length a JSON text may have (in bytes) where
568 decoding is being attempted. The default is 0, meaning no limit. 584 decoding is being attempted. The default is 0, meaning no limit.
569 When "decode" is called on a string longer then this number of 585 When "decode" is called on a string that is longer than this many
570 characters it will not attempt to decode the string but throw an 586 bytes, it will not attempt to decode the string but throw an
571 exception. This setting has no effect on "encode" (yet). 587 exception. This setting has no effect on "encode" (yet).
572 588
573 The argument to "max_size" will be rounded up to the next highest
574 power of two (so may be more than requested). If no argument is
575 given, the limit check will be deactivated (same as when 0 is 589 If no argument is given, the limit check will be deactivated (same
576 specified). 590 as when 0 is specified).
577 591
578 See SECURITY CONSIDERATIONS, below, for more info on why this is 592 See SECURITY CONSIDERATIONS, below, for more info on why this is
579 useful. 593 useful.
580 594
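The size limit can be tried out the same way. In this sketch the limit of 32 bytes and both sample texts are assumptions made for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# limit in bytes, picked for the example
my $json = JSON::XS->new->max_size (32);

my $small = '[1,2,3]';                           # 7 bytes
my $large = '[' . join (',', 1 .. 100) . ']';    # far more than 32 bytes

my $ok = eval { $json->decode ($small) };
print "small: decoded\n" if $ok;

# the oversized text is rejected before any parsing is attempted
my $too_big = eval { $json->decode ($large) };
print "large: rejected\n" unless defined $too_big;
```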
581 $json_text = $json->encode ($perl_scalar) 595 $json_text = $json->encode ($perl_scalar)
605 protocol (which is not the brightest thing to do in the first place) 619 protocol (which is not the brightest thing to do in the first place)
606 and you need to know where the JSON text ends. 620 and you need to know where the JSON text ends.
607 621
608 JSON::XS->new->decode_prefix ("[1] the tail") 622 JSON::XS->new->decode_prefix ("[1] the tail")
609 => ([1], 3) 623 => ([1], 3)
624
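Here is a short sketch of how the two return values of "decode_prefix" might be used to split a string into the decoded data and the remaining tail. The variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

my $text = '[1] the tail';

# list context: the decoded data and the number of characters consumed
my ($data, $chars) = JSON::XS->new->decode_prefix ($text);

# everything after the JSON text
my $tail = substr $text, $chars;

print "first element: $data->[0]\n";
print "consumed $chars characters, tail is '$tail'\n";
```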
625INCREMENTAL PARSING
626 In some cases, there is the need for incremental parsing of JSON texts.
627 While this module always has to keep both JSON text and resulting Perl
628 data structure in memory at one time, it does allow you to parse a JSON
629 stream incrementally. It does so by accumulating text until it has a
630 full JSON object, which it then can decode. This process is similar to
631 using "decode_prefix" to see if a full JSON object is available, but is
632 much more efficient (and can be implemented with a minimum of method
633 calls).
634
635 JSON::XS will only attempt to parse the JSON text once it is sure it has
636 enough text to get a decisive result, using a very simple but truly
637 incremental parser. This means that it sometimes won't stop as early as
638 the full parser, for example, it doesn't detect parenthesis mismatches.
639 The only thing it guarantees is that it starts decoding as soon as a
640 syntactically valid JSON text has been seen. This means you need to set
641 resource limits (e.g. "max_size") to ensure the parser will stop parsing
642 in the presence of syntax errors.
643
644 The following methods implement this incremental parser.
645
646 [void, scalar or list context] = $json->incr_parse ([$string])
647 This is the central parsing function. It can both append new text
648 and extract objects from the stream accumulated so far (both of
649 these functions are optional).
650
651 If $string is given, then this string is appended to the already
652 existing JSON fragment stored in the $json object.
653
654 After that, if the function is called in void context, it will
655 simply return without doing anything further. This can be used to
656 add more text in as many chunks as you want.
657
658 If the method is called in scalar context, then it will try to
659 extract exactly *one* JSON object. If that is successful, it will
660 return this object, otherwise it will return "undef". If there is a
661 parse error, this method will croak just as "decode" would do (one
662 can then use "incr_skip" to skip the erroneous part). This is the
663 most common way of using the method.
664
665 And finally, in list context, it will try to extract as many objects
666 from the stream as it can find and return them, or the empty list
667 otherwise. For this to work, there must be no separators between the
668 JSON objects or arrays, instead they must be concatenated
669 back-to-back. If an error occurs, an exception will be raised as in
670 the scalar context case. Note that in this case, any
671 previously-parsed JSON texts will be lost.
672
673 $lvalue_string = $json->incr_text
674 This method returns the currently stored JSON fragment as an lvalue,
675 that is, you can manipulate it. This *only* works when a preceding
676 call to "incr_parse" in *scalar context* successfully returned an
677 object. Under all other circumstances you must not call this
678 function (I mean it. although in simple tests it might actually
679 work, it *will* fail under real world conditions). As a special
680 exception, you can also call this method before having parsed
681 anything.
682
683 This function is useful in two cases: a) finding the trailing text
684 after a JSON object or b) parsing multiple JSON objects separated by
685 non-JSON text (such as commas).
686
687 $json->incr_skip
688 This will reset the state of the incremental parser and will remove
689 the parsed text from the input buffer so far. This is useful after
690 "incr_parse" died, in which case the input buffer and incremental
691 parser state is left unchanged, to skip the text parsed so far and
692 to reset the parse state.
693
694 The difference to "incr_reset" is that only text until the parse
695 error occurred is removed.
696
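A hedged sketch of error recovery with "incr_skip": the input below is invented, and consists of a malformed array (missing comma) followed directly by a valid one. The idea is to catch the croak, skip the text scanned so far, and carry on with whatever follows.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# "[1 2]" is malformed, "[3]" is fine
$json->incr_parse ('[1 2][3]');   # void context, so no parsing

# scalar context: tries to extract one object and croaks on the bad one
my $first = eval { $json->incr_parse };

if (!defined $first && $@) {
   # drop the text that was scanned up to the error and reset the state
   $json->incr_skip;
}

# the parser can now continue with the remaining text
my $second = $json->incr_parse;
print "recovered element: $second->[0]\n" if $second;
```

Whether skipping is the right reaction depends on your protocol; for many of them, closing the connection after a syntax error is the saner choice.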
697 $json->incr_reset
698 This completely resets the incremental parser, that is, after this
699 call, it will be as if the parser had never parsed anything.
700
701 This is useful if you want to repeatedly parse JSON objects and want
702 to ignore any trailing data, which means you have to reset the
703 parser after each successful decode.
704
705 LIMITATIONS
706 All options that affect decoding are supported, except "allow_nonref".
707 The reason for this is that it cannot be made to work sensibly: JSON
708 objects and arrays are self-delimited, i.e. you can concatenate them
709 back to back and still decode them perfectly. This does not hold true
710 for JSON numbers, however.
711
712 For example, is the string 1 a single JSON number, or is it simply the
713 start of 12? Or is 12 a single JSON number, or the concatenation of 1
714 and 2? In neither case you can tell, and this is why JSON::XS takes the
715 conservative route and disallows this case.
716
717 EXAMPLES
718 Some examples will make all this clearer. First, a simple example that
719 works similarly to "decode_prefix": We want to decode the JSON object at
720 the start of a string and identify the portion after the JSON object:
721
722 my $text = "[1,2,3] hello";
723
724 my $json = new JSON::XS;
725
726 my $obj = $json->incr_parse ($text)
727 or die "expected JSON object or array at beginning of string";
728
729 my $tail = $json->incr_text;
730 # $tail now contains " hello"
731
732 Easy, isn't it?
733
734 Now for a more complicated example: Imagine a hypothetical protocol
735 where you read some requests from a TCP stream, and each request is a
736 JSON array, without any separation between them (in fact, it is often
737 useful to use newlines as "separators", as these get interpreted as
738 whitespace at the start of the JSON text, which makes it possible to
739 test said protocol with "telnet"...).
740
741 Here is how you'd do it (it is trivial to write this in an event-based
742 manner):
743
744 my $json = new JSON::XS;
745
746 # read some data from the socket
747 while (sysread $socket, my $buf, 4096) {
748
749 # split and decode as many requests as possible
750 for my $request ($json->incr_parse ($buf)) {
751 # act on the $request
752 }
753 }
754
755 Another complicated example: Assume you have a string with JSON objects
756 or arrays, all separated by (optional) comma characters (e.g. "[1],[2],
757 [3]"). To parse them, we have to skip the commas between the JSON texts,
758 and here is where the lvalue-ness of "incr_text" comes in useful:
759
760 my $text = "[1],[2], [3]";
761 my $json = new JSON::XS;
762
763 # void context, so no parsing done
764 $json->incr_parse ($text);
765
766 # now extract as many objects as possible. note the
767 # use of scalar context so incr_text can be called.
768 while (my $obj = $json->incr_parse) {
769 # do something with $obj
770
771 # now skip the optional comma
772 $json->incr_text =~ s/^ \s* , //x;
773 }
774
775 Now let's go for a very complex example: Assume that you have a gigantic
776 JSON array-of-objects, many gigabytes in size, and you want to parse it,
777 but you cannot load it into memory fully (this has actually happened in
778 the real world :).
779
780 Well, you lost, you have to implement your own JSON parser. But JSON::XS
781 can still help you: You implement a (very simple) array parser and let
782 JSON decode the array elements, which are all full JSON objects on their
783 own (this wouldn't work if the array elements could be JSON numbers, for
784 example):
785
786 my $json = new JSON::XS;
787
788 # open the monster
789 open my $fh, "<bigfile.json"
790 or die "bigfile: $!";
791
792 # first parse the initial "["
793 for (;;) {
794 sysread $fh, my $buf, 65536
795 or die "read error: $!";
796 $json->incr_parse ($buf); # void context, so no parsing
797
798 # Exit the loop once we found and removed(!) the initial "[".
799 # In essence, we are (ab-)using the $json object as a simple scalar
800 # we append data to.
801 last if $json->incr_text =~ s/^ \s* \[ //x;
802 }
803
804 # now we have skipped the initial "[", so continue
805 # parsing all the elements.
806 for (;;) {
807 # in this loop we read data until we got a single JSON object
808 for (;;) {
809 if (my $obj = $json->incr_parse) {
810 # do something with $obj
811 last;
812 }
813
814 # add more data
815 sysread $fh, my $buf, 65536
816 or die "read error: $!";
817 $json->incr_parse ($buf); # void context, so no parsing
818 }
819
820 # in this loop we read data until we either found and parsed the
821 # separating "," between elements, or the final "]"
822 for (;;) {
823 # first skip whitespace
824 $json->incr_text =~ s/^\s*//;
825
826 # if we find "]", we are done
827 if ($json->incr_text =~ s/^\]//) {
828 print "finished.\n";
829 exit;
830 }
831
832 # if we find ",", we can continue with the next element
833 if ($json->incr_text =~ s/^,//) {
834 last;
835 }
836
837 # if we find anything else, we have a parse error!
838 if (length $json->incr_text) {
839 die "parse error near ", $json->incr_text;
840 }
841
842 # else add more data
843 sysread $fh, my $buf, 65536
844 or die "read error: $!";
845 $json->incr_parse ($buf); # void context, so no parsing
846 }
847
848 This is a complex example, but most of the complexity comes from the
849 fact that we are trying to be correct (bear with me if I am wrong, I
850 never ran the above example :).
610 851
611MAPPING 852MAPPING
612 This section describes how JSON::XS maps Perl values to JSON values and 853 This section describes how JSON::XS maps Perl values to JSON values and
613 vice versa. These mappings are designed to "do the right thing" in most 854 vice versa. These mappings are designed to "do the right thing" in most
614 circumstances automatically, preserving round-tripping characteristics 855 circumstances automatically, preserving round-tripping characteristics
687 an exception to be thrown, except for references to the integers 0 928 an exception to be thrown, except for references to the integers 0
688 and 1, which get turned into "false" and "true" atoms in JSON. You 929 and 1, which get turned into "false" and "true" atoms in JSON. You
689 can also use "JSON::XS::false" and "JSON::XS::true" to improve 930 can also use "JSON::XS::false" and "JSON::XS::true" to improve
690 readability. 931 readability.
691 932
692 encode_json [\0,JSON::XS::true] # yields [false,true] 933 encode_json [\0, JSON::XS::true] # yields [false,true]
693 934
694 JSON::XS::true, JSON::XS::false 935 JSON::XS::true, JSON::XS::false
695 These special values become JSON true and JSON false values, 936 These special values become JSON true and JSON false values,
696 respectively. You can also use "\1" and "\0" directly if you want. 937 respectively. You can also use "\1" and "\0" directly if you want.
697 938
734 $x += 0; # numify it, ensuring it will be dumped as a number 975 $x += 0; # numify it, ensuring it will be dumped as a number
735 $x *= 1; # same thing, the choice is yours. 976 $x *= 1; # same thing, the choice is yours.
736 977
737 You can not currently force the type in other, less obscure, ways. 978 You can not currently force the type in other, less obscure, ways.
738 Tell me if you need this capability (but don't forget to explain why 979 Tell me if you need this capability (but don't forget to explain why
739 its needed :). 980 it's needed :).
740 981
741ENCODING/CODESET FLAG NOTES 982ENCODING/CODESET FLAG NOTES
742 The interested reader might have seen a number of flags that signify 983 The interested reader might have seen a number of flags that signify
743 encodings or codesets - "utf8", "latin1" and "ascii". There seems to be 984 encodings or codesets - "utf8", "latin1" and "ascii". There seems to be
744 some confusion on what these do, so here is a short comparison: 985 some confusion on what these do, so here is a short comparison:
745 986
746 "utf8" controls wether the JSON text created by "encode" (and expected 987 "utf8" controls whether the JSON text created by "encode" (and expected
747 by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only 988 by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
748 control wether "encode" escapes character values outside their 989 control whether "encode" escapes character values outside their
749 respective codeset range. Neither of these flags conflict with each 990 respective codeset range. Neither of these flags conflict with each
750 other, although some combinations make less sense than others. 991 other, although some combinations make less sense than others.
751 992
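To see what the three flags actually do to the output, the following sketch encodes the same one-character string with each of them and then decodes it again with the same flag. The sample character (U+00FC) is an arbitrary choice for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# one character above 127 but below 256
my $data = ["\x{fc}"];

my $ascii  = JSON::XS->new->ascii->encode ($data);    # escaped as \u00fc
my $latin1 = JSON::XS->new->latin1->encode ($data);   # single byte 0xFC
my $utf8   = JSON::XS->new->utf8->encode ($data);     # two bytes 0xC3 0xBC

printf "%-6s %2d octets/characters\n", "ascii",  length $ascii;
printf "%-6s %2d octets/characters\n", "latin1", length $latin1;
printf "%-6s %2d octets/characters\n", "utf8",   length $utf8;

# decoding with the same flag gives back the original string each time
for ([ascii => $ascii], [latin1 => $latin1], [utf8 => $utf8]) {
   my ($flag, $text) = @$_;
   my $back = JSON::XS->new->$flag->decode ($text);
   print "$flag: round trip ok\n" if $back->[0] eq "\x{fc}";
}
```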
752 Care has been taken to make all flags symmetrical with respect to 993 Care has been taken to make all flags symmetrical with respect to
753 "encode" and "decode", that is, texts encoded with any combination of 994 "encode" and "decode", that is, texts encoded with any combination of
831 structure back. This is useful when your channel for JSON transfer 1072 structure back. This is useful when your channel for JSON transfer
832 is not 8-bit clean or the encoding might be mangled in between (e.g. 1073 is not 8-bit clean or the encoding might be mangled in between (e.g.
833 in mail), and works because ASCII is a proper subset of most 8-bit 1074 in mail), and works because ASCII is a proper subset of most 8-bit
834 and multibyte encodings in use in the world. 1075 and multibyte encodings in use in the world.
835 1076
836COMPARISON 1077 JSON and ECMAscript
837 As already mentioned, this module was created because none of the 1078 JSON syntax is based on how literals are represented in javascript (the
838 existing JSON modules could be made to work correctly. First I will 1079 not-standardised predecessor of ECMAscript) which is presumably why it
839 describe the problems (or pleasures) I encountered with various existing 1080 is called "JavaScript Object Notation".
840 JSON modules, followed by some benchmark values. JSON::XS was designed
841 not to suffer from any of these problems or limitations.
842 1081
843 JSON 2.xx 1082 However, JSON is not a subset (and also not a superset of course) of
844 A marvellous piece of engineering, this module either uses JSON::XS 1083 ECMAscript (the standard) or javascript (whatever browsers actually
845 directly when available (so will be 100% compatible with it, 1084 implement).
846 including speed), or it uses JSON::PP, which is basically JSON::XS
847 translated to Pure Perl, which should be 100% compatible with
848 JSON::XS, just a bit slower.
849 1085
850 You cannot really lose by using this module, especially as it tries 1086 If you want to use javascript's "eval" function to "parse" JSON, you
851 very hard to work even with ancient Perl versions, while JSON::XS 1087 might run into parse errors for valid JSON texts, or the resulting data
852 does not. 1088 structure might not be queryable:
853 1089
854 JSON 1.07 1090 One of the problems is that U+2028 and U+2029 are valid characters
855 Slow (but very portable, as it is written in pure Perl). 1091 inside JSON strings, but are not allowed in ECMAscript string literals,
1092 so the following Perl fragment will not output something that can be
1093 guaranteed to be parsable by javascript's "eval":
856 1094
857 Undocumented/buggy Unicode handling (how JSON handles Unicode values 1095 use JSON::XS;
858 is undocumented. One can get far by feeding it Unicode strings and
859 doing en-/decoding oneself, but Unicode escapes are not working
860 properly).
861 1096
862 No round-tripping (strings get clobbered if they look like numbers, 1097 print encode_json [chr 0x2028];
863 e.g. the string 2.0 will encode to 2.0 instead of "2.0", and that
864 will decode into the number 2.
865 1098
866 JSON::PC 0.01 1099 The right fix for this is to use a proper JSON parser in your javascript
867 Very fast. 1100 programs, and not rely on "eval" (see for example Douglas Crockford's
1101 json2.js parser).
868 1102
869 Undocumented/buggy Unicode handling. 1103 If this is not an option, you can, as a stop-gap measure, simply encode
1104 to ASCII-only JSON:
870 1105
871 No round-tripping. 1106 use JSON::XS;
872 1107
873 Has problems handling many Perl values (e.g. regex results and other 1108 print JSON::XS->new->ascii->encode ([chr 0x2028]);
874 magic values will make it croak).
875 1109
876 Does not even generate valid JSON ("{1,2}" gets converted to "{1:2}" 1110 Note that this will enlarge the resulting JSON text quite a bit if you
877 which is not a valid JSON text. 1111 have many non-ASCII characters. You might be tempted to run some regexes
1112 to only escape U+2028 and U+2029, e.g.:
878 1113
879 Unmaintained (maintainer unresponsive for many months, bugs are not 1114 # DO NOT USE THIS!
880 getting fixed). 1115 my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
1116 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1117 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1118 print $json;
881 1119
882 JSON::Syck 0.21 1120 Note that *this is a bad idea*: the above only works for U+2028 and
883 Very buggy (often crashes). 1121 U+2029 and thus only for fully ECMAscript-compliant parsers. Many
1122 existing javascript implementations, however, have issues with other
1123 characters as well - using "eval" naively simply *will* cause problems.
884 1124
885 Very inflexible (no human-readable format supported, format pretty 1125 Another problem is that some javascript implementations reserve some
886 much undocumented. I need at least a format for easy reading by 1126 property names for their own purposes (which probably makes them
887 humans and a single-line compact format for use in a protocol, and 1127 non-ECMAscript-compliant). For example, Iceweasel reserves the
888 preferably a way to generate ASCII-only JSON texts). 1128 "__proto__" property name for its own purposes.
889 1129
890 Completely broken (and confusingly documented) Unicode handling 1130 If that is a problem, you could try to filter the resulting JSON
891 (Unicode escapes are not working properly, you need to set 1131 output for these property strings, e.g.:
892 ImplicitUnicode to *different* values on en- and decoding to get
893 symmetric behaviour).
894 1132
895 No round-tripping (simple cases work, but this depends on whether 1133 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
896 the scalar value was used in a numeric context or not).
897 1134
898 Dumping hashes may skip hash values depending on iterator state. 1135 This works because "__proto__" is not valid outside of strings, so every
1136 occurrence of ""__proto__"\s*:" must be a string used as property name.
899 1137
900 Unmaintained (maintainer unresponsive for many months, bugs are not 1138 If you know of other incompatibilities, please let me know.
901 getting fixed).
902
903 Does not check input for validity (i.e. will accept non-JSON input
904 and return "something" instead of raising an exception. This is a
905 security issue: imagine two banks transferring money between each
906 other using JSON. One bank might parse a given non-JSON request and
907 deduct money, while the other might reject the transaction with a
908 syntax error. While a good protocol will at least recover, that is
909 extra unnecessary work and the transaction will still not succeed).
910
911 JSON::DWIW 0.04
912 Very fast. Very natural. Very nice.
913
914 Undocumented Unicode handling (but the best of the pack. Unicode
915 escapes still don't get parsed properly).
916
917 Very inflexible.
918
919 No round-tripping.
920
921 Does not generate valid JSON texts (key strings are often unquoted,
922 empty keys result in nothing being output)
923
924 Does not check input for validity.
925 1139
926 JSON and YAML 1140 JSON and YAML
927 You often hear that JSON is a subset of YAML. This is, however, a mass 1141 You often hear that JSON is a subset of YAML. This is, however, a mass
928 hysteria(*) and very far from the truth (as of the time of this 1142 hysteria(*) and very far from the truth (as of the time of this
929 writing), so let me state it clearly: *in general, there is no way to 1143 writing), so let me state it clearly: *in general, there is no way to
977 1191
978 First comes a comparison between various modules using a very short 1192 First comes a comparison between various modules using a very short
979 single-line JSON string (also available at 1193 single-line JSON string (also available at
980 <http://dist.schmorp.de/misc/json/short.json>). 1194 <http://dist.schmorp.de/misc/json/short.json>).
981 1195
982 {"method": "handleMessage", "params": ["user1", "we were just talking"], \ 1196 {"method": "handleMessage", "params": ["user1",
983 "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} 1197 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1198 true, false]}
984 1199
985 It shows the number of encodes/decodes per second (JSON::XS uses the 1200 It shows the number of encodes/decodes per second (JSON::XS uses the
986 functional interface, while JSON::XS/2 uses the OO interface with 1201 functional interface, while JSON::XS/2 uses the OO interface with
987 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink). 1202 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink).
988 Higher is better: 1203 Higher is better:
1075 1290
1076THREADS 1291THREADS
1077 This module is *not* guaranteed to be thread safe and there are no plans 1292 This module is *not* guaranteed to be thread safe and there are no plans
1078 to change this until Perl gets thread support (as opposed to the 1293 to change this until Perl gets thread support (as opposed to the
1079 horribly slow so-called "threads" which are simply slow and bloated 1294 horribly slow so-called "threads" which are simply slow and bloated
1080 process simulations - use fork, its *much* faster, cheaper, better). 1295 process simulations - use fork, it's *much* faster, cheaper, better).
1081 1296
1082 (It might actually work, but you have been warned). 1297 (It might actually work, but you have been warned).
1083 1298
1084BUGS 1299BUGS
1085 While the goal of this module is to be correct, that unfortunately does 1300 While the goal of this module is to be correct, that unfortunately does
1086 not mean its bug-free, only that I think its design is bug-free. It is 1301 not mean it's bug-free, only that I think its design is bug-free. If you
1087 still relatively early in its development. If you keep reporting bugs
1088 they will be fixed swiftly, though. 1302 keep reporting bugs they will be fixed swiftly, though.
1089 1303
1090 Please refrain from using rt.cpan.org or any other bug reporting 1304 Please refrain from using rt.cpan.org or any other bug reporting
1091 service. I put the contact address into my modules for a reason. 1305 service. I put the contact address into my modules for a reason.
1306
1307SEE ALSO
1308 The json_xs command line utility for quick experiments.
1092 1309
1093AUTHOR 1310AUTHOR
1094 Marc Lehmann <schmorp@schmorp.de> 1311 Marc Lehmann <schmorp@schmorp.de>
1095 http://home.schmorp.de/ 1312 http://home.schmorp.de/
1096 1313
