… | |
… | |
103 | |
103 | |
104 | package JSON::XS; |
104 | package JSON::XS; |
105 | |
105 | |
106 | use strict; |
106 | use strict; |
107 | |
107 | |
108 | our $VERSION = '2.01'; |
108 | our $VERSION = '2.1'; |
109 | our @ISA = qw(Exporter); |
109 | our @ISA = qw(Exporter); |
110 | |
110 | |
111 | our @EXPORT = qw(encode_json decode_json to_json from_json); |
111 | our @EXPORT = qw(encode_json decode_json to_json from_json); |
112 | |
112 | |
113 | sub to_json($) { |
113 | sub to_json($) { |
… | |
… | |
245 | |
245 | |
246 | If C<$enable> is false, then the C<encode> method will not escape Unicode |
246 | If C<$enable> is false, then the C<encode> method will not escape Unicode |
247 | characters unless required by the JSON syntax or other flags. This results |
247 | characters unless required by the JSON syntax or other flags. This results |
248 | in a faster and more compact format. |
248 | in a faster and more compact format. |
249 | |
249 | |
|
|
250 | See also the section I<ENCODING/CODESET FLAG NOTES> later in this |
|
|
251 | document. |
|
|
252 | |
250 | The main use for this flag is to produce JSON texts that can be |
253 | The main use for this flag is to produce JSON texts that can be |
251 | transmitted over a 7-bit channel, as the encoded JSON texts will not |
254 | transmitted over a 7-bit channel, as the encoded JSON texts will not |
252 | contain any 8-bit characters. |
255 | contain any 8-bit characters. |
253 | |
256 | |
254 | JSON::XS->new->ascii (1)->encode ([chr 0x10401]) |
257 | JSON::XS->new->ascii (1)->encode ([chr 0x10401]) |
… | |
… | |
265 | will not be affected in any way by this flag, as C<decode> by default |
268 | will not be affected in any way by this flag, as C<decode> by default |
266 | expects Unicode, which is a strict superset of latin1. |
269 | expects Unicode, which is a strict superset of latin1. |
267 | |
270 | |
268 | If C<$enable> is false, then the C<encode> method will not escape Unicode |
271 | If C<$enable> is false, then the C<encode> method will not escape Unicode |
269 | characters unless required by the JSON syntax or other flags. |
272 | characters unless required by the JSON syntax or other flags. |
|
|
273 | |
|
|
274 | See also the section I<ENCODING/CODESET FLAG NOTES> later in this |
|
|
275 | document. |
270 | |
276 | |
271 | The main use for this flag is efficiently encoding binary data as JSON |
277 | The main use for this flag is efficiently encoding binary data as JSON |
272 | text, as most octets will not be escaped, resulting in a smaller encoded |
278 | text, as most octets will not be escaped, resulting in a smaller encoded |
273 | size. The disadvantage is that the resulting JSON text is encoded |
279 | size. The disadvantage is that the resulting JSON text is encoded |
274 | in latin1 (and must correctly be treated as such when storing and |
280 | in latin1 (and must correctly be treated as such when storing and |
… | |
… | |
293 | |
299 | |
294 | If C<$enable> is false, then the C<encode> method will return the JSON |
300 | If C<$enable> is false, then the C<encode> method will return the JSON |
295 | string as a (non-encoded) Unicode string, while C<decode> expects thus a |
301 | string as a (non-encoded) Unicode string, while C<decode> expects thus a |
296 | Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs |
302 | Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs |
297 | to be done yourself, e.g. using the Encode module. |
303 | to be done yourself, e.g. using the Encode module. |
|
|
304 | |
|
|
305 | See also the section I<ENCODING/CODESET FLAG NOTES> later in this |
|
|
306 | document. |
298 | |
307 | |
299 | Example, output UTF-16BE-encoded JSON: |
308 | Example, output UTF-16BE-encoded JSON: |
300 | |
309 | |
301 | use Encode; |
310 | use Encode; |
302 | $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object); |
311 | $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object); |
… | |
… | |
816 | my $x = "3"; # some variable containing a string |
825 | my $x = "3"; # some variable containing a string |
817 | $x += 0; # numify it, ensuring it will be dumped as a number |
826 | $x += 0; # numify it, ensuring it will be dumped as a number |
818 | $x *= 1; # same thing, the choice is yours. |
827 | $x *= 1; # same thing, the choice is yours. |
819 | |
828 | |
820 | You can not currently force the type in other, less obscure, ways. Tell me |
829 | You can not currently force the type in other, less obscure, ways. Tell me |
821 | if you need this capability (but don't forget to explain why its needed |
830 | if you need this capability (but don't forget to explain why it's needed |
822 | :). |
831 | :). |
823 | |
832 | |
824 | =back |
833 | =back |
825 | |
834 | |
826 | |
835 | |
… | |
… | |
828 | |
837 | |
829 | The interested reader might have seen a number of flags that signify |
838 | The interested reader might have seen a number of flags that signify |
830 | encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be |
839 | encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be |
831 | some confusion on what these do, so here is a short comparison: |
840 | some confusion on what these do, so here is a short comparison: |
832 | |
841 | |
833 | C<utf8> controls wether the JSON text created by C<encode> (and expected |
842 | C<utf8> controls whether the JSON text created by C<encode> (and expected |
834 | by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only |
843 | by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only |
835 | control wether C<encode> escapes character values outside their respective |
844 | control whether C<encode> escapes character values outside their respective |
836 | codeset range. None of these flags conflict with each other, although |
845 | codeset range. None of these flags conflict with each other, although |
837 | some combinations make less sense than others. |
846 | some combinations make less sense than others. |
838 | |
847 | |
839 | Care has been taken to make all flags symmetrical with respect to |
848 | Care has been taken to make all flags symmetrical with respect to |
840 | C<encode> and C<decode>, that is, texts encoded with any combination of |
849 | C<encode> and C<decode>, that is, texts encoded with any combination of |
… | |
… | |
1021 | |
1030 | |
1022 | |
1031 | |
1023 | =head2 JSON and YAML |
1032 | =head2 JSON and YAML |
1024 | |
1033 | |
1025 | You often hear that JSON is a subset of YAML. This is, however, a mass |
1034 | You often hear that JSON is a subset of YAML. This is, however, a mass |
1026 | hysteria(*) and very far from the truth. In general, there is no way to |
1035 | hysteria(*) and very far from the truth (as of the time of this writing), |
|
|
1036 | so let me state it clearly: I<in general, there is no way to configure |
1027 | configure JSON::XS to output a data structure as valid YAML that works for |
1037 | JSON::XS to output a data structure as valid YAML> that works in all |
1028 | all cases. |
1038 | cases. |
1029 | |
1039 | |
1030 | If you really must use JSON::XS to generate YAML, you should use this |
1040 | If you really must use JSON::XS to generate YAML, you should use this |
1031 | algorithm (subject to change in future versions): |
1041 | algorithm (subject to change in future versions): |
1032 | |
1042 | |
1033 | my $to_yaml = JSON::XS->new->utf8->space_after (1); |
1043 | my $to_yaml = JSON::XS->new->utf8->space_after (1); |
… | |
… | |
1036 | This will I<usually> generate JSON texts that also parse as valid |
1046 | This will I<usually> generate JSON texts that also parse as valid |
1037 | YAML. Please note that YAML has hardcoded limits on (simple) object key |
1047 | YAML. Please note that YAML has hardcoded limits on (simple) object key |
1038 | lengths that JSON doesn't have and also has different and incompatible |
1048 | lengths that JSON doesn't have and also has different and incompatible |
1039 | Unicode handling, so you should make sure that your hash keys are |
1049 | Unicode handling, so you should make sure that your hash keys are |
1040 | noticeably shorter than the 1024 "stream characters" YAML allows and that |
1050 | noticeably shorter than the 1024 "stream characters" YAML allows and that |
1041 | you do not have codepoints with values outside the Unicode BMP (basic |
1051 | you do not have characters with codepoint values outside the Unicode BMP |
1042 | multilingual plane). YAML also does not allow C<\/> sequences in strings |
1052 | (basic multilingual plane). YAML also does not allow C<\/> sequences in |
1043 | (which JSON::XS does not I<currently> generate). |
1053 | strings (which JSON::XS does not I<currently> generate, but other JSON |
|
|
1054 | generators might). |
1044 | |
1055 | |
1045 | There might be other incompatibilities that I am not aware of (or the YAML |
1056 | There might be other incompatibilities that I am not aware of (or the YAML |
1046 | specification has been changed yet again - it does so quite often). In |
1057 | specification has been changed yet again - it does so quite often). In |
1047 | general you should not try to generate YAML with a JSON generator or vice |
1058 | general you should not try to generate YAML with a JSON generator or vice |
1048 | versa, or try to parse JSON with a YAML parser or vice versa: chances are |
1059 | versa, or try to parse JSON with a YAML parser or vice versa: chances are |
… | |
… | |
1051 | |
1062 | |
1052 | =over 4 |
1063 | =over 4 |
1053 | |
1064 | |
1054 | =item (*) |
1065 | =item (*) |
1055 | |
1066 | |
1056 | This is spread actively by the YAML team, however. For many years now they |
1067 | I have been pressured multiple times by Brian Ingerson (one of the |
1057 | claim YAML were a superset of JSON, even when proven otherwise. |
1068 | authors of the YAML specification) to remove this paragraph, despite him |
|
|
1069 | acknowledging that the actual incompatibilities exist. As I was personally |
|
|
1070 | bitten by this "JSON is YAML" lie, I refused and said I will continue to |
|
|
1071 | educate people about these issues, so others do not run into the same |
|
|
1072 | problem again and again. After this, Brian called me a (quote)I<complete |
|
|
1073 | and worthless idiot>(unquote). |
1058 | |
1074 | |
1059 | Even the author of this manpage was at some point accused of providing |
1075 | In my opinion, instead of pressuring and insulting people who actually |
1060 | "incorrect" information, despite the evidence presented (claims ranged |
1076 | clarify issues with YAML and the wrong statements of some of its |
1061 | from "your documentation contains inaccurate and negative statements about |
1077 | proponents, I would kindly suggest reading the JSON spec (which is not |
1062 | YAML" (the only negative comment is this footnote, and it didn't exist |
1078 | that difficult or long) and finally make YAML compatible to it, and |
1063 | back then; the question on which claims were inaccurate was never answered |
1079 | educating users about the changes, instead of spreading lies about the |
1064 | etc.) to "the YAML spec is not up-to-date" (the *real* and supposedly |
1080 | real compatibility for many I<years> and trying to silence people who |
1065 | JSON-compatible spec is apparently not currently publicly available) |
1081 | point out that it isn't true. |
1066 | to actual requests to replace this section by *incorrect* information, |
|
|
1067 | suppressing information about the real problem). |
|
|
1068 | |
|
|
1069 | So whenever you are told that YAML was a superset of JSON, first check |
|
|
1070 | wether it is really true (it might be when you check it, but it certainly |
|
|
1071 | was not true when this was written). I would much prefer if the YAML team |
|
|
1072 | would spent their time on actually making JSON compatibility a truth |
|
|
1073 | (JSON, after all, has a very small and simple specification) instead of |
|
|
1074 | trying to lobby/force people into reporting untruths. |
|
|
1075 | |
1082 | |
1076 | =back |
1083 | =back |
1077 | |
1084 | |
1078 | |
1085 | |
1079 | =head2 SPEED |
1086 | =head2 SPEED |
… | |
… | |
1081 | It seems that JSON::XS is surprisingly fast, as shown in the following |
1088 | It seems that JSON::XS is surprisingly fast, as shown in the following |
1082 | tables. They have been generated with the help of the C<eg/bench> program |
1089 | tables. They have been generated with the help of the C<eg/bench> program |
1083 | in the JSON::XS distribution, to make it easy to compare on your own |
1090 | in the JSON::XS distribution, to make it easy to compare on your own |
1084 | system. |
1091 | system. |
1085 | |
1092 | |
1086 | First comes a comparison between various modules using a very short |
1093 | First comes a comparison between various modules using |
1087 | single-line JSON string: |
1094 | a very short single-line JSON string (also available at |
|
|
1095 | L<http://dist.schmorp.de/misc/json/short.json>). |
1088 | |
1096 | |
1089 | {"method": "handleMessage", "params": ["user1", "we were just talking"], \ |
1097 | {"method": "handleMessage", "params": ["user1", "we were just talking"], \ |
1090 | "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} |
1098 | "id": null, "array":[1,11,234,-5,1e5,1e7, true, false]} |
1091 | |
1099 | |
1092 | It shows the number of encodes/decodes per second (JSON::XS uses |
1100 | It shows the number of encodes/decodes per second (JSON::XS uses |
… | |
… | |
1111 | about three times faster on decoding, and over forty times faster |
1119 | about three times faster on decoding, and over forty times faster |
1112 | than JSON, even with pretty-printing and key sorting. It also compares |
1120 | than JSON, even with pretty-printing and key sorting. It also compares |
1113 | favourably to Storable for small amounts of data. |
1121 | favourably to Storable for small amounts of data. |
1114 | |
1122 | |
1115 | Using a longer test string (roughly 18KB, generated from Yahoo! Locals |
1123 | Using a longer test string (roughly 18KB, generated from Yahoo! Locals |
1116 | search API (http://nanoref.com/yahooapis/mgPdGg)): |
1124 | search API (L<http://dist.schmorp.de/misc/json/long.json>)). |
1117 | |
1125 | |
1118 | module | encode | decode | |
1126 | module | encode | decode | |
1119 | -----------|------------|------------| |
1127 | -----------|------------|------------| |
1120 | JSON 1.x | 55.260 | 34.971 | |
1128 | JSON 1.x | 55.260 | 34.971 | |
1121 | JSON::DWIW | 825.228 | 1082.513 | |
1129 | JSON::DWIW | 825.228 | 1082.513 | |
… | |
… | |
1185 | =head1 THREADS |
1193 | =head1 THREADS |
1186 | |
1194 | |
1187 | This module is I<not> guaranteed to be thread safe and there are no |
1195 | This module is I<not> guaranteed to be thread safe and there are no |
1188 | plans to change this until Perl gets thread support (as opposed to the |
1196 | plans to change this until Perl gets thread support (as opposed to the |
1189 | horribly slow so-called "threads" which are simply slow and bloated |
1197 | horribly slow so-called "threads" which are simply slow and bloated |
1190 | process simulations - use fork, its I<much> faster, cheaper, better). |
1198 | process simulations - use fork, it's I<much> faster, cheaper, better). |
1191 | |
1199 | |
1192 | (It might actually work, but you have been warned). |
1200 | (It might actually work, but you have been warned). |
1193 | |
1201 | |
1194 | |
1202 | |
1195 | =head1 BUGS |
1203 | =head1 BUGS |
1196 | |
1204 | |
1197 | While the goal of this module is to be correct, that unfortunately does |
1205 | While the goal of this module is to be correct, that unfortunately does |
1198 | not mean its bug-free, only that I think its design is bug-free. It is |
1206 | not mean it's bug-free, only that I think its design is bug-free. It is |
1199 | still relatively early in its development. If you keep reporting bugs they |
1207 | still relatively early in its development. If you keep reporting bugs they |
1200 | will be fixed swiftly, though. |
1208 | will be fixed swiftly, though. |
1201 | |
1209 | |
1202 | Please refrain from using rt.cpan.org or any other bug reporting |
1210 | Please refrain from using rt.cpan.org or any other bug reporting |
1203 | service. I put the contact address into my modules for a reason. |
1211 | service. I put the contact address into my modules for a reason. |