/cvs/JSON-XS/README

Comparing JSON-XS/README (file contents):
Revision 1.26 by root, Tue Jun 3 06:43:45 2008 UTC vs.
Revision 1.34 by root, Thu Mar 11 17:36:09 2010 UTC

20 $perl_scalar = $coder->decode ($unicode_json_text); 20 $perl_scalar = $coder->decode ($unicode_json_text);
21 21
22 # Note that JSON version 2.0 and above will automatically use JSON::XS 22 # Note that JSON version 2.0 and above will automatically use JSON::XS
23 # if available, at virtually no speed overhead either, so you should 23 # if available, at virtually no speed overhead either, so you should
24 # be able to just: 24 # be able to just:
25 25
26 use JSON; 26 use JSON;
27 27
28 # and do the same things, except that you have a pure-perl fallback now. 28 # and do the same things, except that you have a pure-perl fallback now.
29 29
30DESCRIPTION 30DESCRIPTION
31 This module converts Perl data structures to JSON and vice versa. Its 31 This module converts Perl data structures to JSON and vice versa. Its
43 As this is the n-th-something JSON module on CPAN, what was the reason 43 As this is the n-th-something JSON module on CPAN, what was the reason
44 to write yet another JSON module? While it seems there are many JSON 44 to write yet another JSON module? While it seems there are many JSON
45 modules, none of them correctly handle all corner cases, and in most 45 modules, none of them correctly handle all corner cases, and in most
46 cases their maintainers are unresponsive, gone missing, or not listening 46 cases their maintainers are unresponsive, gone missing, or not listening
47 to bug reports for other reasons. 47 to bug reports for other reasons.
48
49 See COMPARISON, below, for a comparison to some other JSON modules.
50 48
51 See MAPPING, below, on how JSON::XS maps perl values to JSON values and 49 See MAPPING, below, on how JSON::XS maps perl values to JSON values and
52 vice versa. 50 vice versa.
53 51
54 FEATURES 52 FEATURES
379 it is disabled, the same hash might be encoded differently even if 377 it is disabled, the same hash might be encoded differently even if
380 it contains the same data, as key-value pairs have no inherent ordering 378 it contains the same data, as key-value pairs have no inherent ordering
381 in Perl. 379 in Perl.
382 380
383 This setting has no effect when decoding JSON texts. 381 This setting has no effect when decoding JSON texts.
382
383 This setting has currently no effect on tied hashes.
384 384
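As a small illustration of the canonical setting described above, the following is a minimal sketch (the sample hash and variable names are made up for this example and are not part of the README):

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical sample data; any plain (non-tied) hash will do.
my %data = (b => 1, a => 2, c => 3);

# With canonical enabled, hash keys are emitted in sorted order, so the
# same data always serialises to the same JSON text.
my $coder = JSON::XS->new->canonical;
my $text  = $coder->encode (\%data);

print "$text\n";
```

Without canonical, the key order of the output may differ between runs even though the data is identical.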
385 $json = $json->allow_nonref ([$enable]) 385 $json = $json->allow_nonref ([$enable])
386 $enabled = $json->get_allow_nonref 386 $enabled = $json->get_allow_nonref
387 If $enable is true (or missing), then the "encode" method can 387 If $enable is true (or missing), then the "encode" method can
388 convert a non-reference into its corresponding string, number or 388 convert a non-reference into its corresponding string, number or
629 While this module always has to keep both JSON text and resulting Perl 629 While this module always has to keep both JSON text and resulting Perl
630 data structure in memory at one time, it does allow you to parse a JSON 630 data structure in memory at one time, it does allow you to parse a JSON
631 stream incrementally. It does so by accumulating text until it has a 631 stream incrementally. It does so by accumulating text until it has a
632 full JSON object, which it then can decode. This process is similar to 632 full JSON object, which it then can decode. This process is similar to
633 using "decode_prefix" to see if a full JSON object is available, but is 633 using "decode_prefix" to see if a full JSON object is available, but is
634 much more efficient (JSON::XS will only attempt to parse the JSON text 634 much more efficient (and can be implemented with a minimum of method
635 once it is sure it has enough text to get a decisive result, using a 635 calls).
636 very simple but truly incremental parser).
637 636
638 The following two methods deal with this. 637 JSON::XS will only attempt to parse the JSON text once it is sure it has
638 enough text to get a decisive result, using a very simple but truly
639 incremental parser. This means that it sometimes won't stop as early as
640 the full parser, for example, it doesn't detect parenthesis mismatches.
641 The only thing it guarantees is that it starts decoding as soon as a
642 syntactically valid JSON text has been seen. This means you need to set
643 resource limits (e.g. "max_size") to ensure the parser will stop parsing
644 in the presence of syntax errors.
645
646 The following methods implement this incremental parser.
639 647
640 [void, scalar or list context] = $json->incr_parse ([$string]) 648 [void, scalar or list context] = $json->incr_parse ([$string])
641 This is the central parsing function. It can both append new text 649 This is the central parsing function. It can both append new text
642 and extract objects from the stream accumulated so far (both of 650 and extract objects from the stream accumulated so far (both of
643 these functions are optional). 651 these functions are optional).
678 after a JSON object or b) parsing multiple JSON objects separated by 686 after a JSON object or b) parsing multiple JSON objects separated by
679 non-JSON text (such as commas). 687 non-JSON text (such as commas).
680 688
681 $json->incr_skip 689 $json->incr_skip
682 This will reset the state of the incremental parser and will remove 690 This will reset the state of the incremental parser and will remove
683 the parsed text from the input buffer. This is useful after 691 the parsed text from the input buffer so far. This is useful after
684 "incr_parse" died, in which case the input buffer and incremental 692 "incr_parse" died, in which case the input buffer and incremental
685 parser state is left unchanged, to skip the text parsed so far and 693 parser state is left unchanged, to skip the text parsed so far and
686 to reset the parse state. 694 to reset the parse state.
687 695
696 The difference to "incr_reset" is that only text until the parse
697 error occurred is removed.
698
688 $json->incr_reset 699 $json->incr_reset
689 This completely resets the incremental parser, that is, after this 700 This completely resets the incremental parser, that is, after this
690 call, it will be as if the parser had never parsed anything. 701 call, it will be as if the parser had never parsed anything.
691 702
692 This is useful if you want ot repeatedly parse JSON objects and want 703 This is useful if you want to repeatedly parse JSON objects and want
693 to ignore any trailing data, which means you have to reset the 704 to ignore any trailing data, which means you have to reset the
694 parser after each successful decode. 705 parser after each successful decode.
695 706
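The incremental methods described above could be combined roughly as follows. This is only a sketch: the chunked input and the max_size value are arbitrary assumptions chosen for illustration, not recommendations from the README.

```perl
use strict;
use warnings;
use JSON::XS;

# max_size is an arbitrary illustrative resource limit.
my $json = JSON::XS->new->max_size (1024 * 1024);

# Hypothetical input, split at arbitrary points as a network read might do.
my @chunks = ('[1,2', ',3] {"a"', ':1}');

my @objects;
for my $chunk (@chunks) {
    $json->incr_parse ($chunk);    # void context: only appends the text

    # scalar context: extracts at most one complete object per call
    # and returns undef while the accumulated text is still incomplete
    while (defined (my $obj = $json->incr_parse)) {
        push @objects, $obj;
    }
}

printf "decoded %d objects\n", scalar @objects;
```

If "incr_parse" dies on a syntax error, wrapping the call in "eval" and then calling "incr_skip" (to drop the text parsed so far) or "incr_reset" (to drop everything) would be one way to recover.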
696 LIMITATIONS 707 LIMITATIONS
697 All options that affect decoding are supported, except "allow_nonref". 708 All options that affect decoding are supported, except "allow_nonref".
1063 structure back. This is useful when your channel for JSON transfer 1074 structure back. This is useful when your channel for JSON transfer
1064 is not 8-bit clean or the encoding might be mangled in between (e.g. 1075 is not 8-bit clean or the encoding might be mangled in between (e.g.
1065 in mail), and works because ASCII is a proper subset of most 8-bit 1076 in mail), and works because ASCII is a proper subset of most 8-bit
1066 and multibyte encodings in use in the world. 1077 and multibyte encodings in use in the world.
1067 1078
1079 JSON and ECMAscript
1080 JSON syntax is based on how literals are represented in javascript (the
1081 not-standardised predecessor of ECMAscript) which is presumably why it
1082 is called "JavaScript Object Notation".
1083
1084 However, JSON is not a subset (and also not a superset of course) of
1085 ECMAscript (the standard) or javascript (whatever browsers actually
1086 implement).
1087
1088 If you want to use javascript's "eval" function to "parse" JSON, you
1089 might run into parse errors for valid JSON texts, or the resulting data
1090 structure might not be queryable:
1091
1092 One of the problems is that U+2028 and U+2029 are valid characters
1093 inside JSON strings, but are not allowed in ECMAscript string literals,
1094 so the following Perl fragment will not output something that can be
1095 guaranteed to be parsable by javascript's "eval":
1096
1097 use JSON::XS;
1098
1099 print encode_json [chr 0x2028];
1100
1101 The right fix for this is to use a proper JSON parser in your javascript
1102 programs, and not rely on "eval" (see for example Douglas Crockford's
1103 json2.js parser).
1104
1105 If this is not an option, you can, as a stop-gap measure, simply encode
1106 to ASCII-only JSON:
1107
1108 use JSON::XS;
1109
1110 print JSON::XS->new->ascii->encode ([chr 0x2028]);
1111
1112 Note that this will enlarge the resulting JSON text quite a bit if you
1113 have many non-ASCII characters. You might be tempted to run some regexes
1114 to only escape U+2028 and U+2029, e.g.:
1115
1116 # DO NOT USE THIS!
1117 my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
1118 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1119 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1120 print $json;
1121
1122 Note that *this is a bad idea*: the above only works for U+2028 and
1123 U+2029 and thus only for fully ECMAscript-compliant parsers. Many
1124 existing javascript implementations, however, have issues with other
1125 characters as well - using "eval" naively simply *will* cause problems.
1126
1127 Another problem is that some javascript implementations reserve some
1128 property names for their own purposes (which probably makes them
1129 non-ECMAscript-compliant). For example, Iceweasel reserves the
1130 "__proto__" property name for its own purposes.
1131
1132 If that is a problem, you could try to filter the resulting JSON
1133 output for these property strings, e.g.:
1134
1135 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
1136
1137 This works because "__proto__" is not valid outside of strings, so every
1138 occurrence of ""__proto__"\s*:" must be a string used as property name.
1139
1140 If you know of other incompatibilities, please let me know.
1141
1068 JSON and YAML 1142 JSON and YAML
1069 You often hear that JSON is a subset of YAML. This is, however, a mass 1143 You often hear that JSON is a subset of YAML. This is, however, a mass
1070 hysteria(*) and very far from the truth (as of the time of this 1144 hysteria(*) and very far from the truth (as of the time of this
1071 writing), so let me state it clearly: *in general, there is no way to 1145 writing), so let me state it clearly: *in general, there is no way to
1072 configure JSON::XS to output a data structure as valid YAML* that works 1146 configure JSON::XS to output a data structure as valid YAML* that works
1079 my $yaml = $to_yaml->encode ($ref) . "\n"; 1153 my $yaml = $to_yaml->encode ($ref) . "\n";
1080 1154
1081 This will *usually* generate JSON texts that also parse as valid YAML. 1155 This will *usually* generate JSON texts that also parse as valid YAML.
1082 Please note that YAML has hardcoded limits on (simple) object key 1156 Please note that YAML has hardcoded limits on (simple) object key
1083 lengths that JSON doesn't have and also has different and incompatible 1157 lengths that JSON doesn't have and also has different and incompatible
1084 unicode handling, so you should make sure that your hash keys are 1158 unicode character escape syntax, so you should make sure that your hash
1085 noticeably shorter than the 1024 "stream characters" YAML allows and 1159 keys are noticeably shorter than the 1024 "stream characters" YAML
1086 that you do not have characters with codepoint values outside the 1160 allows and that you do not have characters with codepoint values outside
1087 Unicode BMP (basic multilingual page). YAML also does not allow "\/" 1161 the Unicode BMP (basic multilingual page). YAML also does not allow "\/"
1088 sequences in strings (which JSON::XS does not *currently* generate, but 1162 sequences in strings (which JSON::XS does not *currently* generate, but
1089 other JSON generators might). 1163 other JSON generators might).
1090 1164
1091 There might be other incompatibilities that I am not aware of (or the 1165 There might be other incompatibilities that I am not aware of (or the
1092 YAML specification has been changed yet again - it does so quite often). 1166 YAML specification has been changed yet again - it does so quite often).
1109 (which is not that difficult or long) and finally make YAML 1183 (which is not that difficult or long) and finally make YAML
1110 compatible to it, and educating users about the changes, instead of 1184 compatible to it, and educating users about the changes, instead of
1111 spreading lies about the real compatibility for many *years* and 1185 spreading lies about the real compatibility for many *years* and
1112 trying to silence people who point out that it isn't true. 1186 trying to silence people who point out that it isn't true.
1113 1187
1188 Addendum/2009: the YAML 1.2 spec is still incompatible with JSON,
1189 even though the incompatibilities have been documented (and are
1190 known to Brian) for many years and the spec makes explicit claims
1191 that YAML is a superset of JSON. It would be so easy to fix, but
1192 apparently, bullying and corrupting userdata is so much easier.
1193
1114 SPEED 1194 SPEED
1115 It seems that JSON::XS is surprisingly fast, as shown in the following 1195 It seems that JSON::XS is surprisingly fast, as shown in the following
1116 tables. They have been generated with the help of the "eg/bench" program 1196 tables. They have been generated with the help of the "eg/bench" program
1117 in the JSON::XS distribution, to make it easy to compare on your own 1197 in the JSON::XS distribution, to make it easy to compare on your own
1118 system. 1198 system.
1121 single-line JSON string (also available at 1201 single-line JSON string (also available at
1122 <http://dist.schmorp.de/misc/json/short.json>). 1202 <http://dist.schmorp.de/misc/json/short.json>).
1123 1203
1124 {"method": "handleMessage", "params": ["user1", 1204 {"method": "handleMessage", "params": ["user1",
1125 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7, 1205 "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1126 true, false]} 1206 1, 0]}
1127 1207
1128 It shows the number of encodes/decodes per second (JSON::XS uses the 1208 It shows the number of encodes/decodes per second (JSON::XS uses the
1129 functional interface, while JSON::XS/2 uses the OO interface with 1209 functional interface, while JSON::XS/2 uses the OO interface with
1130 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink). 1210 pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink.
1131 Higher is better: 1211 JSON::DWIW/DS uses the deserialise function, while JSON::DWIW/FJ uses
1212 the from_json method). Higher is better:
1132 1213
1133 module | encode | decode | 1214 module | encode | decode |
1134 -----------|------------|------------| 1215 --------------|------------|------------|
1135 JSON 1.x | 4990.842 | 4088.813 | 1216 JSON::DWIW/DS | 86302.551 | 102300.098 |
1136 JSON::DWIW | 51653.990 | 71575.154 | 1217 JSON::DWIW/FJ | 86302.551 | 75983.768 |
1137 JSON::PC | 65948.176 | 74631.744 | 1218 JSON::PP | 15827.562 | 6638.658 |
1138 JSON::PP | 8931.652 | 3817.168 | 1219 JSON::Syck | 63358.066 | 47662.545 |
1139 JSON::Syck | 24877.248 | 27776.848 | 1220 JSON::XS | 511500.488 | 511500.488 |
1140 JSON::XS | 388361.481 | 227951.304 | 1221 JSON::XS/2 | 291271.111 | 388361.481 |
1141 JSON::XS/2 | 227951.304 | 218453.333 | 1222 JSON::XS/3 | 361577.931 | 361577.931 |
1142 JSON::XS/3 | 338250.323 | 218453.333 | 1223 Storable | 66788.280 | 265462.278 |
1143 Storable | 16500.016 | 135300.129 |
1144 -----------+------------+------------+ 1224 --------------+------------+------------+
1145 1225
1146 That is, JSON::XS is about five times faster than JSON::DWIW on 1226 That is, JSON::XS is almost six times faster than JSON::DWIW on
1147 encoding, about three times faster on decoding, and over forty times 1227 encoding, about five times faster on decoding, and over thirty to
1148 faster than JSON, even with pretty-printing and key sorting. It also 1228 seventy times faster than JSON's pure perl implementation. It also
1149 compares favourably to Storable for small amounts of data. 1229 compares favourably to Storable for small amounts of data.
1150 1230
1151 Using a longer test string (roughly 18KB, generated from Yahoo! Locals 1231 Using a longer test string (roughly 18KB, generated from Yahoo! Locals
1152 search API (<http://dist.schmorp.de/misc/json/long.json>). 1232 search API (<http://dist.schmorp.de/misc/json/long.json>).
1153 1233
1154 module | encode | decode | 1234 module | encode | decode |
1155 -----------|------------|------------| 1235 --------------|------------|------------|
1156 JSON 1.x | 55.260 | 34.971 | 1236 JSON::DWIW/DS | 1647.927 | 2673.916 |
1157 JSON::DWIW | 825.228 | 1082.513 | 1237 JSON::DWIW/FJ | 1630.249 | 2596.128 |
1158 JSON::PC | 3571.444 | 2394.829 |
1159 JSON::PP | 210.987 | 32.574 | 1238 JSON::PP | 400.640 | 62.311 |
1160 JSON::Syck | 552.551 | 787.544 | 1239 JSON::Syck | 1481.040 | 1524.869 |
1161 JSON::XS | 5780.463 | 4854.519 | 1240 JSON::XS | 20661.596 | 9541.183 |
1162 JSON::XS/2 | 3869.998 | 4798.975 | 1241 JSON::XS/2 | 10683.403 | 9416.938 |
1163 JSON::XS/3 | 5862.880 | 4798.975 | 1242 JSON::XS/3 | 20661.596 | 9400.054 |
1164 Storable | 4445.002 | 5235.027 | 1243 Storable | 19765.806 | 10000.725 |
1165 -----------+------------+------------+ 1244 --------------+------------+------------+
1166 1245
1167 Again, JSON::XS leads by far (except for Storable which non-surprisingly 1246 Again, JSON::XS leads by far (except for Storable which non-surprisingly
1168 decodes faster). 1247 decodes a bit faster).
1169 1248
1170 On large strings containing lots of high Unicode characters, some 1249 On large strings containing lots of high Unicode characters, some
1171 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the 1250 modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
1172 result will be broken due to missing (or wrong) Unicode handling. Others 1251 result will be broken due to missing (or wrong) Unicode handling. Others
1173 refuse to decode or encode properly, so it was impossible to prepare a 1252 refuse to decode or encode properly, so it was impossible to prepare a
1208 information you might want to make sure that exceptions thrown by 1287 information you might want to make sure that exceptions thrown by
1209 JSON::XS will not end up in front of untrusted eyes. 1288 JSON::XS will not end up in front of untrusted eyes.
1210 1289
1211 If you are using JSON::XS to return packets for consumption by JavaScript 1290 If you are using JSON::XS to return packets for consumption by JavaScript
1212 scripts in a browser you should have a look at 1291 scripts in a browser you should have a look at
1213 <http://jpsykes.com/47/practical-csrf-and-json-security> to see whether 1292 <http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/>
1214 you are vulnerable to some common attack vectors (which really are 1293 to see whether you are vulnerable to some common attack vectors (which
1215 browser design bugs, but it is still you who will have to deal with it, 1294 really are browser design bugs, but it is still you who will have to
1216 as major browser developers care only for features, not about getting 1295 deal with it, as major browser developers care only for features, not
1217 security right). 1296 about getting security right).
1218 1297
1219THREADS 1298THREADS
1220 This module is *not* guaranteed to be thread safe and there are no plans 1299 This module is *not* guaranteed to be thread safe and there are no plans
1221 to change this until Perl gets thread support (as opposed to the 1300 to change this until Perl gets thread support (as opposed to the
1222 horribly slow so-called "threads" which are simply slow and bloated 1301 horribly slow so-called "threads" which are simply slow and bloated
