
Comparing JSON-XS/XS.pm (file contents):
Revision 1.110 by root, Sun Jul 20 17:55:19 2008 UTC vs.
Revision 1.115 by root, Tue Feb 17 23:29:38 2009 UTC

49to write yet another JSON module? While it seems there are many JSON 49to write yet another JSON module? While it seems there are many JSON
50modules, none of them correctly handle all corner cases, and in most cases 50modules, none of them correctly handle all corner cases, and in most cases
51their maintainers are unresponsive, gone missing, or not listening to bug 51their maintainers are unresponsive, gone missing, or not listening to bug
52reports for other reasons. 52reports for other reasons.
53 53
54See COMPARISON, below, for a comparison to some other JSON modules.
55
56See MAPPING, below, on how JSON::XS maps perl values to JSON values and 54See MAPPING, below, on how JSON::XS maps perl values to JSON values and
57vice versa. 55vice versa.
58 56
59=head2 FEATURES 57=head2 FEATURES
60 58
104package JSON::XS; 102package JSON::XS;
105 103
106no warnings; 104no warnings;
107use strict; 105use strict;
108 106
109our $VERSION = '2.2222'; 107our $VERSION = '2.232';
110our @ISA = qw(Exporter); 108our @ISA = qw(Exporter);
111 109
112our @EXPORT = qw(encode_json decode_json to_json from_json); 110our @EXPORT = qw(encode_json decode_json to_json from_json);
113 111
114sub to_json($) { 112sub to_json($) {
768JSON object or b) parsing multiple JSON objects separated by non-JSON text 766JSON object or b) parsing multiple JSON objects separated by non-JSON text
769(such as commas). 767(such as commas).
770 768
771=item $json->incr_skip 769=item $json->incr_skip
772 770
773This will reset the state of the incremental parser and will remove the 771This will reset the state of the incremental parser and will remove
774parsed text from the input buffer. This is useful after C<incr_parse> 772the parsed text from the input buffer so far. This is useful after
775died, in which case the input buffer and incremental parser state is left 773C<incr_parse> died, in which case the input buffer and incremental parser
776unchanged, to skip the text parsed so far and to reset the parse state. 774state is left unchanged, to skip the text parsed so far and to reset the
775parse state.
776
777The difference to C<incr_reset> is that only text until the parse error
778occurred is removed.
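
A minimal sketch of the recovery case described for C<incr_skip>, assuming a buffer that holds one malformed object followed by a well-formed array. The input text and the variable names are made up for illustration, and the error message is not shown because its wording may differ between versions.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# Hypothetical input: a malformed object followed by a well-formed array.
my $input = '{"a":1 2} [3]';

# incr_parse croaks on the malformed object; the input buffer and the
# parser state are left as they were when the error was detected.
my $first = eval { $json->incr_parse ($input) };
my $error = $@;

# Drop the text parsed so far and reset the parse state.
$json->incr_skip if $error;

# Continue with whatever is left in the input buffer.
my $second = $json->incr_parse;
```

Calling C<incr_reset> at the same point would instead discard the whole buffer, including the array that has not been parsed yet.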
777 779
778=item $json->incr_reset 780=item $json->incr_reset
779 781
780This completely resets the incremental parser, that is, after this call, 782This completely resets the incremental parser, that is, after this call,
781it will be as if the parser had never parsed anything. 783it will be as if the parser had never parsed anything.
782 784
783This is useful if you want ot repeatedly parse JSON objects and want to 785This is useful if you want to repeatedly parse JSON objects and want to
784ignore any trailing data, which means you have to reset the parser after 786ignore any trailing data, which means you have to reset the parser after
785each successful decode. 787each successful decode.
786 788
787=back 789=back
788 790
1181when your channel for JSON transfer is not 8-bit clean or the encoding 1183when your channel for JSON transfer is not 8-bit clean or the encoding
1182might be mangled in between (e.g. in mail), and works because ASCII is a 1184might be mangled in between (e.g. in mail), and works because ASCII is a
1183proper subset of most 8-bit and multibyte encodings in use in the world. 1185proper subset of most 8-bit and multibyte encodings in use in the world.
1184 1186
1185=back 1187=back
1188
1189
1190=head2 JSON and ECMAscript
1191
1192JSON syntax is based on how literals are represented in javascript (the
1193not-standardised predecessor of ECMAscript) which is presumably why it is
1194called "JavaScript Object Notation".
1195
1196However, JSON is not a subset (and also not a superset of course) of
1197ECMAscript (the standard) or javascript (whatever browsers actually
1198implement).
1199
1200If you want to use javascript's C<eval> function to "parse" JSON, you
1201might run into parse errors for valid JSON texts, or the resulting data
1202structure might not be queryable:
1203
1204One of the problems is that U+2028 and U+2029 are valid characters inside
1205JSON strings, but are not allowed in ECMAscript string literals, so the
1206following Perl fragment will not output something that can be guaranteed
1207to be parsable by javascript's C<eval>:
1208
1209 use JSON::XS;
1210
1211 print encode_json [chr 0x2028];
1212
1213The right fix for this is to use a proper JSON parser in your javascript
1214programs, and not rely on C<eval>.
1215
1216If this is not an option, you can, as a stop-gap measure, simply encode to
1217ASCII-only JSON:
1218
1219 use JSON::XS;
1220
1221 print JSON::XS->new->ascii->encode ([chr 0x2028]);
1222
1223And if you are concerned about the size of the resulting JSON text, you
1224can run some regexes to only escape U+2028 and U+2029:
1225
1226 use JSON::XS;
1227
1228 my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
1229 $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
1230 $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
1231 print $json;
1232
1233This works because U+2028/U+2029 are not allowed outside of strings and
1234are not used for syntax, so replacing them unconditionally just works.
1235
1236Note, however, that fixing the broken JSON parser is better than working
1237around it in every other generator. The above regexes should work well in
1238other languages, as long as they operate on UTF-8. It is equally valid to
1239replace all occurrences of U+2028/2029 directly by their \\u-escaped forms
1240in unicode texts, so they can simply be used to fix any parsers relying on
1241C<eval> by first applying the regexes on the encoded texts.
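
For the unicode-text case mentioned above, a character-level variant of the same replacement might look like the following sketch. It assumes the encoder is created without C<utf8>, so that C<encode> returns a Perl character string rather than UTF-8 octets; the sample data is made up.

```perl
use strict;
use warnings;
use JSON::XS;

# No ->utf8 here, so encode returns a Perl character (unicode) string.
my $json = JSON::XS->new->encode ([chr (0x2028) . chr (0x2029)]);

# Match the characters themselves instead of their UTF-8 byte sequences.
$json =~ s/\x{2028}/\\u2028/g; # escape U+2028
$json =~ s/\x{2029}/\\u2029/g; # escape U+2029
```

The result still has to be encoded to UTF-8 (or another encoding) before it is written to a byte-oriented channel.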
1242
1243Another problem is that some javascript implementations reserve
1244some property names for their own purposes (which probably makes
1245them non-ECMAscript-compliant). For example, Iceweasel reserves the
1246C<__proto__> property name for its own purposes.
1247
1248If that is a problem, you could try to filter the resulting JSON
1249output for these property strings, e.g.:
1250
1251 $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
1252
1253This works because C<__proto__> is not valid outside of strings, so every
1254occurrence of C<"__proto__"\s*:> must be a string used as property name.
1255
1256If you know of other incompatibilities, please let me know.
1186 1257
1187 1258
1188=head2 JSON and YAML 1259=head2 JSON and YAML
1189 1260
1190You often hear that JSON is a subset of YAML. This is, however, a mass 1261You often hear that JSON is a subset of YAML. This is, however, a mass
