… | |
… | |
29 | |
29 | |
30 | DESCRIPTION |
30 | DESCRIPTION |
31 | This module converts Perl data structures to JSON and vice versa. Its |
31 | This module converts Perl data structures to JSON and vice versa. Its |
32 | primary goal is to be *correct* and its secondary goal is to be *fast*. |
32 | primary goal is to be *correct* and its secondary goal is to be *fast*. |
33 | To reach the latter goal it was written in C. |
33 | To reach the latter goal it was written in C. |
34 | |
|
|
35 | Beginning with version 2.0 of the JSON module, when both JSON and |
|
|
36 | JSON::XS are installed, then JSON will fall back on JSON::XS (this can |
|
|
37 | be overridden) with no overhead due to emulation (by inheriting |
|
|
38 | constructor and methods). If JSON::XS is not available, it will fall |
|
|
39 | back to the compatible JSON::PP module as backend, so using JSON instead |
|
|
40 | of JSON::XS gives you a portable JSON API that can be fast when you need |
|
|
41 | and doesn't require a C compiler when that is a problem. |
|
|
42 | |
|
|
43 | As this is the n-th-something JSON module on CPAN, what was the reason |
|
|
44 | to write yet another JSON module? While it seems there are many JSON |
|
|
45 | modules, none of them correctly handle all corner cases, and in most |
|
|
46 | cases their maintainers are unresponsive, gone missing, or not listening |
|
|
47 | to bug reports for other reasons. |
|
|
48 | |
34 | |
49 | See MAPPING, below, on how JSON::XS maps perl values to JSON values and |
35 | See MAPPING, below, on how JSON::XS maps perl values to JSON values and |
50 | vice versa. |
36 | vice versa. |
51 | |
37 | |
52 | FEATURES |
38 | FEATURES |
… | |
… | |
103 | $json_text = JSON::XS->new->utf8->encode ($perl_scalar) |
89 | $json_text = JSON::XS->new->utf8->encode ($perl_scalar) |
104 | |
90 | |
105 | Except being faster. |
91 | Except being faster. |
106 | |
92 | |
107 | $perl_scalar = decode_json $json_text |
93 | $perl_scalar = decode_json $json_text |
108 | The opposite of "encode_json": expects an UTF-8 (binary) string and |
94 | The opposite of "encode_json": expects a UTF-8 (binary) string and |
109 | tries to parse that as an UTF-8 encoded JSON text, returning the |
95 | tries to parse that as a UTF-8 encoded JSON text, returning the |
110 | resulting reference. Croaks on error. |
96 | resulting reference. Croaks on error. |
111 | |
97 | |
112 | This function call is functionally identical to: |
98 | This function call is functionally identical to: |
113 | |
99 | |
114 | $perl_scalar = JSON::XS->new->utf8->decode ($json_text) |
100 | $perl_scalar = JSON::XS->new->utf8->decode ($json_text) |
… | |
… | |
158 | The object oriented interface lets you configure your own encoding or |
144 | The object oriented interface lets you configure your own encoding or |
159 | decoding style, within the limits of supported formats. |
145 | decoding style, within the limits of supported formats. |
160 | |
146 | |
161 | $json = new JSON::XS |
147 | $json = new JSON::XS |
162 | Creates a new JSON::XS object that can be used to de/encode JSON |
148 | Creates a new JSON::XS object that can be used to de/encode JSON |
163 | strings. All boolean flags described below are by default |
149 | strings. All boolean flags described below are by default *disabled* |
164 | *disabled*. |
150 | (with the exception of "allow_nonref", which defaults to *enabled* |
|
|
151 | since version 4.0). |
165 | |
152 | |
166 | The mutators for flags all return the JSON object again and thus |
153 | The mutators for flags all return the JSON object again and thus |
167 | calls can be chained: |
154 | calls can be chained: |
168 | |
155 | |
169 | my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]}) |
156 | my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]}) |
… | |
… | |
225 | |
212 | |
226 | $json = $json->utf8 ([$enable]) |
213 | $json = $json->utf8 ([$enable]) |
227 | $enabled = $json->get_utf8 |
214 | $enabled = $json->get_utf8 |
228 | If $enable is true (or missing), then the "encode" method will |
215 | If $enable is true (or missing), then the "encode" method will |
229 | encode the JSON result into UTF-8, as required by many protocols, |
216 | encode the JSON result into UTF-8, as required by many protocols, |
230 | while the "decode" method expects to be handled an UTF-8-encoded |
217 | while the "decode" method expects to be handed a UTF-8-encoded |
231 | string. Please note that UTF-8-encoded strings do not contain any |
218 | string. Please note that UTF-8-encoded strings do not contain any |
232 | characters outside the range 0..255, they are thus useful for |
219 | characters outside the range 0..255, they are thus useful for |
233 | bytewise/binary I/O. In future versions, enabling this option might |
220 | bytewise/binary I/O. In future versions, enabling this option might |
234 | enable autodetection of the UTF-16 and UTF-32 encoding families, as |
221 | enable autodetection of the UTF-16 and UTF-32 encoding families, as |
235 | described in RFC4627. |
222 | described in RFC4627. |
… | |
… | |
314 | |
301 | |
315 | $json = $json->relaxed ([$enable]) |
302 | $json = $json->relaxed ([$enable]) |
316 | $enabled = $json->get_relaxed |
303 | $enabled = $json->get_relaxed |
317 | If $enable is true (or missing), then "decode" will accept some |
304 | If $enable is true (or missing), then "decode" will accept some |
318 | extensions to normal JSON syntax (see below). "encode" will not be |
305 | extensions to normal JSON syntax (see below). "encode" will not be |
319 | affected in anyway. *Be aware that this option makes you accept |
306 | affected in any way. *Be aware that this option makes you accept |
320 | invalid JSON texts as if they were valid!*. I suggest only to use |
307 | invalid JSON texts as if they were valid!*. I suggest only to use |
321 | this option to parse application-specific files written by humans |
308 | this option to parse application-specific files written by humans |
322 | (configuration files, resource files etc.) |
309 | (configuration files, resource files etc.) |
323 | |
310 | |
324 | If $enable is false (the default), then "decode" will only accept |
311 | If $enable is false (the default), then "decode" will only accept |
… | |
… | |
352 | [ |
339 | [ |
353 | 1, # this comment not allowed in JSON |
340 | 1, # this comment not allowed in JSON |
354 | # neither this one... |
341 | # neither this one... |
355 | ] |
342 | ] |
356 | |
343 | |
|
|
344 | * literal ASCII TAB characters in strings |
|
|
345 | |
|
|
346 | Literal ASCII TAB characters are now allowed in strings (and |
|
|
347 | treated as "\t"). |
|
|
348 | |
|
|
349 | [ |
|
|
350 | "Hello\tWorld", |
|
|
351 | "Hello<TAB>World", # literal <TAB> would not normally be allowed |
|
|
352 | ] |
|
|
353 | |
357 | $json = $json->canonical ([$enable]) |
354 | $json = $json->canonical ([$enable]) |
358 | $enabled = $json->get_canonical |
355 | $enabled = $json->get_canonical |
359 | If $enable is true (or missing), then the "encode" method will |
356 | If $enable is true (or missing), then the "encode" method will |
360 | output JSON objects by sorting their keys. This is adding a |
357 | output JSON objects by sorting their keys. This is adding a |
361 | comparatively high overhead. |
358 | comparatively high overhead. |
… | |
… | |
375 | |
372 | |
376 | This setting has currently no effect on tied hashes. |
373 | This setting has currently no effect on tied hashes. |
377 | |
374 | |
378 | $json = $json->allow_nonref ([$enable]) |
375 | $json = $json->allow_nonref ([$enable]) |
379 | $enabled = $json->get_allow_nonref |
376 | $enabled = $json->get_allow_nonref |
|
|
377 | Unlike other boolean options, this option is enabled by default |
|
|
378 | beginning with version 4.0. See "SECURITY CONSIDERATIONS" for the |
|
|
379 | gory details. |
|
|
380 | |
380 | If $enable is true (or missing), then the "encode" method can |
381 | If $enable is true (or missing), then the "encode" method can |
381 | convert a non-reference into its corresponding string, number or |
382 | convert a non-reference into its corresponding string, number or |
382 | null JSON value, which is an extension to RFC4627. Likewise, |
383 | null JSON value, which is an extension to RFC4627. Likewise, |
383 | "decode" will accept those JSON values instead of croaking. |
384 | "decode" will accept those JSON values instead of croaking. |
384 | |
385 | |
385 | If $enable is false, then the "encode" method will croak if it isn't |
386 | If $enable is false, then the "encode" method will croak if it isn't |
386 | passed an arrayref or hashref, as JSON texts must either be an |
387 | passed an arrayref or hashref, as JSON texts must either be an |
387 | object or array. Likewise, "decode" will croak if given something |
388 | object or array. Likewise, "decode" will croak if given something |
388 | that is not a JSON object or array. |
389 | that is not a JSON object or array. |
389 | |
390 | |
390 | Example, encode a Perl scalar as JSON value with enabled |
391 | Example, encode a Perl scalar as JSON value with "allow_nonref" |
391 | "allow_nonref", resulting in an invalid JSON text: |
392 | disabled, resulting in an error: |
392 | |
393 | |
393 | JSON::XS->new->allow_nonref->encode ("Hello, World!") |
394 | JSON::XS->new->allow_nonref (0)->encode ("Hello, World!") |
394 | => "Hello, World!" |
395 | => hash- or arrayref expected... |
395 | |
396 | |
396 | $json = $json->allow_unknown ([$enable]) |
397 | $json = $json->allow_unknown ([$enable]) |
397 | $enabled = $json->get_allow_unknown |
398 | $enabled = $json->get_allow_unknown |
398 | If $enable is true (or missing), then "encode" will *not* throw an |
399 | If $enable is true (or missing), then "encode" will *not* throw an |
399 | exception when it encounters values it cannot represent in JSON (for |
400 | exception when it encounters values it cannot represent in JSON (for |
… | |
… | |
445 | this type of conversion. |
446 | this type of conversion. |
446 | |
447 | |
447 | This setting has no effect on "decode". |
448 | This setting has no effect on "decode". |
448 | |
449 | |
449 | $json = $json->allow_tags ([$enable]) |
450 | $json = $json->allow_tags ([$enable]) |
450 | $enabled = $json->allow_tags |
451 | $enabled = $json->get_allow_tags |
451 | See "OBJECT SERIALISATION" for details. |
452 | See "OBJECT SERIALISATION" for details. |
452 | |
453 | |
453 | If $enable is true (or missing), then "encode", upon encountering a |
454 | If $enable is true (or missing), then "encode", upon encountering a |
454 | blessed object, will check for the availability of the "FREEZE" |
455 | blessed object, will check for the availability of the "FREEZE" |
455 | method on the object's class. If found, it will be used to serialise |
456 | method on the object's class. If found, it will be used to serialise |
… | |
… | |
461 | |
462 | |
462 | If $enable is false (the default), then "encode" will not consider |
463 | If $enable is false (the default), then "encode" will not consider |
463 | this type of conversion, and tagged JSON values will cause a parse |
464 | this type of conversion, and tagged JSON values will cause a parse |
464 | error in "decode", as if tags were not part of the grammar. |
465 | error in "decode", as if tags were not part of the grammar. |
465 | |
466 | |
|
|
467 | $json->boolean_values ([$false, $true]) |
|
|
468 | ($false, $true) = $json->get_boolean_values |
|
|
469 | By default, JSON booleans will be decoded as overloaded |
|
|
470 | $Types::Serialiser::false and $Types::Serialiser::true objects. |
|
|
471 | |
|
|
472 | With this method you can specify your own boolean values for |
|
|
473 | decoding - on decode, JSON "false" will be decoded as a copy of |
|
|
474 | $false, and JSON "true" will be decoded as $true ("copy" here is the |
|
|
475 | same thing as assigning a value to another variable, i.e. "$copy = |
|
|
476 | $false"). |
|
|
477 | |
|
|
478 | Calling this method without any arguments will reset the booleans to |
|
|
479 | their default values. |
|
|
480 | |
|
|
481 | "get_boolean_values" will return both $false and $true values, or |
|
|
482 | the empty list when they are set to the default. |
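The "boolean_values" behaviour described above can be sketched as follows. This is only an illustration: it assumes JSON::XS 4.0 or later is installed, and the choice of plain 0 and 1 as replacement booleans, as well as the sample JSON text, are assumptions made for the example.

```perl
use strict;
use warnings;
use JSON::XS;

my $json = JSON::XS->new;

# Assumption for this sketch: plain 0 and 1 stand in for false and true.
$json->boolean_values (0, 1);

# JSON "true"/"false" are now decoded as copies of the values set above.
my $data = $json->decode ('{"active":true,"deleted":false}');

# Returns the two custom values while they are set.
my ($false, $true) = $json->get_boolean_values;

# Calling the method without arguments resets to the defaults, after which
# get_boolean_values returns the empty list, as described in the text.
$json->boolean_values;
my @defaults = $json->get_boolean_values;
```

Plain scalars are convenient when the decoded data is handed to code that does not know about Types::Serialiser objects, at the cost of no longer being able to tell a JSON boolean from a JSON number after decoding.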
|
|
483 | |
466 | $json = $json->filter_json_object ([$coderef->($hashref)]) |
484 | $json = $json->filter_json_object ([$coderef->($hashref)]) |
467 | When $coderef is specified, it will be called from "decode" each |
485 | When $coderef is specified, it will be called from "decode" each |
468 | time it decodes a JSON object. The only argument is a reference to |
486 | time it decodes a JSON object. The only argument is a reference to |
469 | the newly-created hash. If the code references returns a single |
487 | the newly-created hash. If the code reference returns a single |
470 | scalar (which need not be a reference), this value (i.e. a copy of |
488 | scalar (which need not be a reference), this value (or rather a copy |
471 | that scalar to avoid aliasing) is inserted into the deserialised |
489 | of it) is inserted into the deserialised data structure. If it |
472 | data structure. If it returns an empty list (NOTE: *not* "undef", |
490 | returns an empty list (NOTE: *not* "undef", which is a valid |
473 | which is a valid scalar), the original deserialised hash will be |
491 | scalar), the original deserialised hash will be inserted. This |
474 | inserted. This setting can slow down decoding considerably. |
492 | setting can slow down decoding considerably. |
475 | |
493 | |
476 | When $coderef is omitted or undefined, any existing callback will be |
494 | When $coderef is omitted or undefined, any existing callback will be |
477 | removed and "decode" will not change the deserialised hash in any |
495 | removed and "decode" will not change the deserialised hash in any |
478 | way. |
496 | way. |
479 | |
497 | |
… | |
… | |
622 | |
640 | |
623 | This is useful if your JSON texts are not delimited by an outer |
641 | This is useful if your JSON texts are not delimited by an outer |
624 | protocol and you need to know where the JSON text ends. |
642 | protocol and you need to know where the JSON text ends. |
625 | |
643 | |
626 | JSON::XS->new->decode_prefix ("[1] the tail") |
644 | JSON::XS->new->decode_prefix ("[1] the tail") |
627 | => ([], 3) |
645 | => ([1], 3) |
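A minimal sketch of how the return values of "decode_prefix" might be used to split off the trailing text. The input string is the one from the example above; the variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::XS;

# One JSON array followed by non-JSON trailing text.
my $input = '[1] the tail';

# decode_prefix returns the decoded value and the number of characters consumed.
my ($data, $consumed) = JSON::XS->new->decode_prefix ($input);

# Everything after the consumed prefix is left for the caller to handle.
my $tail = substr $input, $consumed;

printf "first element: %s, consumed: %d, tail: '%s'\n",
   $data->[0], $consumed, $tail;
```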
628 | |
646 | |
629 | INCREMENTAL PARSING |
647 | INCREMENTAL PARSING |
630 | In some cases, there is the need for incremental parsing of JSON texts. |
648 | In some cases, there is the need for incremental parsing of JSON texts. |
631 | While this module always has to keep both JSON text and resulting Perl |
649 | While this module always has to keep both JSON text and resulting Perl |
632 | data structure in memory at one time, it does allow you to parse a JSON |
650 | data structure in memory at one time, it does allow you to parse a JSON |
… | |
… | |
666 | can then use "incr_skip" to skip the erroneous part). This is the |
684 | can then use "incr_skip" to skip the erroneous part). This is the |
667 | most common way of using the method. |
685 | most common way of using the method. |
668 | |
686 | |
669 | And finally, in list context, it will try to extract as many objects |
687 | And finally, in list context, it will try to extract as many objects |
670 | from the stream as it can find and return them, or the empty list |
688 | from the stream as it can find and return them, or the empty list |
671 | otherwise. For this to work, there must be no separators between the |
689 | otherwise. For this to work, there must be no separators (other than |
672 | JSON objects or arrays, instead they must be concatenated |
690 | whitespace) between the JSON objects or arrays, instead they must be |
673 | back-to-back. If an error occurs, an exception will be raised as in |
691 | concatenated back-to-back. If an error occurs, an exception will be |
674 | the scalar context case. Note that in this case, any |
692 | raised as in the scalar context case. Note that in this case, any |
675 | previously-parsed JSON texts will be lost. |
693 | previously-parsed JSON texts will be lost. |
676 | |
694 | |
677 | Example: Parse some JSON arrays/objects in a given string and return |
695 | Example: Parse some JSON arrays/objects in a given string and return |
678 | them. |
696 | them. |
679 | |
697 | |
… | |
… | |
687 | function (I mean it. although in simple tests it might actually |
705 | function (I mean it. although in simple tests it might actually |
688 | work, it *will* fail under real world conditions). As a special |
706 | work, it *will* fail under real world conditions). As a special |
689 | exception, you can also call this method before having parsed |
707 | exception, you can also call this method before having parsed |
690 | anything. |
708 | anything. |
691 | |
709 | |
|
|
710 | That means you can only use this function to look at or manipulate |
|
|
711 | text before or after complete JSON objects, not while the parser is |
|
|
712 | in the middle of parsing a JSON object. |
|
|
713 | |
692 | This function is useful in two cases: a) finding the trailing text |
714 | This function is useful in two cases: a) finding the trailing text |
693 | after a JSON object or b) parsing multiple JSON objects separated by |
715 | after a JSON object or b) parsing multiple JSON objects separated by |
694 | non-JSON text (such as commas). |
716 | non-JSON text (such as commas). |
695 | |
717 | |
696 | $json->incr_skip |
718 | $json->incr_skip |
… | |
… | |
710 | This is useful if you want to repeatedly parse JSON objects and want |
732 | This is useful if you want to repeatedly parse JSON objects and want |
711 | to ignore any trailing data, which means you have to reset the |
733 | to ignore any trailing data, which means you have to reset the |
712 | parser after each successful decode. |
734 | parser after each successful decode. |
713 | |
735 | |
714 | LIMITATIONS |
736 | LIMITATIONS |
715 | All options that affect decoding are supported, except "allow_nonref". |
737 | The incremental parser is a non-exact parser: it works by gathering as |
716 | The reason for this is that it cannot be made to work sensibly: JSON |
738 | much text as possible that *could* be a valid JSON text, followed by |
717 | objects and arrays are self-delimited, i.e. you can concatenate them |
739 | trying to decode it. |
718 | back to back and still decode them perfectly. This does not hold true |
|
|
719 | for JSON numbers, however. |
|
|
720 | |
740 | |
721 | For example, is the string 1 a single JSON number, or is it simply the |
741 | That means it sometimes needs to read more data than strictly necessary |
722 | start of 12? Or is 12 a single JSON number, or the concatenation of 1 |
742 | to diagnose an invalid JSON text. For example, after parsing the |
723 | and 2? In neither case you can tell, and this is why JSON::XS takes the |
743 | following fragment, the parser *could* stop with an error, as this |
724 | conservative route and disallows this case. |
744 | fragment *cannot* be the beginning of a valid JSON text: |
|
|
745 | |
|
|
746 | [, |
|
|
747 | |
|
|
748 | In reality, however, the parser might continue to read data until a |
|
|
749 | length limit is exceeded or it finds a closing bracket. |
725 | |
750 | |
726 | EXAMPLES |
751 | EXAMPLES |
727 | Some examples will make all this clearer. First, a simple example that |
752 | Some examples will make all this clearer. First, a simple example that |
728 | works similarly to "decode_prefix": We want to decode the JSON object at |
753 | works similarly to "decode_prefix": We want to decode the JSON object at |
729 | the start of a string and identify the portion after the JSON object: |
754 | the start of a string and identify the portion after the JSON object: |
… | |
… | |
1041 | more). These values and the package/classname of the object will |
1066 | more). These values and the package/classname of the object will |
1042 | then be encoded as a tagged JSON value in the following format: |
1067 | then be encoded as a tagged JSON value in the following format: |
1043 | |
1068 | |
1044 | ("classname")[FREEZE return values...] |
1069 | ("classname")[FREEZE return values...] |
1045 | |
1070 | |
|
|
1071 | e.g.: |
|
|
1072 | |
|
|
1073 | ("URI")["http://www.google.com/"] |
|
|
1074 | ("MyDate")[2013,10,29] |
|
|
1075 | ("ImageData::JPEG")["Z3...VlCg=="] |
|
|
1076 | |
1046 | For example, the hypothetical "My::Object" "FREEZE" method might use |
1077 | For example, the hypothetical "My::Object" "FREEZE" method might use |
1047 | the object's "type" and "id" members to encode the object: |
1078 | the object's "type" and "id" members to encode the object: |
1048 | |
1079 | |
1049 | sub My::Object::FREEZE { |
1080 | sub My::Object::FREEZE { |
1050 | my ($self, $serialiser) = @_; |
1081 | my ($self, $serialiser) = @_; |
… | |
… | |
1156 | will expect your input strings to be encoded as UTF-8, that is, no |
1187 | will expect your input strings to be encoded as UTF-8, that is, no |
1157 | "character" of the input string must have any value > 255, as UTF-8 |
1188 | "character" of the input string must have any value > 255, as UTF-8 |
1158 | does not allow that. |
1189 | does not allow that. |
1159 | |
1190 | |
1160 | The "utf8" flag therefore switches between two modes: disabled means |
1191 | The "utf8" flag therefore switches between two modes: disabled means |
1161 | you will get a Unicode string in Perl, enabled means you get an |
1192 | you will get a Unicode string in Perl, enabled means you get a UTF-8 |
1162 | UTF-8 encoded octet/binary string in Perl. |
1193 | encoded octet/binary string in Perl. |
1163 | |
1194 | |
1164 | "latin1" or "ascii" flags enabled |
1195 | "latin1" or "ascii" flags enabled |
1165 | With "latin1" (or "ascii") enabled, "encode" will escape characters |
1196 | With "latin1" (or "ascii") enabled, "encode" will escape characters |
1166 | with ordinal values > 255 (> 127 with "ascii") and encode the |
1197 | with ordinal values > 255 (> 127 with "ascii") and encode the |
1167 | remaining characters as specified by the "utf8" flag. |
1198 | remaining characters as specified by the "utf8" flag. |
… | |
… | |
1423 | to see whether you are vulnerable to some common attack vectors (which |
1454 | to see whether you are vulnerable to some common attack vectors (which |
1424 | really are browser design bugs, but it is still you who will have to |
1455 | really are browser design bugs, but it is still you who will have to |
1425 | deal with it, as major browser developers care only for features, not |
1456 | deal with it, as major browser developers care only for features, not |
1426 | about getting security right). |
1457 | about getting security right). |
1427 | |
1458 | |
|
|
1459 | "OLD" VS. "NEW" JSON (RFC4627 VS. RFC7159) |
|
|
1460 | JSON originally required JSON texts to represent an array or object - |
|
|
1461 | scalar values were explicitly not allowed. This has changed, and |
|
|
1462 | versions of JSON::XS beginning with 4.0 reflect this by allowing scalar |
|
|
1463 | values by default. |
|
|
1464 | |
|
|
1465 | One reason why one might not want this is that this removes a |
|
|
1466 | fundamental property of JSON texts, namely that they are self-delimited |
|
|
1467 | and self-contained, or in other words, you could take any number of |
|
|
1468 | "old" JSON texts and paste them together, and the result would be |
|
|
1469 | unambiguously parseable: |
|
|
1470 | |
|
|
1471 | [1,3]{"k":5}[][null] # four JSON texts, without doubt |
|
|
1472 | |
|
|
1473 | By allowing scalars, this property is lost: in the following example, is |
|
|
1474 | this one JSON text (the number 12) or two JSON texts (the numbers 1 and |
|
|
1475 | 2): |
|
|
1476 | |
|
|
1477 | 12 # could be 12, or 1 and 2 |
|
|
1478 | |
|
|
1479 | Another lost property of "old" JSON is that no lookahead is required to |
|
|
1480 | know the end of a JSON text, i.e. the JSON text definitely ended at the |
|
|
1481 | last "]" or "}" character, there was no need to read extra characters. |
|
|
1482 | |
|
|
1483 | For example, a viable network protocol with "old" JSON was to simply |
|
|
1484 | exchange JSON texts without delimiter. For "new" JSON, you have to use a |
|
|
1485 | suitable delimiter (such as a newline) after every JSON text or ensure |
|
|
1486 | you never encode/decode scalar values. |
|
|
1487 | |
|
|
1488 | Most protocols do work by only transferring arrays or objects, and the |
|
|
1489 | easiest way to avoid problems with the "new" JSON definition is to |
|
|
1490 | explicitly disallow scalar values in your encoder and decoder: |
|
|
1491 | |
|
|
1492 | $json_coder = JSON::XS->new->allow_nonref (0) |
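As a rough illustration of what such a coder does in practice, the following sketch (assuming JSON::XS is installed; the sample values are made up for the example) shows arrays being accepted and scalars being rejected on both the encoding and the decoding side. Since "encode" and "decode" croak on error, the failing calls are wrapped in eval.

```perl
use strict;
use warnings;
use JSON::XS;

my $json_coder = JSON::XS->new->allow_nonref (0);

# Arrays and objects are still accepted.
my $ok = $json_coder->encode ([1, 2]);

# A bare scalar is rejected: encode croaks, so trap the exception.
my $text  = eval { $json_coder->encode ("Hello, World!") };
my $error = $@;
print defined $text ? "encoded: $text\n" : "rejected: $error";

# decode likewise croaks when given a scalar JSON text.
my $value = eval { $json_coder->decode ('12') };
print "decode rejected scalar\n" unless defined $value;
```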
|
|
1493 | |
|
|
1494 | This is a somewhat unhappy situation, and the blame can fully be put on |
|
|
1495 | JSON's inventor, Douglas Crockford, who unilaterally changed the format |
|
|
1496 | in 2006 without consulting the IETF, forcing the IETF to either fork the |
|
|
1497 | format or go with it (as I was told, the IETF wasn't amused). |
|
|
1498 | |
|
|
1499 | RELATIONSHIP WITH I-JSON |
|
|
1500 | JSON is a somewhat sloppily-defined format - it carries around obvious |
|
|
1501 | Javascript baggage, such as not really defining number range, probably |
|
|
1502 | because Javascript only has one type of numbers: IEEE 64 bit floats |
|
|
1503 | ("binary64"). |
|
|
1504 | |
|
|
1505 | For this reason, RFC7493 defines "Internet JSON", which is a restricted |
|
|
1506 | subset of JSON that is supposedly more interoperable on the internet. |
|
|
1507 | |
|
|
1508 | While "JSON::XS" does not offer specific support for I-JSON, it of |
|
|
1509 | course accepts valid I-JSON and by default implements some of the |
|
|
1510 | limitations of I-JSON, such as parsing numbers as perl numbers, which |
|
|
1511 | are usually a superset of binary64 numbers. |
|
|
1512 | |
|
|
1513 | To generate I-JSON, follow these rules: |
|
|
1514 | |
|
|
1515 | * always generate UTF-8 |
|
|
1516 | |
|
|
1517 | I-JSON must be encoded in UTF-8, the default for "encode_json". |
|
|
1518 | |
|
|
1519 | * numbers should be within IEEE 754 binary64 range |
|
|
1520 | |
|
|
1521 | Basically all existing perl installations use binary64 to represent |
|
|
1522 | floating point numbers, so all you need to do is to avoid large |
|
|
1523 | integers. |
|
|
1524 | |
|
|
1525 | * objects must not have duplicate keys |
|
|
1526 | |
|
|
1527 | This is trivially done, as "JSON::XS" does not allow duplicate keys. |
|
|
1528 | |
|
|
1529 | * do not generate scalar JSON texts, use "->allow_nonref (0)" |
|
|
1530 | |
|
|
1531 | I-JSON strongly requests you to only encode arrays and objects into |
|
|
1532 | JSON. |
|
|
1533 | |
|
|
1534 | * times should be strings in ISO 8601 format |
|
|
1535 | |
|
|
1536 | There are a myriad of modules on CPAN dealing with ISO 8601 - search |
|
|
1537 | for "ISO8601" on CPAN and use one. |
|
|
1538 | |
|
|
1539 | * encode binary data as base64 |
|
|
1540 | |
|
|
1541 | While it's tempting to just dump binary data as a string (and let |
|
|
1542 | "JSON::XS" do the escaping), for I-JSON, it's *recommended* to |
|
|
1543 | encode binary data as base64. |
|
|
1544 | |
|
|
1545 | There are some other considerations - read RFC7493 for the details if |
|
|
1546 | interested. |
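The rules above might be combined along the following lines. This is a sketch under stated assumptions, not a complete I-JSON implementation: the record layout and field names are hypothetical, and MIME::Base64 and POSIX are used here merely as one possible way to produce base64 and ISO 8601 strings.

```perl
use strict;
use warnings;
use JSON::XS;
use MIME::Base64 qw(encode_base64);
use POSIX qw(strftime);

# utf8 for UTF-8 output, allow_nonref (0) so only arrays/objects are emitted.
my $ijson = JSON::XS->new->utf8->allow_nonref (0);

# Hypothetical record: field names and values are illustrative only.
my $binary = "\x00\x01\xfe\xff";
my $record = {
   created => strftime ("%Y-%m-%dT%H:%M:%SZ", gmtime 0),  # ISO 8601 string
   payload => encode_base64 ($binary, ""),                # base64, no line breaks
   count   => 42,                                         # small integer, fits binary64
};

my $json_text = $ijson->encode ($record);
print "$json_text\n";
```

Duplicate keys cannot occur here because the record is a Perl hash, and keeping "count" a small integer avoids the large-integer problem mentioned above.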
|
|
1547 | |
1428 | INTEROPERABILITY WITH OTHER MODULES |
1548 | INTEROPERABILITY WITH OTHER MODULES |
1429 | "JSON::XS" uses the Types::Serialiser module to provide boolean |
1549 | "JSON::XS" uses the Types::Serialiser module to provide boolean |
1430 | constants. That means that the JSON true and false values will be |
1550 | constants. That means that the JSON true and false values will be |
1431 | compatible to true and false values of iother modules that do the same, |
1551 | compatible to true and false values of other modules that do the same, |
1432 | such as JSON::PP and CBOR::XS. |
1552 | such as JSON::PP and CBOR::XS. |
1433 | |
1553 | |
|
|
1554 | INTEROPERABILITY WITH OTHER JSON DECODERS |
|
|
1555 | As long as you only serialise data that can be directly expressed in |
|
|
1556 | JSON, "JSON::XS" is incapable of generating invalid JSON output (modulo |
|
|
1557 | bugs, but "JSON::XS" has found more bugs in the official JSON testsuite |
|
|
1558 | (1) than the official JSON testsuite has found in "JSON::XS" (0)). |
|
|
1559 | |
|
|
1560 | When you have trouble decoding JSON generated by this module using other |
|
|
1561 | decoders, then it is very likely that you have an encoding mismatch or |
|
|
1562 | the other decoder is broken. |
|
|
1563 | |
|
|
1564 | When decoding, "JSON::XS" is strict by default and will likely catch all |
|
|
1565 | errors. There are currently two settings that change this: "relaxed" |
|
|
1566 | makes "JSON::XS" accept (but not generate) some non-standard extensions, |
|
|
1567 | and "allow_tags" will allow you to encode and decode Perl objects, at |
|
|
1568 | the cost of not outputting valid JSON anymore. |
|
|
1569 | |
|
|
1570 | TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS |
|
|
1571 | When you use "allow_tags" to use the extended (and also nonstandard and |
|
|
1572 | invalid) JSON syntax for serialised objects, and you still want to |
|
|
1573 | decode the generated text with a standard JSON decoder, you can run a |
|
|
1574 | regex to replace the tagged syntax by standard JSON arrays (it only |
|
|
1575 | works for "normal" package names without comma, newlines or single |
|
|
1576 | colons). First, the readable Perl version: |
|
|
1577 | |
|
|
1578 | # if your FREEZE methods return no values, you need this replace first: |
|
|
1579 | $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx; |
|
|
1580 | |
|
|
1581 | # this works for non-empty constructor arg lists: |
|
|
1582 | $json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx; |
|
|
1583 | |
|
|
1584 | And here is a less readable version that is easy to adapt to other |
|
|
1585 | languages: |
|
|
1586 | |
|
|
1587 | $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g; |
|
|
1588 | |
|
|
1589 | Here is an ECMAScript version (same regex): |
|
|
1590 | |
|
|
1591 | json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,"); |
|
|
1592 | |
|
|
1593 | Since this syntax converts to standard JSON arrays, it might be hard to |
|
|
1594 | distinguish serialised objects from normal arrays. You can prepend a |
|
|
1595 | "magic number" as first array element to reduce chances of a collision: |
|
|
1596 | |
|
|
1597 | $json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g; |
|
|
1598 | |
|
|
1599 | And after decoding the JSON text, you could walk the data structure |
|
|
1600 | looking for arrays with a first element of |
|
|
1601 | "XU1peReLzT4ggEllLanBYq4G9VzliwKF". |
|
|
1602 | |
|
|
1603 | The same approach can be used to create the tagged format with another |
|
|
1604 | encoder. First, you create an array with the magic string as first |
|
|
1605 | member, the classname as second, and constructor arguments last, encode |
|
|
1606 | it as part of your JSON structure, and then: |
|
|
1607 | |
|
|
1608 | $json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g; |
|
|
1609 | |
|
|
1610 | Again, this has some limitations - the magic string must not be encoded |
|
|
1611 | with character escapes, and the constructor arguments must be non-empty. |
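To see the two substitutions from this section working together, here is a small round-trip sketch in plain Perl (no JSON module needed). The sample tagged text is made up from the example values shown earlier in this document, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $magic = "XU1peReLzT4ggEllLanBYq4G9VzliwKF";

# Sample text in the tagged format, with non-empty argument lists.
my $tagged = '[("MyDate")[2013,10,29],("URI")["http://www.google.com/"]]';

# Tagged syntax -> standard JSON arrays, with the magic string prepended.
(my $plain = $tagged)
   =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["$magic",$1,/g;

# Standard JSON arrays -> tagged syntax again.
(my $restored = $plain)
   =~ s/\[\s*"\Q$magic\E"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;

print "$plain\n$restored\n";
```

The intermediate $plain string can be handed to any standard JSON decoder. The same limitations as above apply: only "normal" package names and non-empty argument lists survive the round trip.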
|
|
1612 | |
1434 | THREADS |
1613 | (I-)THREADS |
1435 | This module is *not* guaranteed to be thread safe and there are no plans |
1614 | This module is *not* guaranteed to be ithread (or MULTIPLICITY-) safe |
1436 | to change this until Perl gets thread support (as opposed to the |
1615 | and there are no plans to change this. Note that perl's builtin |
1437 | horribly slow so-called "threads" which are simply slow and bloated |
1616 | so-called threads/ithreads are officially deprecated and should not be |
1438 | process simulations - use fork, it's *much* faster, cheaper, better). |
1617 | used. |
1439 | |
|
|
1440 | (It might actually work, but you have been warned). |
|
|
1441 | |
1618 | |
1442 | THE PERILS OF SETLOCALE |
1619 | THE PERILS OF SETLOCALE |
1443 | Sometimes people avoid the Perl locale support and directly call the |
1620 | Sometimes people avoid the Perl locale support and directly call the |
1444 | system's setlocale function with "LC_ALL". |
1621 | system's setlocale function with "LC_ALL". |
1445 | |
1622 | |
… | |
… | |
1453 | |
1630 | |
1454 | If you need "LC_NUMERIC", you should enable it only around the code that |
1631 | If you need "LC_NUMERIC", you should enable it only around the code that |
1455 | actually needs it (avoiding stringification of numbers), and restore it |
1632 | actually needs it (avoiding stringification of numbers), and restore it |
1456 | afterwards. |
1633 | afterwards. |
1457 | |
1634 | |
|
|
1635 | SOME HISTORY |
|
|
1636 | At the time this module was created there already were a number of JSON |
|
|
1637 | modules available on CPAN, so what was the reason to write yet another |
|
|
1638 | JSON module? While it seems there are many JSON modules, none of them |
|
|
1639 | correctly handled all corner cases, and in most cases their maintainers |
|
|
1640 | are unresponsive, gone missing, or not listening to bug reports for |
|
|
1641 | other reasons. |
|
|
1642 | |
|
|
1643 | Beginning with version 2.0 of the JSON module, when both JSON and |
|
|
1644 | JSON::XS are installed, then JSON will fall back on JSON::XS (this can |
|
|
1645 | be overridden) with no overhead due to emulation (by inheriting |
|
|
1646 | constructor and methods). If JSON::XS is not available, it will fall |
|
|
1647 | back to the compatible JSON::PP module as backend, so using JSON instead |
|
|
1648 | of JSON::XS gives you a portable JSON API that can be fast when you need |
|
|
1649 | it and doesn't require a C compiler when that is a problem. |
|
|
1650 | |
|
|
1651 | Somewhere around version 3, this module was forked into |
|
|
1652 | "Cpanel::JSON::XS", because its maintainer had serious trouble |
|
|
1653 | understanding JSON and insisted on a fork with many bugs "fixed" that |
|
|
1654 | weren't actually bugs, while spreading FUD about this module without |
|
|
1655 | actually giving any details on his accusations. You be the judge, but in |
|
|
1656 | my personal opinion, if you want quality, you will stay away from |
|
|
1657 | dangerous forks like that. |
|
|
1658 | |
1458 | BUGS |
1659 | BUGS |
1459 | While the goal of this module is to be correct, that unfortunately does |
1660 | While the goal of this module is to be correct, that unfortunately does |
1460 | not mean it's bug-free, only that I think its design is bug-free. If you |
1661 | not mean it's bug-free, only that I think its design is bug-free. If you |
1461 | keep reporting bugs they will be fixed swiftly, though. |
1662 | keep reporting bugs they will be fixed swiftly, though. |
1462 | |
1663 | |