1 | NAME |
1 | NAME |
2 | Convert::Scalar - convert between different representations of perl |
2 | JSON::XS - JSON serialising/deserialising, done correctly and fast |
3 | scalars |
|
|
4 | |
3 | |
5 | SYNOPSIS |
4 | SYNOPSIS |
6 | use Convert::Scalar; |
5 | use JSON::XS; |
7 | |
6 | |
8 | DESCRIPTION |
7 | DESCRIPTION |
9 | This module exports various internal perl methods that change the |
8 | This module converts Perl data structures to JSON and vice versa. Its |
10 | internal representation or state of a perl scalar. All of these work |
9 | primary goal is to be *correct* and its secondary goal is to be *fast*. |
11 | in-place, that is, they modify their scalar argument. No functions are |
10 | To reach the latter goal it was written in C. |
|
|
11 | |
|
|
12 | As this is the n-th-something JSON module on CPAN, what was the reason |
|
|
13 | to write yet another JSON module? While it seems there are many JSON |
|
|
14 | modules, none of them correctly handle all corner cases, and in most |
|
|
15 | cases their maintainers are unresponsive, gone missing, or not listening |
|
|
16 | to bug reports for other reasons. |
|
|
17 | |
|
|
18 | See COMPARISON, below, for a comparison to some other JSON modules. |
|
|
19 | |
|
|
20 | FEATURES |
|
|
21 | * correct handling of unicode issues |
|
|
22 | This module knows how to handle Unicode, and even documents how it |
|
|
23 | does so. |
|
|
24 | |
|
|
25 | * round-trip integrity |
|
|
26 | When you serialise a perl data structure using only datatypes |
|
|
27 | supported by JSON, the deserialised data structure is identical on |
|
|
28 | the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"). |
|
|
29 | |
|
|
30 | * strict checking of JSON correctness |
|
|
31 | There is no guessing, no generating of illegal JSON strings by |
|
|
32 | default, and only JSON is accepted as input (the latter is a |
|
|
33 | security feature). |
|
|
34 | |
|
|
35 | * fast |
|
|
36 | Compared to other JSON modules, this module compares favourably. |
|
|
37 | |
|
|
38 | * simple to use |
|
|
39 | This module has both a simple functional interface as well as an OO |
|
|
40 | interface. |
|
|
41 | |
|
|
42 | * reasonably versatile output formats |
|
|
43 | You can choose between the most compact format possible, a |
|
|
44 | pure-ascii format, or a pretty-printed format. Or you can combine |
|
|
45 | those features in whatever way you like. |
|
|
46 | |
|
|
47 | FUNCTIONAL INTERFACE |
|
|
48 | The following convenience methods are provided by this module. They are |
12 | exported by default. |
49 | exported by default: |
13 | |
50 | |
14 | The following export tags exist: |
51 | $json_string = to_json $perl_scalar |
|
|
52 | Converts the given Perl data structure (a simple scalar or a |
|
|
53 | reference to a hash or array) to a UTF-8 encoded, binary string |
|
|
54 | (that is, the string contains octets only). Croaks on error. |
15 | |
55 | |
16 | :utf8 all functions with utf8 in their name |
56 | This function call is functionally identical to "JSON::XS->new->utf8 |
17 | :taint all functions with taint in their name |
57 | (1)->encode ($perl_scalar)". |
18 | :refcnt all functions with refcnt in their name |
|
|
19 | :ok all *ok-functions. |
|
|
20 | |
58 | |
21 | utf8 scalar[, mode] |
59 | $perl_scalar = from_json $json_string |
22 | Returns true when the given scalar is marked as utf8, false |
60 | The opposite of "to_json": expects a UTF-8 (binary) string and |
23 | otherwise. If the optional mode argument is given, also forces the |
61 | tries to parse that as a UTF-8 encoded JSON string, returning the |
24 | interpretation of the string to utf8 (mode true) or plain bytes |
62 | resulting simple scalar or reference. Croaks on error. |
25 | (mode false). The actual (byte-) content is not changed. The return |
|
|
26 | value always reflects the state before any modification is done. |
|
|
27 | |
63 | |
28 | This function is useful when you "import" utf8-data into perl, or |
64 | This function call is functionally identical to "JSON::XS->new->utf8 |
29 | when some external function (e.g. storing/retrieving from a |
65 | (1)->decode ($json_string)". |
30 | database) removes the utf8-flag. |
|
|
31 | |
66 | |
32 | utf8_on scalar |
67 | OBJECT-ORIENTED INTERFACE |
33 | Similar to "utf8 scalar, 1", but additionally returns the scalar |
68 | The object oriented interface lets you configure your own encoding or |
34 | (the argument is still modified in-place). |
69 | decoding style, within the limits of supported formats. |
35 | |
70 | |
36 | utf8_off scalar |
71 | $json = new JSON::XS |
37 | Similar to "utf8 scalar, 0", but additionally returns the scalar |
72 | Creates a new JSON::XS object that can be used to de/encode JSON |
38 | (the argument is still modified in-place). |
73 | strings. All boolean flags described below are by default |
|
|
74 | *disabled*. |
39 | |
75 | |
40 | utf8_valid scalar [Perl 5.7] |
76 | The mutators for flags all return the JSON object again and thus |
41 | Returns true if the bytes inside the scalar form a valid utf8 |
77 | calls can be chained: |
42 | string, false otherwise (the check is independent of the actual |
|
|
43 | encoding perl thinks the string is in). |
|
|
44 | |
78 | |
45 | utf8_upgrade scalar |
79 | my $json = JSON::XS->new->utf8(1)->space_after(1)->encode ({a => [1,2]}) |
46 | Convert the string content of the scalar in-place to its |
80 | => {"a": [1, 2]} |
47 | UTF8-encoded form (and also returns it). |
|
|
48 | |
81 | |
49 | utf8_downgrade scalar[, fail_ok=0] |
82 | $json = $json->ascii ($enable) |
50 | Attempt to convert the string content of the scalar from |
83 | If $enable is true, then the "encode" method will not generate |
51 | UTF8-encoded to ISO-8859-1. This may not be possible if the string |
84 | characters outside the code range 0..127. Any unicode characters |
52 | contains characters that cannot be represented in a single byte; if |
85 | outside that range will be escaped using either a single \uXXXX (BMP |
53 | this is the case, it leaves the scalar unchanged and either returns |
86 | characters) or a double \uHHHH\uLLLL escape sequence, as per |
54 | false or, if "fail_ok" is not true (the default), croaks. |
87 | RFC4627. |
55 | |
88 | |
56 | utf8_encode scalar |
89 | If $enable is false, then the "encode" method will not escape |
57 | Convert the string value of the scalar to UTF8-encoded, but then |
90 | Unicode characters unless necessary. |
58 | turn off the "SvUTF8" flag so that it looks like bytes to perl |
|
|
59 | again. (Might be removed in future versions). |
|
|
60 | |
91 | |
61 | utf8_length scalar |
92 | JSON::XS->new->ascii (1)->encode (chr 0x10401) |
62 | Returns the number of characters in the string, counting wide UTF8 |
93 | => \ud801\udc01 |
63 | characters as a single character, independent of whether the scalar |
|
|
64 | is marked as containing bytes or multibyte characters. |
|
|
65 | |
94 | |
66 | unmagic scalar, type |
95 | $json = $json->utf8 ($enable) |
67 | Remove the specified magic from the scalar (DANGEROUS!). |
96 | If $enable is true, then the "encode" method will encode the JSON |
|
|
97 | string into UTF-8, as required by many protocols, while the "decode" |
|
|
98 | method expects to be handed a UTF-8-encoded string. Please note |
|
|
99 | that UTF-8-encoded strings do not contain any characters outside the |
|
|
100 | range 0..255, they are thus useful for bytewise/binary I/O. |
68 | |
101 | |
69 | weaken scalar |
102 | If $enable is false, then the "encode" method will return the JSON |
70 | Weaken a reference. (See also WeakRef). |
103 | string as a (non-encoded) unicode string, while "decode" expects |
|
|
104 | thus a unicode string. Any decoding or encoding (e.g. to UTF-8 or |
|
|
105 | UTF-16) needs to be done yourself, e.g. using the Encode module. |
71 | |
106 | |
72 | taint scalar |
107 | $json = $json->pretty ($enable) |
73 | Taint the scalar. |
108 | This enables (or disables) all of the "indent", "space_before" and |
|
|
109 | "space_after" (and in the future possibly more) flags in one call to |
|
|
110 | generate the most readable (or most compact) form possible. |
74 | |
111 | |
75 | tainted scalar |
112 | my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]}) |
76 | Returns true when the scalar is tainted, false otherwise. |
113 | => |
|
|
114 | { |
|
|
115 | "a" : [ |
|
|
116 | 1, |
|
|
117 | 2 |
|
|
118 | ] |
|
|
119 | } |
77 | |
120 | |
78 | untaint scalar |
121 | $json = $json->indent ($enable) |
79 | Remove the tainted flag from the specified scalar. |
122 | If $enable is true, then the "encode" method will use a multiline |
|
|
123 | format as output, putting every array member or object/hash |
|
|
124 | key-value pair into its own line, indenting them properly. |
80 | |
125 | |
81 | grow scalar, newlen |
126 | If $enable is false, no newlines or indenting will be produced, and |
82 | Sets the memory area used for the scalar to the given length, if the |
127 | the resulting JSON string is guaranteed not to contain any |
83 | current length is less than the new value. This does not affect the |
128 | "newlines". |
84 | contents of the scalar, but is only useful to "pre-allocate" memory |
|
|
85 | space if you know the scalar will grow. The return value is the |
|
|
86 | modified scalar (the scalar is modified in-place). |
|
|
87 | |
129 | |
88 | refcnt scalar[, newrefcnt] |
130 | This setting has no effect when decoding JSON strings. |
89 | Returns the current reference count of the given scalar and |
|
|
90 | optionally sets it to the given reference count. |
|
|
91 | |
131 | |
92 | refcnt_inc scalar |
132 | $json = $json->space_before ($enable) |
93 | Increments the reference count of the given scalar inplace. |
133 | If $enable is true, then the "encode" method will add an extra |
|
|
134 | optional space before the ":" separating keys from values in JSON |
|
|
135 | objects. |
94 | |
136 | |
95 | refcnt_dec scalar |
137 | If $enable is false, then the "encode" method will not add any extra |
96 | Decrements the reference count of the given scalar inplace. Use |
138 | space at those places. |
97 | "weaken" instead if you understand what this function is for. |
|
|
98 | Better yet: don't use this module in this case. |
|
|
99 | |
139 | |
100 | refcnt_rv scalar[, newrefcnt] |
140 | This setting has no effect when decoding JSON strings. You will also |
101 | Works like "refcnt", but dereferences the given reference first. |
141 | most likely combine this setting with "space_after". |
102 | This is useful to find the reference count of arrays or hashes, |
|
|
103 | which cannot be passed directly. Remember that taking a reference of |
|
|
104 | some object increases its reference count, so the reference count |
|
|
105 | used by the *_rv-functions tends to be one higher. |
|
|
106 | |
142 | |
107 | refcnt_inc_rv scalar |
143 | $json = $json->space_after ($enable) |
108 | Works like "refcnt_inc", but dereferences the given reference first. |
144 | If $enable is true, then the "encode" method will add an extra |
|
|
145 | optional space after the ":" separating keys from values in JSON |
|
|
146 | objects and extra whitespace after the "," separating key-value |
|
|
147 | pairs and array members. |
109 | |
148 | |
110 | refcnt_dec_rv scalar |
149 | If $enable is false, then the "encode" method will not add any extra |
111 | Works like "refcnt_dec", but dereferences the given reference first. |
150 | space at those places. |
112 | |
151 | |
113 | ok scalar |
152 | This setting has no effect when decoding JSON strings. |
114 | uok scalar |
|
|
115 | rok scalar |
|
|
116 | pok scalar |
|
|
117 | nok scalar |
|
|
118 | niok scalar |
|
|
119 | Calls SvOK, SvUOK, SvROK, SvPOK, SvNOK or SvNIOK on the given |
|
|
120 | scalar, respectively. |
|
|
121 | |
153 | |
122 | CANDIDATES FOR FUTURE RELEASES |
154 | $json = $json->canonical ($enable) |
123 | The following API functions (perlapi) are considered for future |
155 | If $enable is true, then the "encode" method will output JSON |
124 | inclusion in this module. If you want them, write me. |
156 | objects by sorting their keys. This adds a comparatively high |
|
|
157 | overhead. |
125 | |
158 | |
126 | sv_upgrade |
159 | If $enable is false, then the "encode" method will output key-value |
127 | sv_pvn_force |
160 | pairs in the order Perl stores them (which will likely change |
128 | sv_pvutf8n_force |
161 | between runs of the same script). |
129 | the sv2xx family |
162 | |
|
|
163 | This option is useful if you want the same data structure to be |
|
|
164 | encoded as the same JSON string (given the same overall settings). |
|
|
165 | If it is disabled, the same hash might be encoded differently even if it |
|
|
166 | contains the same data, as key-value pairs have no inherent ordering |
|
|
167 | in Perl. |
|
|
168 | |
|
|
169 | This setting has no effect when decoding JSON strings. |
|
|
170 | |
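The effect of "canonical" described above can be sketched as follows. This is a minimal illustration that assumes JSON::XS is installed; the hash and variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::XS;

# The same data, written with the keys in two different orders
# (hypothetical example data).
my %first  = (b => 1, a => 2, c => 3);
my %second = (c => 3, a => 2, b => 1);

# With canonical enabled, keys are sorted before output, so both
# hashes serialise to the same JSON text.
my $coder       = JSON::XS->new->canonical (1);
my $json_first  = $coder->encode (\%first);
my $json_second = $coder->encode (\%second);

print "$json_first\n";
print $json_first eq $json_second ? "identical\n" : "different\n";
```

Comparing the two strings with "eq" only makes sense with the flag on; without it the key order is whatever Perl's hash implementation happens to produce.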
|
|
171 | $json = $json->allow_nonref ($enable) |
|
|
172 | If $enable is true, then the "encode" method can convert a |
|
|
173 | non-reference into its corresponding string, number or null JSON |
|
|
174 | value, which is an extension to RFC4627. Likewise, "decode" will |
|
|
175 | accept those JSON values instead of croaking. |
|
|
176 | |
|
|
177 | If $enable is false, then the "encode" method will croak if it isn't |
|
|
178 | passed an arrayref or hashref, as JSON strings must either be an |
|
|
179 | object or array. Likewise, "decode" will croak if given something |
|
|
180 | that is not a JSON object or array. |
|
|
181 | |
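A short sketch of both "allow_nonref" settings, assuming JSON::XS is installed. The flag is set explicitly in each case so the example does not rely on the default; variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::XS;

# Flag off: only arrayrefs and hashrefs are accepted, so encoding a
# plain string croaks. eval traps the exception.
my $strict    = JSON::XS->new->allow_nonref (0);
my $strict_ok = eval { $strict->encode ("hello"); 1 } ? 1 : 0;

# Flag on: a plain string becomes a JSON string value and can be
# decoded back again.
my $relaxed = JSON::XS->new->allow_nonref (1);
my $json    = $relaxed->encode ("hello");
my $back    = $relaxed->decode ($json);

print "strict encode succeeded: $strict_ok\n";
print "relaxed encode gave: $json\n";
print "decoded back to: $back\n";
```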
|
|
182 | $json_string = $json->encode ($perl_scalar) |
|
|
183 | Converts the given Perl data structure (a simple scalar or a |
|
|
184 | reference to a hash or array) to its JSON representation. Simple |
|
|
185 | scalars will be converted into JSON string or number sequences, |
|
|
186 | while references to arrays become JSON arrays and references to |
|
|
187 | hashes become JSON objects. Undefined Perl values (e.g. "undef") |
|
|
188 | become JSON "null" values. Neither "true" nor "false" values will be |
|
|
189 | generated. |
|
|
190 | |
|
|
191 | $perl_scalar = $json->decode ($json_string) |
|
|
192 | The opposite of "encode": expects a JSON string and tries to parse |
|
|
193 | it, returning the resulting simple scalar or reference. Croaks on |
|
|
194 | error. |
|
|
195 | |
|
|
196 | JSON numbers and strings become simple Perl scalars. JSON arrays |
|
|
197 | become Perl arrayrefs and JSON objects become Perl hashrefs. "true" |
|
|
198 | becomes 1, "false" becomes 0 and "null" becomes "undef". |
|
|
199 | |
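The mappings used by "encode" and "decode" can be tried out with a small round trip. This is a sketch under the assumption that JSON::XS is installed; the data structure is invented for the example, and "canonical" is enabled only to make the key order predictable.

```perl
use strict;
use warnings;
use JSON::XS;

my $coder = JSON::XS->new->utf8 (1)->canonical (1);

# A hashref holding a string that looks like a number, an arrayref
# and an undefined value (hypothetical example data).
my $data = { name => "2.0", list => [1, 2, 3], nothing => undef };

my $json = $coder->encode ($data);   # hashref -> JSON object
my $copy = $coder->decode ($json);   # JSON object -> hashref

print "$json\n";
print "name is still the string '$copy->{name}'\n";
print "nothing is ", (defined $copy->{nothing} ? "defined" : "undef"), "\n";
```

The string "2.0" is written with quotes and comes back unchanged, which is the round-trip behaviour the FEATURES section refers to.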
|
|
200 | COMPARISON |
|
|
201 | As already mentioned, this module was created because none of the |
|
|
202 | existing JSON modules could be made to work correctly. First I will |
|
|
203 | describe the problems (or pleasures) I encountered with various existing |
|
|
204 | JSON modules, followed by some benchmark values. JSON::XS was designed |
|
|
205 | not to suffer from any of these problems or limitations. |
|
|
206 | |
|
|
207 | JSON |
|
|
208 | Slow (but very portable, as it is written in pure Perl). |
|
|
209 | |
|
|
210 | Undocumented/buggy Unicode handling (how JSON handles unicode values |
|
|
211 | is undocumented. One can get far by feeding it unicode strings and |
|
|
212 | doing en-/decoding oneself, but unicode escapes are not working |
|
|
213 | properly). |
|
|
214 | |
|
|
215 | No roundtripping (strings get clobbered if they look like numbers, |
|
|
216 | e.g. the string 2.0 will encode to 2.0 instead of "2.0", and that |
|
|
217 | will decode into the number 2). |
|
|
218 | |
|
|
219 | JSON::PC |
|
|
220 | Very fast. |
|
|
221 | |
|
|
222 | Undocumented/buggy Unicode handling. |
|
|
223 | |
|
|
224 | No roundtripping. |
|
|
225 | |
|
|
226 | Has problems handling many Perl values (e.g. regex results and other |
|
|
227 | magic values will make it croak). |
|
|
228 | |
|
|
229 | Does not even generate valid JSON ("{1,2}" gets converted to "{1:2}" |
|
|
230 | which is not a valid JSON string). |
|
|
231 | |
|
|
232 | Unmaintained (maintainer unresponsive for many months, bugs are not |
|
|
233 | getting fixed). |
|
|
234 | |
|
|
235 | JSON::Syck |
|
|
236 | Very buggy (often crashes). |
|
|
237 | |
|
|
238 | Very inflexible (no human-readable format supported, format pretty |
|
|
239 | much undocumented. I need at least a format for easy reading by |
|
|
240 | humans and a single-line compact format for use in a protocol, and |
|
|
241 | preferably a way to generate ASCII-only JSON strings). |
|
|
242 | |
|
|
243 | Completely broken (and confusingly documented) Unicode handling |
|
|
244 | (unicode escapes are not working properly, you need to set |
|
|
245 | ImplicitUnicode to *different* values on en- and decoding to get |
|
|
246 | symmetric behaviour). |
|
|
247 | |
|
|
248 | No roundtripping (simple cases work, but this depends on whether the |
|
|
249 | scalar value was used in a numeric context or not). |
|
|
250 | |
|
|
251 | Dumping hashes may skip hash values depending on iterator state. |
|
|
252 | |
|
|
253 | Unmaintained (maintainer unresponsive for many months, bugs are not |
|
|
254 | getting fixed). |
|
|
255 | |
|
|
256 | Does not check input for validity (i.e. will accept non-JSON input |
|
|
257 | and return "something" instead of raising an exception. This is a |
|
|
258 | security issue: imagine two banks transferring money between each |
|
|
259 | other using JSON. One bank might parse a given non-JSON request and |
|
|
260 | deduct money, while the other might reject the transaction with a |
|
|
261 | syntax error. While a good protocol will at least recover, that is |
|
|
262 | extra unnecessary work and the transaction will still not succeed). |
|
|
263 | |
|
|
264 | JSON::DWIW |
|
|
265 | Very fast. Very natural. Very nice. |
|
|
266 | |
|
|
267 | Undocumented unicode handling (but the best of the pack. Unicode |
|
|
268 | escapes still don't get parsed properly). |
|
|
269 | |
|
|
270 | Very inflexible. |
|
|
271 | |
|
|
272 | No roundtripping. |
|
|
273 | |
|
|
274 | Does not generate valid JSON (key strings are often unquoted, empty |
|
|
275 | keys result in nothing being output). |
|
|
276 | |
|
|
277 | Does not check input for validity. |
|
|
278 | |
|
|
279 | SPEED |
|
|
280 | It seems that JSON::XS is surprisingly fast, as shown in the following |
|
|
281 | tables. They have been generated with the help of the "eg/bench" program |
|
|
282 | in the JSON::XS distribution, to make it easy to compare on your own |
|
|
283 | system. |
|
|
284 | |
|
|
285 | First is a comparison between various modules using a very simple JSON |
|
|
286 | string, showing the number of encodes/decodes per second (JSON::XS is |
|
|
287 | the functional interface, while JSON::XS/2 is the OO interface with |
|
|
288 | pretty-printing and hashkey sorting enabled). |
|
|
289 | |
|
|
290 | module | encode | decode | |
|
|
291 | -----------|------------|------------| |
|
|
292 | JSON | 14006 | 6820 | |
|
|
293 | JSON::DWIW | 200937 | 120386 | |
|
|
294 | JSON::PC | 85065 | 129366 | |
|
|
295 | JSON::Syck | 59898 | 44232 | |
|
|
296 | JSON::XS | 1171478 | 342435 | |
|
|
297 | JSON::XS/2 | 730760 | 328714 | |
|
|
298 | -----------+------------+------------+ |
|
|
299 | |
|
|
300 | That is, JSON::XS is 6 times faster than JSON::DWIW and about 80 |
|
|
301 | times faster than JSON, even with pretty-printing and key sorting. |
|
|
302 | |
|
|
303 | Using a longer test string (roughly 8KB, generated from Yahoo! Locals |
|
|
304 | search API (http://nanoref.com/yahooapis/mgPdGg)): |
|
|
305 | |
|
|
306 | module | encode | decode | |
|
|
307 | -----------|------------|------------| |
|
|
308 | JSON | 673 | 38 | |
|
|
309 | JSON::DWIW | 5271 | 770 | |
|
|
310 | JSON::PC | 9901 | 2491 | |
|
|
311 | JSON::Syck | 2360 | 786 | |
|
|
312 | JSON::XS | 37398 | 3202 | |
|
|
313 | JSON::XS/2 | 13765 | 3153 | |
|
|
314 | -----------+------------+------------+ |
|
|
315 | |
|
|
316 | Again, JSON::XS leads by far in the encoding case, while still beating |
|
|
317 | every other module in the decoding case. |
|
|
318 | |
|
|
319 | Last example is an almost 8MB large hash with many large binary values |
|
|
320 | (PNG files), resulting in a lot of escaping: |
|
|
321 | |
|
|
322 | BUGS |
|
|
323 | While the goal of this module is to be correct, that unfortunately does |
|
|
324 | not mean it's bug-free, only that I think its design is bug-free. It is |
|
|
325 | still very young and not well-tested. If you keep reporting bugs they |
|
|
326 | will be fixed swiftly, though. |
130 | |
327 | |
131 | AUTHOR |
328 | AUTHOR |
132 | Marc Lehmann <schmorp@schmorp.de> |
329 | Marc Lehmann <schmorp@schmorp.de> |
133 | http://home.schmorp.de/ |
330 | http://home.schmorp.de/ |
134 | |
331 | |