[ViewVC] Diff of: cvs/JSON-XS/XS.pm

Comparing JSON-XS/XS.pm (file contents):
Revision 1.114 by root, Wed Jan 21 05:34:08 2009 UTC vs.
Revision 1.171 by root, Thu Nov 15 22:49:06 2018 UTC

…		…
35		35
36	This module converts Perl data structures to JSON and vice versa. Its	36	This module converts Perl data structures to JSON and vice versa. Its
37	primary goal is to be I<correct> and its secondary goal is to be	37	primary goal is to be I<correct> and its secondary goal is to be
38	I<fast>. To reach the latter goal it was written in C.	38	I<fast>. To reach the latter goal it was written in C.
39		39
40	Beginning with version 2.0 of the JSON module, when both JSON and
41	JSON::XS are installed, then JSON will fall back on JSON::XS (this can be
42	overridden) with no overhead due to emulation (by inheriting constructor
43	and methods). If JSON::XS is not available, it will fall back to the
44	compatible JSON::PP module as backend, so using JSON instead of JSON::XS
45	gives you a portable JSON API that can be fast when you need and doesn't
46	require a C compiler when that is a problem.
47
48	As this is the n-th-something JSON module on CPAN, what was the reason
49	to write yet another JSON module? While it seems there are many JSON
50	modules, none of them correctly handle all corner cases, and in most cases
51	their maintainers are unresponsive, gone missing, or not listening to bug
52	reports for other reasons.
53
54	See MAPPING, below, on how JSON::XS maps perl values to JSON values and	40	See MAPPING, below, on how JSON::XS maps perl values to JSON values and
55	vice versa.	41	vice versa.
56		42
57	=head2 FEATURES	43	=head2 FEATURES
58		44
59	=over 4	45	=over
60		46
61	=item * correct Unicode handling	47	=item * correct Unicode handling
62		48
63	This module knows how to handle Unicode, documents how and when it does	49	This module knows how to handle Unicode, documents how and when it does
64	so, and even documents what "correct" means.	50	so, and even documents what "correct" means.
65		51
66	=item * round-trip integrity	52	=item * round-trip integrity
67		53
68	When you serialise a perl data structure using only data types supported	54	When you serialise a perl data structure using only data types supported
69	by JSON, the deserialised data structure is identical on the Perl level.	55	by JSON and Perl, the deserialised data structure is identical on the Perl
70	(e.g. the string "2.0" doesn't suddenly become "2" just because it looks	56	level. (e.g. the string "2.0" doesn't suddenly become "2" just because
71	like a number). There minor I<are> exceptions to this, read the MAPPING	57	it looks like a number). There I<are> minor exceptions to this, read the
72	section below to learn about those.	58	MAPPING section below to learn about those.
73		59
74	=item * strict checking of JSON correctness	60	=item * strict checking of JSON correctness
75		61
76	There is no guessing, no generating of illegal JSON texts by default,	62	There is no guessing, no generating of illegal JSON texts by default,
77	and only JSON is accepted as input by default (the latter is a security	63	and only JSON is accepted as input by default (the latter is a security
…		…
83	this module usually compares favourably in terms of speed, too.	69	this module usually compares favourably in terms of speed, too.
84		70
85	=item * simple to use	71	=item * simple to use
86		72
87	This module has both a simple functional interface as well as an object	73	This module has both a simple functional interface as well as an object
88	oriented interface interface.	74	oriented interface.
89		75
90	=item * reasonably versatile output formats	76	=item * reasonably versatile output formats
91		77
92	You can choose between the most compact guaranteed-single-line format	78	You can choose between the most compact guaranteed-single-line format
93	possible (nice for simple line-based protocols), a pure-ASCII format	79	possible (nice for simple line-based protocols), a pure-ASCII format
…		…
99		85
100	=cut	86	=cut
101		87
102	package JSON::XS;	88	package JSON::XS;
103		89
104	no warnings;	90	use common::sense;
105	use strict;
106		91
107	our $VERSION = '2.231';	92	our $VERSION = '4.0';
108	our @ISA = qw(Exporter);	93	our @ISA = qw(Exporter);
109		94
110	our @EXPORT = qw(encode_json decode_json to_json from_json);	95	our @EXPORT = qw(encode_json decode_json);
111
112	sub to_json($) {
113	require Carp;
114	Carp::croak ("JSON::XS::to_json has been renamed to encode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
115	}
116
117	sub from_json($) {
118	require Carp;
119	Carp::croak ("JSON::XS::from_json has been renamed to decode_json, either downgrade to pre-2.0 versions of JSON::XS or rename the call");
120	}
121		96
122	use Exporter;	97	use Exporter;
123	use XSLoader;	98	use XSLoader;
124		99
		100	use Types::Serialiser ();
		101
125	=head1 FUNCTIONAL INTERFACE	102	=head1 FUNCTIONAL INTERFACE
126		103
127	The following convenience methods are provided by this module. They are	104	The following convenience methods are provided by this module. They are
128	exported by default:	105	exported by default:
129		106
130	=over 4	107	=over
131		108
132	=item $json_text = encode_json $perl_scalar	109	=item $json_text = encode_json $perl_scalar
133		110
134	Converts the given Perl data structure to a UTF-8 encoded, binary string	111	Converts the given Perl data structure to a UTF-8 encoded, binary string
135	(that is, the string contains octets only). Croaks on error.	112	(that is, the string contains octets only). Croaks on error.
…		…
140		117
141	Except being faster.	118	Except being faster.
142		119
143	=item $perl_scalar = decode_json $json_text	120	=item $perl_scalar = decode_json $json_text
144		121
145	The opposite of C<encode_json>: expects an UTF-8 (binary) string and tries	122	The opposite of C<encode_json>: expects a UTF-8 (binary) string and tries
146	to parse that as an UTF-8 encoded JSON text, returning the resulting	123	to parse that as a UTF-8 encoded JSON text, returning the resulting
147	reference. Croaks on error.	124	reference. Croaks on error.
148		125
149	This function call is functionally identical to:	126	This function call is functionally identical to:
150		127
151	$perl_scalar = JSON::XS->new->utf8->decode ($json_text)	128	$perl_scalar = JSON::XS->new->utf8->decode ($json_text)
152		129
153	Except being faster.	130	Except being faster.
154
155	=item $is_boolean = JSON::XS::is_bool $scalar
156
157	Returns true if the passed scalar represents either JSON::XS::true or
158	JSON::XS::false, two constants that act like C<1> and C<0>, respectively
159	and are used to represent JSON C<true> and C<false> values in Perl.
160
161	See MAPPING, below, for more information on how JSON values are mapped to
162	Perl.
163		131
164	=back	132	=back
165		133
166		134
167	=head1 A FEW NOTES ON UNICODE AND PERL	135	=head1 A FEW NOTES ON UNICODE AND PERL
168		136
169	Since this often leads to confusion, here are a few very clear words on	137	Since this often leads to confusion, here are a few very clear words on
170	how Unicode works in Perl, modulo bugs.	138	how Unicode works in Perl, modulo bugs.
171		139
172	=over 4	140	=over
173		141
174	=item 1. Perl strings can store characters with ordinal values > 255.	142	=item 1. Perl strings can store characters with ordinal values > 255.
175		143
176	This enables you to store Unicode characters as single characters in a	144	This enables you to store Unicode characters as single characters in a
177	Perl string - very natural.	145	Perl string - very natural.
…		…
215	=head1 OBJECT-ORIENTED INTERFACE	183	=head1 OBJECT-ORIENTED INTERFACE
216		184
217	The object oriented interface lets you configure your own encoding or	185	The object oriented interface lets you configure your own encoding or
218	decoding style, within the limits of supported formats.	186	decoding style, within the limits of supported formats.
219		187
220	=over 4	188	=over
221		189
222	=item $json = new JSON::XS	190	=item $json = new JSON::XS
223		191
224	Creates a new JSON::XS object that can be used to de/encode JSON	192	Creates a new JSON::XS object that can be used to de/encode JSON
225	strings. All boolean flags described below are by default I<disabled>.	193	strings. All boolean flags described below are by default I<disabled>
		194	(with the exception of C<allow_nonref>, which defaults to I<enabled> since
		195	version C<4.0>).
226		196
227	The mutators for flags all return the JSON object again and thus calls can	197	The mutators for flags all return the JSON object again and thus calls can
228	be chained:	198	be chained:
229		199
230	my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})	200	my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
…		…
288		258
289	=item $enabled = $json->get_utf8	259	=item $enabled = $json->get_utf8
290		260
291	If C<$enable> is true (or missing), then the C<encode> method will encode	261	If C<$enable> is true (or missing), then the C<encode> method will encode
292	the JSON result into UTF-8, as required by many protocols, while the	262	the JSON result into UTF-8, as required by many protocols, while the
293	C<decode> method expects to be handled an UTF-8-encoded string. Please	263	C<decode> method expects to be handed a UTF-8-encoded string. Please
294	note that UTF-8-encoded strings do not contain any characters outside the	264	note that UTF-8-encoded strings do not contain any characters outside the
295	range C<0..255>, they are thus useful for bytewise/binary I/O. In future	265	range C<0..255>, they are thus useful for bytewise/binary I/O. In future
296	versions, enabling this option might enable autodetection of the UTF-16	266	versions, enabling this option might enable autodetection of the UTF-16
297	and UTF-32 encoding families, as described in RFC4627.	267	and UTF-32 encoding families, as described in RFC4627.
298		268
…		…
383		353
384	=item $enabled = $json->get_relaxed	354	=item $enabled = $json->get_relaxed
385		355
386	If C<$enable> is true (or missing), then C<decode> will accept some	356	If C<$enable> is true (or missing), then C<decode> will accept some
387	extensions to normal JSON syntax (see below). C<encode> will not be	357	extensions to normal JSON syntax (see below). C<encode> will not be
388	affected in anyway. I<Be aware that this option makes you accept invalid	358	affected in any way. I<Be aware that this option makes you accept invalid
389	JSON texts as if they were valid!>. I suggest only to use this option to	359	JSON texts as if they were valid!>. I suggest only to use this option to
390	parse application-specific files written by humans (configuration files,	360	parse application-specific files written by humans (configuration files,
391	resource files etc.)	361	resource files etc.)
392		362
393	If C<$enable> is false (the default), then C<decode> will only accept	363	If C<$enable> is false (the default), then C<decode> will only accept
394	valid JSON texts.	364	valid JSON texts.
395		365
396	Currently accepted extensions are:	366	Currently accepted extensions are:
397		367
398	=over 4	368	=over
399		369
400	=item * list items can have an end-comma	370	=item * list items can have an end-comma
401		371
402	JSON I<separates> array elements and key-value pairs with commas. This	372	JSON I<separates> array elements and key-value pairs with commas. This
403	can be annoying if you write JSON texts manually and want to be able to	373	can be annoying if you write JSON texts manually and want to be able to
…		…
422	[	392	[
423	1, # this comment not allowed in JSON	393	1, # this comment not allowed in JSON
424	# neither this one...	394	# neither this one...
425	]	395	]
426		396
		397	=item * literal ASCII TAB characters in strings
		398
		399	Literal ASCII TAB characters are now allowed in strings (and treated as
		400	C<\t>).
		401
		402	[
		403	"Hello\tWorld",
		404	"Hello<TAB>World", # literal <TAB> would not normally be allowed
		405	]
		406
427	=back	407	=back
428		408
429	=item $json = $json->canonical ([$enable])	409	=item $json = $json->canonical ([$enable])
430		410
431	=item $enabled = $json->get_canonical	411	=item $enabled = $json->get_canonical
…		…
433	If C<$enable> is true (or missing), then the C<encode> method will output JSON objects	413	If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
434	by sorting their keys. This adds a comparatively high overhead.	414	by sorting their keys. This adds a comparatively high overhead.
435		415
436	If C<$enable> is false, then the C<encode> method will output key-value	416	If C<$enable> is false, then the C<encode> method will output key-value
437	pairs in the order Perl stores them (which will likely change between runs	417	pairs in the order Perl stores them (which will likely change between runs
438	of the same script).	418	of the same script, and can change even within the same run from 5.18
		419	onwards).
439		420
440	This option is useful if you want the same data structure to be encoded as	421	This option is useful if you want the same data structure to be encoded as
441	the same JSON text (given the same overall settings). If it is disabled,	422	the same JSON text (given the same overall settings). If it is disabled,
442	the same hash might be encoded differently even if it contains the same data,	423	the same hash might be encoded differently even if it contains the same data,
443	as key-value pairs have no inherent ordering in Perl.	424	as key-value pairs have no inherent ordering in Perl.
444		425
445	This setting has no effect when decoding JSON texts.	426	This setting has no effect when decoding JSON texts.
446		427
		428	This setting currently has no effect on tied hashes.
		429
447	=item $json = $json->allow_nonref ([$enable])	430	=item $json = $json->allow_nonref ([$enable])
448		431
449	=item $enabled = $json->get_allow_nonref	432	=item $enabled = $json->get_allow_nonref
		433
		434	Unlike other boolean options, this option is enabled by default beginning
		435	with version C<4.0>. See L<SECURITY CONSIDERATIONS> for the gory details.
450		436
451	If C<$enable> is true (or missing), then the C<encode> method can convert a	437	If C<$enable> is true (or missing), then the C<encode> method can convert a
452	non-reference into its corresponding string, number or null JSON value,	438	non-reference into its corresponding string, number or null JSON value,
453	which is an extension to RFC4627. Likewise, C<decode> will accept those JSON	439	which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
454	values instead of croaking.	440	values instead of croaking.
…		…
456	If C<$enable> is false, then the C<encode> method will croak if it isn't	442	If C<$enable> is false, then the C<encode> method will croak if it isn't
457	passed an arrayref or hashref, as JSON texts must either be an object	443	passed an arrayref or hashref, as JSON texts must either be an object
458	or array. Likewise, C<decode> will croak if given something that is not a	444	or array. Likewise, C<decode> will croak if given something that is not a
459	JSON object or array.	445	JSON object or array.
460		446
461	Example, encode a Perl scalar as JSON value with enabled C<allow_nonref>,	447	Example, encode a Perl scalar as JSON value without enabled C<allow_nonref>,
462	resulting in an invalid JSON text:	448	resulting in an error:
463		449
464	JSON::XS->new->allow_nonref->encode ("Hello, World!")	450	JSON::XS->new->allow_nonref (0)->encode ("Hello, World!")
465	=> "Hello, World!"	451	=> hash- or arrayref expected...
466		452
467	=item $json = $json->allow_unknown ([$enable])	453	=item $json = $json->allow_unknown ([$enable])
468		454
469	=item $enabled = $json->get_allow_unknown	455	=item $enabled = $json->get_allow_unknown
470		456
…		…
482		468
483	=item $json = $json->allow_blessed ([$enable])	469	=item $json = $json->allow_blessed ([$enable])
484		470
485	=item $enabled = $json->get_allow_blessed	471	=item $enabled = $json->get_allow_blessed
486		472
		473	See L<OBJECT SERIALISATION> for details.
		474
487	If C<$enable> is true (or missing), then the C<encode> method will not	475	If C<$enable> is true (or missing), then the C<encode> method will not
488	barf when it encounters a blessed reference. Instead, the value of the	476	barf when it encounters a blessed reference that it cannot convert
489	B<convert_blessed> option will decide whether C<null> (C<convert_blessed>	477	otherwise. Instead, a JSON C<null> value is encoded instead of the object.
490	disabled or no C<TO_JSON> method found) or a representation of the
491	object (C<convert_blessed> enabled and C<TO_JSON> method found) is being
492	encoded. Has no effect on C<decode>.
493		478
494	If C<$enable> is false (the default), then C<encode> will throw an	479	If C<$enable> is false (the default), then C<encode> will throw an
495	exception when it encounters a blessed object.	480	exception when it encounters a blessed object that it cannot convert
		481	otherwise.
		482
		483	This setting has no effect on C<decode>.
496		484
497	=item $json = $json->convert_blessed ([$enable])	485	=item $json = $json->convert_blessed ([$enable])
498		486
499	=item $enabled = $json->get_convert_blessed	487	=item $enabled = $json->get_convert_blessed
		488
		489	See L<OBJECT SERIALISATION> for details.
500		490
501	If C<$enable> is true (or missing), then C<encode>, upon encountering a	491	If C<$enable> is true (or missing), then C<encode>, upon encountering a
502	blessed object, will check for the availability of the C<TO_JSON> method	492	blessed object, will check for the availability of the C<TO_JSON> method
503	on the object's class. If found, it will be called in scalar context	493	on the object's class. If found, it will be called in scalar context and
504	and the resulting scalar will be encoded instead of the object. If no	494	the resulting scalar will be encoded instead of the object.
505	C<TO_JSON> method is found, the value of C<allow_blessed> will decide what
506	to do.
507		495
508	The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>	496	The C<TO_JSON> method may safely call die if it wants. If C<TO_JSON>
509	returns other blessed objects, those will be handled in the same	497	returns other blessed objects, those will be handled in the same
510	way. C<TO_JSON> must take care of not causing an endless recursion cycle	498	way. C<TO_JSON> must take care of not causing an endless recursion cycle
511	(== crash) in this case. The name of C<TO_JSON> was chosen because other	499	(== crash) in this case. The name of C<TO_JSON> was chosen because other
512	methods called by the Perl core (== not by the user of the object) are	500	methods called by the Perl core (== not by the user of the object) are
513	usually in upper case letters and to avoid collisions with any C<to_json>	501	usually in upper case letters and to avoid collisions with any C<to_json>
514	function or method.	502	function or method.
515		503
516	This setting does not yet influence C<decode> in any way, but in the	504	If C<$enable> is false (the default), then C<encode> will not consider
517	future, global hooks might get installed that influence C<decode> and are	505	this type of conversion.
518	enabled by this setting.
519		506
520	If C<$enable> is false, then the C<allow_blessed> setting will decide what	507	This setting has no effect on C<decode>.
521	to do when a blessed object is found.	508
		509	=item $json = $json->allow_tags ([$enable])
		510
		511	=item $enabled = $json->get_allow_tags
		512
		513	See L<OBJECT SERIALISATION> for details.
		514
		515	If C<$enable> is true (or missing), then C<encode>, upon encountering a
		516	blessed object, will check for the availability of the C<FREEZE> method on
		517	the object's class. If found, it will be used to serialise the object into
		518	a nonstandard tagged JSON value (that JSON decoders cannot decode).
		519
		520	It also causes C<decode> to parse such tagged JSON values and deserialise
		521	them via a call to the C<THAW> method.
		522
		523	If C<$enable> is false (the default), then C<encode> will not consider
		524	this type of conversion, and tagged JSON values will cause a parse error
		525	in C<decode>, as if tags were not part of the grammar.
		526
		527	=item $json->boolean_values ([$false, $true])
		528
		529	=item ($false, $true) = $json->get_boolean_values
		530
		531	By default, JSON booleans will be decoded as overloaded
		532	C<$Types::Serialiser::false> and C<$Types::Serialiser::true> objects.
		533
		534	With this method you can specify your own boolean values for decoding -
		535	on decode, JSON C<false> will be decoded as a copy of C<$false>, and JSON
		536	C<true> will be decoded as C<$true> ("copy" here is the same thing as
		537	assigning a value to another variable, i.e. C<$copy = $false>).
		538
		539	Calling this method without any arguments will reset the booleans
		540	to their default values.
		541
		542	C<get_boolean_values> will return both C<$false> and C<$true> values, or
		543	the empty list when they are set to the default.
522		544
523	=item $json = $json->filter_json_object ([$coderef->($hashref)])	545	=item $json = $json->filter_json_object ([$coderef->($hashref)])
524		546
525	When C<$coderef> is specified, it will be called from C<decode> each	547	When C<$coderef> is specified, it will be called from C<decode> each
526	time it decodes a JSON object. The only argument is a reference to the	548	time it decodes a JSON object. The only argument is a reference to
527	newly-created hash. If the code references returns a single scalar (which	549	the newly-created hash. If the code reference returns a single scalar
528	need not be a reference), this value (i.e. a copy of that scalar to avoid	550	(which need not be a reference), this value (or rather a copy of it) is
529	aliasing) is inserted into the deserialised data structure. If it returns	551	inserted into the deserialised data structure. If it returns an empty
530	an empty list (NOTE: I<not> C<undef>, which is a valid scalar), the	552	list (NOTE: I<not> C<undef>, which is a valid scalar), the original
531	original deserialised hash will be inserted. This setting can slow down	553	deserialised hash will be inserted. This setting can slow down decoding
532	decoding considerably.	554	considerably.
533		555
534	When C<$coderef> is omitted or undefined, any existing callback will	556	When C<$coderef> is omitted or undefined, any existing callback will
535	be removed and C<decode> will not change the deserialised hash in any	557	be removed and C<decode> will not change the deserialised hash in any
536	way.	558	way.
537		559
…		…
665		687
666	See SECURITY CONSIDERATIONS, below, for more info on why this is useful.	688	See SECURITY CONSIDERATIONS, below, for more info on why this is useful.
667		689
668	=item $json_text = $json->encode ($perl_scalar)	690	=item $json_text = $json->encode ($perl_scalar)
669		691
670	Converts the given Perl data structure (a simple scalar or a reference	692	Converts the given Perl value or data structure to its JSON
671	to a hash or array) to its JSON representation. Simple scalars will be	693	representation. Croaks on error.
672	converted into JSON string or number sequences, while references to arrays
673	become JSON arrays and references to hashes become JSON objects. Undefined
674	Perl values (e.g. C<undef>) become JSON C<null> values. Neither C<true>
675	nor C<false> values will be generated.
676		694
677	=item $perl_scalar = $json->decode ($json_text)	695	=item $perl_scalar = $json->decode ($json_text)
678		696
679	The opposite of C<encode>: expects a JSON text and tries to parse it,	697	The opposite of C<encode>: expects a JSON text and tries to parse it,
680	returning the resulting simple scalar or reference. Croaks on error.	698	returning the resulting simple scalar or reference. Croaks on error.
681
682	JSON numbers and strings become simple Perl scalars. JSON arrays become
683	Perl arrayrefs and JSON objects become Perl hashrefs. C<true> becomes
684	C<1>, C<false> becomes C<0> and C<null> becomes C<undef>.
685		699
686	=item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)	700	=item ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
687		701
688	This works like the C<decode> method, but instead of raising an exception	702	This works like the C<decode> method, but instead of raising an exception
689	when there is trailing garbage after the first JSON object, it will	703	when there is trailing garbage after the first JSON object, it will
690	silently stop parsing there and return the number of characters consumed	704	silently stop parsing there and return the number of characters consumed
691	so far.	705	so far.
692		706
693	This is useful if your JSON texts are not delimited by an outer protocol	707	This is useful if your JSON texts are not delimited by an outer protocol
694	(which is not the brightest thing to do in the first place) and you need
695	to know where the JSON text ends.	708	and you need to know where the JSON text ends.
696		709
697	JSON::XS->new->decode_prefix ("[1] the tail")	710	JSON::XS->new->decode_prefix ("[1] the tail")
698	=> ([], 3)	711	=> ([1], 3)
699		712
700	=back	713	=back
701		714
702		715
703	=head1 INCREMENTAL PARSING	716	=head1 INCREMENTAL PARSING
…		…
712	calls).	725	calls).
713		726
714	JSON::XS will only attempt to parse the JSON text once it is sure it	727	JSON::XS will only attempt to parse the JSON text once it is sure it
715	has enough text to get a decisive result, using a very simple but	728	has enough text to get a decisive result, using a very simple but
716	truly incremental parser. This means that it sometimes won't stop as	729	truly incremental parser. This means that it sometimes won't stop as
717	early as the full parser, for example, it doesn't detect parenthese	730	early as the full parser, for example, it doesn't detect mismatched
718	mismatches. The only thing it guarantees is that it starts decoding as	731	parentheses. The only thing it guarantees is that it starts decoding as
719	soon as a syntactically valid JSON text has been seen. This means you need	732	soon as a syntactically valid JSON text has been seen. This means you need
720	to set resource limits (e.g. C<max_size>) to ensure the parser will stop	733	to set resource limits (e.g. C<max_size>) to ensure the parser will stop
721	parsing in the presence of syntax errors.	734	parsing in the presence of syntax errors.
722		735
723	The following methods implement this incremental parser.	736	The following methods implement this incremental parser.
724		737
725	=over 4	738	=over
726		739
727	=item [void, scalar or list context] = $json->incr_parse ([$string])	740	=item [void, scalar or list context] = $json->incr_parse ([$string])
728		741
729	This is the central parsing function. It can both append new text and	742	This is the central parsing function. It can both append new text and
730	extract objects from the stream accumulated so far (both of these	743	extract objects from the stream accumulated so far (both of these
…		…
739		752
740	If the method is called in scalar context, then it will try to extract	753	If the method is called in scalar context, then it will try to extract
741	exactly I<one> JSON object. If that is successful, it will return this	754	exactly I<one> JSON object. If that is successful, it will return this
742	object, otherwise it will return C<undef>. If there is a parse error,	755	object, otherwise it will return C<undef>. If there is a parse error,
743	this method will croak just as C<decode> would do (one can then use	756	this method will croak just as C<decode> would do (one can then use
744	C<incr_skip> to skip the errornous part). This is the most common way of	757	C<incr_skip> to skip the erroneous part). This is the most common way of
745	using the method.	758	using the method.
746		759
747	And finally, in list context, it will try to extract as many objects	760	And finally, in list context, it will try to extract as many objects
748	from the stream as it can find and return them, or the empty list	761	from the stream as it can find and return them, or the empty list
749	otherwise. For this to work, there must be no separators between the JSON	762	otherwise. For this to work, there must be no separators (other than
750	objects or arrays, instead they must be concatenated back-to-back. If	763	whitespace) between the JSON objects or arrays, instead they must be
751	an error occurs, an exception will be raised as in the scalar context	764	concatenated back-to-back. If an error occurs, an exception will be
752	case. Note that in this case, any previously-parsed JSON texts will be	765	raised as in the scalar context case. Note that in this case, any
753	lost.	766	previously-parsed JSON texts will be lost.
		767
		768	Example: Parse some JSON arrays/objects in a given string and return
		769	them.
		770
		771	my @objs = JSON::XS->new->incr_parse ("[5][7][1,2]");
754		772
755	=item $lvalue_string = $json->incr_text	773	=item $lvalue_string = $json->incr_text
756		774
757	This method returns the currently stored JSON fragment as an lvalue, that	775	This method returns the currently stored JSON fragment as an lvalue, that
758	is, you can manipulate it. This I<only> works when a preceding call to	776	is, you can manipulate it. This I<only> works when a preceding call to
…		…
760	all other circumstances you must not call this function (I mean it.	778	all other circumstances you must not call this function (I mean it.
761	although in simple tests it might actually work, it I<will> fail under	779	although in simple tests it might actually work, it I<will> fail under
762	real world conditions). As a special exception, you can also call this	780	real world conditions). As a special exception, you can also call this
763	method before having parsed anything.	781	method before having parsed anything.
764		782
		783	That means you can only use this function to look at or manipulate text
		784	before or after complete JSON objects, not while the parser is in the
		785	middle of parsing a JSON object.
		786
765	This function is useful in two cases: a) finding the trailing text after a	787	This function is useful in two cases: a) finding the trailing text after a
766	JSON object or b) parsing multiple JSON objects separated by non-JSON text	788	JSON object or b) parsing multiple JSON objects separated by non-JSON text
767	(such as commas).	789	(such as commas).
768		790
769	=item $json->incr_skip	791	=item $json->incr_skip
…		…
773	C<incr_parse> died, in which case the input buffer and incremental parser	795	C<incr_parse> died, in which case the input buffer and incremental parser
774	state is left unchanged, to skip the text parsed so far and to reset the	796	state is left unchanged, to skip the text parsed so far and to reset the
775	parse state.	797	parse state.
776		798
777	The difference to C<incr_reset> is that only text until the parse error	799	The difference to C<incr_reset> is that only text until the parse error
778	occured is removed.	800	occurred is removed.
779		801
780	=item $json->incr_reset	802	=item $json->incr_reset
781		803
782	This completely resets the incremental parser, that is, after this call,	804	This completely resets the incremental parser, that is, after this call,
783	it will be as if the parser had never parsed anything.	805	it will be as if the parser had never parsed anything.
…		…
788		810
789	=back	811	=back
790		812
791	=head2 LIMITATIONS	813	=head2 LIMITATIONS
792		814
793	All options that affect decoding are supported, except	815	The incremental parser is a non-exact parser: it works by gathering as
794	C<allow_nonref>. The reason for this is that it cannot be made to	816	much text as possible that I<could> be a valid JSON text, followed by
795	work sensibly: JSON objects and arrays are self-delimited, i.e. you can concatenate	817	trying to decode it.
796	them back to back and still decode them perfectly. This does not hold true
797	for JSON numbers, however.
798		818
799	For example, is the string C<1> a single JSON number, or is it simply the	819	That means it sometimes needs to read more data than strictly necessary to
800	start of C<12>? Or is C<12> a single JSON number, or the concatenation	820	diagnose an invalid JSON text. For example, after parsing the following
801	of C<1> and C<2>? In neither case you can tell, and this is why JSON::XS	821	fragment, the parser I<could> stop with an error, as this fragment
802	takes the conservative route and disallows this case.	822	I<cannot> be the beginning of a valid JSON text:
		823
		824	[,
		825
		826	In reality, however, the parser might continue to read data until a
		827	length limit is exceeded or it finds a closing bracket.
803		828
804	=head2 EXAMPLES	829	=head2 EXAMPLES
805		830
806	Some examples will make all this clearer. First, a simple example that	831	Some examples will make all this clearer. First, a simple example that
807	works similarly to C<decode_prefix>: We want to decode the JSON object at	832	works similarly to C<decode_prefix>: We want to decode the JSON object at
…		…
951	refers to the abstract Perl language itself.	976	refers to the abstract Perl language itself.
952		977
953		978
954	=head2 JSON -> PERL	979	=head2 JSON -> PERL
955		980
956	=over 4	981	=over
957		982
958	=item object	983	=item object
959		984
960	A JSON object becomes a reference to a hash in Perl. No ordering of object	985	A JSON object becomes a reference to a hash in Perl. No ordering of object
961	keys is preserved (JSON does not preserve object key ordering itself).	986	keys is preserved (JSON does not preserve object key ordering itself).
…		…
981	If the number consists of digits only, JSON::XS will try to represent	1006	If the number consists of digits only, JSON::XS will try to represent
982	it as an integer value. If that fails, it will try to represent it as	1007	it as an integer value. If that fails, it will try to represent it as
983	a numeric (floating point) value if that is possible without loss of	1008	a numeric (floating point) value if that is possible without loss of
984	precision. Otherwise it will preserve the number as a string value (in	1009	precision. Otherwise it will preserve the number as a string value (in
985	which case you lose roundtripping ability, as the JSON number will be	1010	which case you lose roundtripping ability, as the JSON number will be
986	re-encoded toa JSON string).	1011	re-encoded to a JSON string).
987		1012
988	Numbers containing a fractional or exponential part will always be	1013	Numbers containing a fractional or exponential part will always be
989	represented as numeric (floating point) values, possibly at a loss of	1014	represented as numeric (floating point) values, possibly at a loss of
990	precision (in which case you might lose perfect roundtripping ability, but	1015	precision (in which case you might lose perfect roundtripping ability, but
991	the JSON number will still be re-encoded as a JSON number).	1016	the JSON number will still be re-encoded as a JSON number).
992		1017
		1018	Note that precision is not accuracy - binary floating point values cannot
		1019	represent most decimal fractions exactly, and when converting from and to
		1020	floating point, JSON::XS only guarantees precision up to but not including
		1021	the least significant bit.
		1022
993	=item true, false	1023	=item true, false
994		1024
995	These JSON atoms become C<JSON::XS::true> and C<JSON::XS::false>,	1025	These JSON atoms become C<Types::Serialiser::true> and
996	respectively. They are overloaded to act almost exactly like the numbers	1026	C<Types::Serialiser::false>, respectively. They are overloaded to act
997	C<1> and C<0>. You can check whether a scalar is a JSON boolean by using	1027	almost exactly like the numbers C<1> and C<0>. You can check whether
998	the C<JSON::XS::is_bool> function.	1028	a scalar is a JSON boolean by using the C<Types::Serialiser::is_bool>
		1029	function (after C<use Types::Serialiser>, of course).
999		1030
1000	=item null	1031	=item null
1001		1032
1002	A JSON null atom becomes C<undef> in Perl.	1033	A JSON null atom becomes C<undef> in Perl.
		1034
		1035	=item shell-style comments (C<< # I<text> >>)
		1036
		1037	As a nonstandard extension to the JSON syntax that is enabled by the
		1038	C<relaxed> setting, shell-style comments are allowed. They can start
		1039	anywhere outside strings and go till the end of the line.
		1040
		1041	=item tagged values (C<< (I<tag>)I<value> >>).
		1042
		1043	Another nonstandard extension to the JSON syntax, enabled with the
		1044	C<allow_tags> setting, are tagged values. In this implementation, the
		1045	I<tag> must be a perl package/class name encoded as a JSON string, and the
		1046	I<value> must be a JSON array encoding optional constructor arguments.
		1047
		1048	See L<OBJECT SERIALISATION>, below, for details.
1003		1049
1004	=back	1050	=back
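
The JSON -> PERL rules for atoms can be checked with a short script. This is only a minimal sketch: it assumes a JSON::XS version that uses the L<Types::Serialiser> boolean model (as described in the new revision), and the key names C<ok>, C<missing> and C<count> are made up for illustration.

```perl
use strict;
use warnings;
use JSON::XS;
use Types::Serialiser;

# hypothetical input, only used to show the atom mapping
my $data = decode_json '{"ok":true,"missing":null,"count":3}';

# JSON true/false decode to Types::Serialiser boolean objects
print "ok is a JSON boolean\n"
   if Types::Serialiser::is_bool ($data->{ok});
print "ok is true\n"
   if $data->{ok};

# JSON null decodes to undef
print "missing is undef\n"
   unless defined $data->{missing};

# plain numbers are not booleans
print "count is not a JSON boolean\n"
   unless Types::Serialiser::is_bool ($data->{count});
```

Using C<is_bool> rather than comparing against C<1> or C<0> is what lets you tell a decoded JSON C<true> apart from the number C<1>.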
1005		1051
1006		1052
1007	=head2 PERL -> JSON	1053	=head2 PERL -> JSON
1008		1054
1009	The mapping from Perl to JSON is slightly more difficult, as Perl is a	1055	The mapping from Perl to JSON is slightly more difficult, as Perl is a
1010	truly typeless language, so we can only guess which JSON type is meant by	1056	truly typeless language, so we can only guess which JSON type is meant by
1011	a Perl value.	1057	a Perl value.
1012		1058
1013	=over 4	1059	=over
1014		1060
1015	=item hash references	1061	=item hash references
1016		1062
1017	Perl hash references become JSON objects. As there is no inherent ordering	1063	Perl hash references become JSON objects. As there is no inherent
1018	in hash keys (or JSON objects), they will usually be encoded in a	1064	ordering in hash keys (or JSON objects), they will usually be encoded
1019	pseudo-random order that can change between runs of the same program but	1065	in a pseudo-random order. JSON::XS can optionally sort the hash keys
1020	stays generally the same within a single run of a program. JSON::XS can	1066	(determined by the I<canonical> flag), so the same datastructure will
1021	optionally sort the hash keys (determined by the I<canonical> flag), so	1067	serialise to the same JSON text (given same settings and version of
1022	the same datastructure will serialise to the same JSON text (given same	1068	JSON::XS), but this incurs a runtime overhead and is only rarely useful,
1023	settings and version of JSON::XS), but this incurs a runtime overhead	1069	e.g. when you want to compare some JSON text against another for equality.
1024	and is only rarely useful, e.g. when you want to compare some JSON text
1025	against another for equality.
1026		1070
1027	=item array references	1071	=item array references
1028		1072
1029	Perl array references become JSON arrays.	1073	Perl array references become JSON arrays.
1030		1074
1031	=item other references	1075	=item other references
1032		1076
1033	Other unblessed references are generally not allowed and will cause an	1077	Other unblessed references are generally not allowed and will cause an
1034	exception to be thrown, except for references to the integers C<0> and	1078	exception to be thrown, except for references to the integers C<0> and
1035	C<1>, which get turned into C<false> and C<true> atoms in JSON. You can	1079	C<1>, which get turned into C<false> and C<true> atoms in JSON.
1036	also use C<JSON::XS::false> and C<JSON::XS::true> to improve readability.
1037		1080
		1081	Since C<JSON::XS> uses the boolean model from L<Types::Serialiser>, you
		1082	can also C<use Types::Serialiser> and then use C<Types::Serialiser::false>
		1083	and C<Types::Serialiser::true> to improve readability.
		1084
		1085	use Types::Serialiser;
1038	encode_json [\0, JSON::XS::true] # yields [false,true]	1086	encode_json [\0, Types::Serialiser::true] # yields [false,true]
1039		1087
1040	=item JSON::XS::true, JSON::XS::false	1088	=item Types::Serialiser::true, Types::Serialiser::false
1041		1089
1042	These special values become JSON true and JSON false values,	1090	These special values from the L<Types::Serialiser> module become JSON true
1043	respectively. You can also use C<\1> and C<\0> directly if you want.	1091	and JSON false values, respectively. You can also use C<\1> and C<\0>
		1092	directly if you want.
1044		1093
1045	=item blessed objects	1094	=item blessed objects
1046		1095
1047	Blessed objects are not directly representable in JSON. See the	1096	Blessed objects are not directly representable in JSON, but C<JSON::XS>
1048	C<allow_blessed> and C<convert_blessed> methods on various options on	1097	allows various ways of handling objects. See L<OBJECT SERIALISATION>,
1049	how to deal with this: basically, you can choose between throwing an	1098	below, for details.
1050	exception, encoding the reference as if it weren't blessed, or provide
1051	your own serialiser method.
1052		1099
1053	=item simple scalars	1100	=item simple scalars
1054		1101
1055	Simple Perl scalars (any scalar that is not a reference) are the most	1102	Simple Perl scalars (any scalar that is not a reference) are the most
1056	difficult objects to encode: JSON::XS will encode undefined scalars as	1103	difficult objects to encode: JSON::XS will encode undefined scalars as
…		…
1084		1131
1085	You can not currently force the type in other, less obscure, ways. Tell me	1132	You can not currently force the type in other, less obscure, ways. Tell me
1086	if you need this capability (but don't forget to explain why it's needed	1133	if you need this capability (but don't forget to explain why it's needed
1087	:).	1134	:).
1088		1135
		1136	Note that numerical precision has the same meaning as under Perl (so
		1137	binary to decimal conversion follows the same rules as in Perl, which
		1138	can differ to other languages). Also, your perl interpreter might expose
		1139	extensions to the floating point numbers of your platform, such as
		1140	infinities or NaN's - these cannot be represented in JSON, and it is an
		1141	error to pass those in.
		1142
1089	=back	1143	=back
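
As a rough illustration of the PERL -> JSON rules above, the following sketch encodes a small structure with C<canonical> enabled so that the key order is stable. The hash keys C<a>, C<b> and C<c> are arbitrary and only serve the example.

```perl
use strict;
use warnings;
use JSON::XS;

# canonical sorts hash keys, so repeated runs yield the same text
my $coder = JSON::XS->new->canonical;

my $json = $coder->encode ({
   b => \1,               # reference to 1 becomes true
   c => \0,               # reference to 0 becomes false
   a => [1, "1", undef],  # number, string, and undef (null)
});

print "$json\n";   # {"a":[1,"1",null],"b":true,"c":false}
```

Without C<canonical>, the same data would still encode to equivalent JSON, but the order of C<a>, C<b> and C<c> could differ between runs.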
		1144
		1145	=head2 OBJECT SERIALISATION
		1146
		1147	As JSON cannot directly represent Perl objects, you have to choose between
		1148	a pure JSON representation (without the ability to deserialise the object
		1149	automatically again), and a nonstandard extension to the JSON syntax,
		1150	tagged values.
		1151
		1152	=head3 SERIALISATION
		1153
		1154	What happens when C<JSON::XS> encounters a Perl object depends on the
		1155	C<allow_blessed>, C<convert_blessed> and C<allow_tags> settings, which are
		1156	used in this order:
		1157
		1158	=over
		1159
		1160	=item 1. C<allow_tags> is enabled and the object has a C<FREEZE> method.
		1161
		1162	In this case, C<JSON::XS> uses the L<Types::Serialiser> object
		1163	serialisation protocol to create a tagged JSON value, using a nonstandard
		1164	extension to the JSON syntax.
		1165
		1166	This works by invoking the C<FREEZE> method on the object, with the first
		1167	argument being the object to serialise, and the second argument being the
		1168	constant string C<JSON> to distinguish it from other serialisers.
		1169
		1170	The C<FREEZE> method can return any number of values (i.e. zero or
		1171	more). These values and the package/classname of the object will then be
		1172	encoded as a tagged JSON value in the following format:
		1173
		1174	("classname")[FREEZE return values...]
		1175
		1176	e.g.:
		1177
		1178	("URI")["http://www.google.com/"]
		1179	("MyDate")[2013,10,29]
		1180	("ImageData::JPEG")["Z3...VlCg=="]
		1181
		1182	For example, the hypothetical C<My::Object> C<FREEZE> method might use the
		1183	object's C<type> and C<id> members to encode the object:
		1184
		1185	sub My::Object::FREEZE {
		1186	my ($self, $serialiser) = @_;
		1187
		1188	($self->{type}, $self->{id})
		1189	}
		1190
		1191	=item 2. C<convert_blessed> is enabled and the object has a C<TO_JSON> method.
		1192
		1193	In this case, the C<TO_JSON> method of the object is invoked in scalar
		1194	context. It must return a single scalar that can be directly encoded into
		1195	JSON. This scalar replaces the object in the JSON text.
		1196
		1197	For example, the following C<TO_JSON> method will convert all L<URI>
		1198	objects to JSON strings when serialised. The fact that these values
		1199	originally were L<URI> objects is lost.
		1200
		1201	sub URI::TO_JSON {
		1202	my ($uri) = @_;
		1203	$uri->as_string
		1204	}
		1205
		1206	=item 3. C<allow_blessed> is enabled.
		1207
		1208	The object will be serialised as a JSON null value.
		1209
		1210	=item 4. none of the above
		1211
		1212	If none of the settings are enabled or the respective methods are missing,
		1213	C<JSON::XS> throws an exception.
		1214
		1215	=back
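
The order of cases 2 to 4 above can be seen with one object and three differently configured coders. This is a sketch under assumptions: C<My::Point> is a hypothetical class invented for the example, and C<canonical> is only enabled to make the printed key order predictable.

```perl
use strict;
use warnings;
use JSON::XS;

package My::Point;

sub new {
   my ($class, %args) = @_;
   bless { %args }, $class
}

# called in scalar context when convert_blessed is enabled
sub TO_JSON {
   my ($self) = @_;
   return { x => $self->{x}, y => $self->{y} };
}

package main;

my $point = My::Point->new (x => 1, y => 2);

# case 4: no setting enabled, so encoding throws an exception
my $ok = eval { JSON::XS->new->encode ([$point]); 1 };
print "default settings: exception\n" unless $ok;

# case 3: allow_blessed alone serialises the object as null
my $as_null = JSON::XS->new->allow_blessed->encode ([$point]);
print "$as_null\n";   # [null]

# case 2: convert_blessed uses the TO_JSON method
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);
print "$converted\n"; # [{"x":1,"y":2}]
```

Note that in both successful cases the class name is gone from the JSON text, so decoding yields plain data, not a C<My::Point> object.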
		1216
		1217	=head3 DESERIALISATION
		1218
		1219	For deserialisation there are only two cases to consider: either
		1220	nonstandard tagging was used, in which case C<allow_tags> decides,
		1221	or objects cannot automatically be deserialised, in which
		1222	case you can use postprocessing or the C<filter_json_object> or
		1223	C<filter_json_single_key_object> callbacks to get some real objects out of
		1224	your JSON.
		1225
		1226	This section only considers the tagged value case: If a tagged JSON object
		1227	is encountered during decoding and C<allow_tags> is disabled, a parse
		1228	error will result (as if tagged values were not part of the grammar).
		1229
		1230	If C<allow_tags> is enabled, C<JSON::XS> will look up the C<THAW> method
		1231	of the package/classname used during serialisation (it will not attempt
		1232	to load the package as a Perl module). If there is no such method, the
		1233	decoding will fail with an error.
		1234
		1235	Otherwise, the C<THAW> method is invoked with the classname as first
		1236	argument, the constant string C<JSON> as second argument, and all the
		1237	values from the JSON array (the values originally returned by the
		1238	C<FREEZE> method) as remaining arguments.
		1239
		1240	The method must then return the object. While technically you can return
		1241	any Perl scalar, you might have to enable the C<allow_nonref> setting to
		1242	make that work in all cases, so better return an actual blessed reference.
		1243
		1244	As an example, let's implement a C<THAW> function that regenerates the
		1245	C<My::Object> from the C<FREEZE> example earlier:
		1246
		1247	sub My::Object::THAW {
		1248	my ($class, $serialiser, $type, $id) = @_;
		1249
		1250	$class->new (type => $type, id => $id)
		1251	}
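
Putting the C<FREEZE> and C<THAW> examples together, a full round trip might look like the following sketch. It assumes a JSON::XS version that supports C<allow_tags>; the C<new> constructor and the values C<"user"> and C<42> are assumptions made for the example, as the text above does not define them.

```perl
use strict;
use warnings;
use JSON::XS;

package My::Object;

# hypothetical constructor, not part of the original examples
sub new {
   my ($class, %args) = @_;
   bless { %args }, $class
}

sub FREEZE {
   my ($self, $serialiser) = @_;

   ($self->{type}, $self->{id})
}

sub THAW {
   my ($class, $serialiser, $type, $id) = @_;

   $class->new (type => $type, id => $id)
}

package main;

# the same coder must have allow_tags enabled for both directions
my $coder = JSON::XS->new->allow_tags;

my $json = $coder->encode ([My::Object->new (type => "user", id => 42)]);
print "$json\n";   # a tagged value inside a JSON array

my $copy = $coder->decode ($json)->[0];
print ref ($copy), ": type=$copy->{type} id=$copy->{id}\n";
```

The resulting text is not standard JSON, so this only makes sense when both ends of the exchange enable C<allow_tags>.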
1090		1252
1091		1253
1092	=head1 ENCODING/CODESET FLAG NOTES	1254	=head1 ENCODING/CODESET FLAG NOTES
1093		1255
1094	The interested reader might have seen a number of flags that signify	1256	The interested reader might have seen a number of flags that signify
…		…
1112	takes those codepoint numbers and I<encodes> them, in our case into	1274	takes those codepoint numbers and I<encodes> them, in our case into
1113	octets. Unicode is (among other things) a codeset, UTF-8 is an encoding,	1275	octets. Unicode is (among other things) a codeset, UTF-8 is an encoding,
1114	and ISO-8859-1 (= latin 1) and ASCII are both codesets I<and> encodings at	1276	and ISO-8859-1 (= latin 1) and ASCII are both codesets I<and> encodings at
1115	the same time, which can be confusing.	1277	the same time, which can be confusing.
1116		1278
1117	=over 4	1279	=over
1118		1280
1119	=item C<utf8> flag disabled	1281	=item C<utf8> flag disabled
1120		1282
1121	When C<utf8> is disabled (the default), then C<encode>/C<decode> generate	1283	When C<utf8> is disabled (the default), then C<encode>/C<decode> generate
1122	and expect Unicode strings, that is, characters with high ordinal Unicode	1284	and expect Unicode strings, that is, characters with high ordinal Unicode
1123	values (> 255) will be encoded as such characters, and likewise such	1285	values (> 255) will be encoded as such characters, and likewise such
1124	characters are decoded as-is, no canges to them will be done, except	1286	characters are decoded as-is, no changes to them will be done, except
1125	"(re-)interpreting" them as Unicode codepoints or Unicode characters,	1287	"(re-)interpreting" them as Unicode codepoints or Unicode characters,
1126	respectively (to Perl, these are the same thing in strings unless you do	1288	respectively (to Perl, these are the same thing in strings unless you do
1127	funny/weird/dumb stuff).	1289	funny/weird/dumb stuff).
1128		1290
1129	This is useful when you want to do the encoding yourself (e.g. when you	1291	This is useful when you want to do the encoding yourself (e.g. when you
…		…
1139	expect your input strings to be encoded as UTF-8, that is, no "character"	1301	expect your input strings to be encoded as UTF-8, that is, no "character"
1140	of the input string must have any value > 255, as UTF-8 does not allow	1302	of the input string must have any value > 255, as UTF-8 does not allow
1141	that.	1303	that.
1142		1304
1143	The C<utf8> flag therefore switches between two modes: disabled means you	1305	The C<utf8> flag therefore switches between two modes: disabled means you
1144	will get a Unicode string in Perl, enabled means you get an UTF-8 encoded	1306	will get a Unicode string in Perl, enabled means you get a UTF-8 encoded
1145	octet/binary string in Perl.	1307	octet/binary string in Perl.
1146		1308
1147	=item C<latin1> or C<ascii> flags enabled	1309	=item C<latin1> or C<ascii> flags enabled
1148		1310
1149	With C<latin1> (or C<ascii>) enabled, C<encode> will escape characters	1311	With C<latin1> (or C<ascii>) enabled, C<encode> will escape characters
…		…
1185	proper subset of most 8-bit and multibyte encodings in use in the world.	1347	proper subset of most 8-bit and multibyte encodings in use in the world.
1186		1348
1187	=back	1349	=back
1188		1350
1189		1351
		1352	=head2 JSON and ECMAscript
		1353
		1354	JSON syntax is based on how literals are represented in javascript (the
		1355	not-standardised predecessor of ECMAscript) which is presumably why it is
		1356	called "JavaScript Object Notation".
		1357
		1358	However, JSON is not a subset (and also not a superset of course) of
		1359	ECMAscript (the standard) or javascript (whatever browsers actually
		1360	implement).
		1361
		1362	If you want to use javascript's C<eval> function to "parse" JSON, you
		1363	might run into parse errors for valid JSON texts, or the resulting data
		1364	structure might not be queryable:
		1365
		1366	One of the problems is that U+2028 and U+2029 are valid characters inside
		1367	JSON strings, but are not allowed in ECMAscript string literals, so the
		1368	following Perl fragment will not output something that can be guaranteed
		1369	to be parsable by javascript's C<eval>:
		1370
		1371	use JSON::XS;
		1372
		1373	print encode_json [chr 0x2028];
		1374
		1375	The right fix for this is to use a proper JSON parser in your javascript
		1376	programs, and not rely on C<eval> (see for example Douglas Crockford's
		1377	F<json2.js> parser).
		1378
		1379	If this is not an option, you can, as a stop-gap measure, simply encode to
		1380	ASCII-only JSON:
		1381
		1382	use JSON::XS;
		1383
		1384	print JSON::XS->new->ascii->encode ([chr 0x2028]);
		1385
		1386	Note that this will enlarge the resulting JSON text quite a bit if you
		1387	have many non-ASCII characters. You might be tempted to run some regexes
		1388	to only escape U+2028 and U+2029, e.g.:
		1389
		1390	# DO NOT USE THIS!
		1391	my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
		1392	$json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
		1393	$json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
		1394	print $json;
		1395
		1396	Note that I<this is a bad idea>: the above only works for U+2028 and
		1397	U+2029 and thus only for fully ECMAscript-compliant parsers. Many existing
		1398	javascript implementations, however, have issues with other characters as
		1399	well - using C<eval> naively simply I<will> cause problems.
		1400
		1401	Another problem is that some javascript implementations reserve
		1402	some property names for their own purposes (which probably makes
		1403	them non-ECMAscript-compliant). For example, Iceweasel reserves the
		1404	C<__proto__> property name for its own purposes.
		1405
		1406	If that is a problem, you could try to filter the resulting JSON
		1407	output for these property strings, e.g.:
		1408
		1409	$json =~ s/"__proto__"\s*:/"__proto__renamed":/g;
		1410
		1411	This works because C<__proto__> is not valid outside of strings, so every
		1412	occurrence of C<"__proto__"\s*:> must be a string used as property name.
		1413
		1414	If you know of other incompatibilities, please let me know.
		1415
		1416
1190	=head2 JSON and YAML	1417	=head2 JSON and YAML
1191		1418
1192	You often hear that JSON is a subset of YAML. This is, however, a mass	1419	You often hear that JSON is a subset of YAML. This is, however, a mass
1193	hysteria(*) and very far from the truth (as of the time of this writing),	1420	hysteria(*) and very far from the truth (as of the time of this writing),
1194	so let me state it clearly: I<in general, there is no way to configure	1421	so let me state it clearly: I<in general, there is no way to configure
…		…
1202	my $yaml = $to_yaml->encode ($ref) . "\n";	1429	my $yaml = $to_yaml->encode ($ref) . "\n";
1203		1430
1204	This will I<usually> generate JSON texts that also parse as valid	1431	This will I<usually> generate JSON texts that also parse as valid
1205	YAML. Please note that YAML has hardcoded limits on (simple) object key	1432	YAML. Please note that YAML has hardcoded limits on (simple) object key
1206	lengths that JSON doesn't have and also has different and incompatible	1433	lengths that JSON doesn't have and also has different and incompatible
1207	unicode handling, so you should make sure that your hash keys are	1434	unicode character escape syntax, so you should make sure that your hash
1208	noticeably shorter than the 1024 "stream characters" YAML allows and that	1435	keys are noticeably shorter than the 1024 "stream characters" YAML allows
1209	you do not have characters with codepoint values outside the Unicode BMP	1436	and that you do not have characters with codepoint values outside the
1210	(basic multilingual page). YAML also does not allow C<\/> sequences in	1437	Unicode BMP (basic multilingual page). YAML also does not allow C<\/>
1211	strings (which JSON::XS does not I<currently> generate, but other JSON	1438	sequences in strings (which JSON::XS does not I<currently> generate, but
1212	generators might).	1439	other JSON generators might).
1213		1440
1214	There might be other incompatibilities that I am not aware of (or the YAML	1441	There might be other incompatibilities that I am not aware of (or the YAML
1215	specification has been changed yet again - it does so quite often). In	1442	specification has been changed yet again - it does so quite often). In
1216	general you should not try to generate YAML with a JSON generator or vice	1443	general you should not try to generate YAML with a JSON generator or vice
1217	versa, or try to parse JSON with a YAML parser or vice versa: chances are	1444	versa, or try to parse JSON with a YAML parser or vice versa: chances are
1218	high that you will run into severe interoperability problems when you	1445	high that you will run into severe interoperability problems when you
1219	least expect it.	1446	least expect it.
1220		1447
1221	=over 4	1448	=over
1222		1449
1223	=item (*)	1450	=item (*)
1224		1451
1225	I have been pressured multiple times by Brian Ingerson (one of the	1452	I have been pressured multiple times by Brian Ingerson (one of the
1226	authors of the YAML specification) to remove this paragraph, despite him	1453	authors of the YAML specification) to remove this paragraph, despite him
…		…
1236	that difficult or long) and finally make YAML compatible to it, and	1463	that difficult or long) and finally make YAML compatible to it, and
1237	educating users about the changes, instead of spreading lies about the	1464	educating users about the changes, instead of spreading lies about the
1238	real compatibility for many I<years> and trying to silence people who	1465	real compatibility for many I<years> and trying to silence people who
1239	point out that it isn't true.	1466	point out that it isn't true.
1240		1467
		1468	Addendum/2009: the YAML 1.2 spec is still incompatible with JSON, even
		1469	though the incompatibilities have been documented (and are known to Brian)
		1470	for many years and the spec makes explicit claims that YAML is a superset
		1471	of JSON. It would be so easy to fix, but apparently, bullying people and
		1472	corrupting userdata is so much easier.
		1473
1241	=back	1474	=back
1242		1475
1243		1476
1244	=head2 SPEED	1477	=head2 SPEED
1245		1478
…		…
1252	a very short single-line JSON string (also available at	1485	a very short single-line JSON string (also available at
1253	L<http://dist.schmorp.de/misc/json/short.json>).	1486	L<http://dist.schmorp.de/misc/json/short.json>).
1254		1487
1255	{"method": "handleMessage", "params": ["user1",	1488	{"method": "handleMessage", "params": ["user1",
1256	"we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,	1489	"we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
1257	true, false]}	1490	1, 0]}
1258		1491
1259	It shows the number of encodes/decodes per second (JSON::XS uses	1492	It shows the number of encodes/decodes per second (JSON::XS uses
1260	the functional interface, while JSON::XS/2 uses the OO interface	1493	the functional interface, while JSON::XS/2 uses the OO interface
1261	with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables	1494	with pretty-printing and hashkey sorting enabled, JSON::XS/3 enables
1262	shrink). Higher is better:	1495	shrink. JSON::DWIW/DS uses the deserialise function, while JSON::DWIW::FJ
		1496	uses the from_json method). Higher is better:
1263		1497
1264	module \| encode \| decode \|	1498	module \| encode \| decode \|
1265	-----------\|------------\|------------\|	1499	--------------\|------------\|------------\|
1266	JSON 1.x \| 4990.842 \| 4088.813 \|	1500	JSON::DWIW/DS \| 86302.551 \| 102300.098 \|
1267	JSON::DWIW \| 51653.990 \| 71575.154 \|	1501	JSON::DWIW/FJ \| 86302.551 \| 75983.768 \|
1268	JSON::PC \| 65948.176 \| 74631.744 \|	1502	JSON::PP \| 15827.562 \| 6638.658 \|
1269	JSON::PP \| 8931.652 \| 3817.168 \|	1503	JSON::Syck \| 63358.066 \| 47662.545 \|
1270	JSON::Syck \| 24877.248 \| 27776.848 \|	1504	JSON::XS \| 511500.488 \| 511500.488 \|
1271	JSON::XS \| 388361.481 \| 227951.304 \|	1505	JSON::XS/2 \| 291271.111 \| 388361.481 \|
1272	JSON::XS/2 \| 227951.304 \| 218453.333 \|	1506	JSON::XS/3 \| 361577.931 \| 361577.931 \|
1273	JSON::XS/3 \| 338250.323 \| 218453.333 \|	1507	Storable \| 66788.280 \| 265462.278 \|
1274	Storable \| 16500.016 \| 135300.129 \|
1275	-----------+------------+------------+	1508	--------------+------------+------------+
1276		1509
1277	That is, JSON::XS is about five times faster than JSON::DWIW on encoding,	1510	That is, JSON::XS is almost six times faster than JSON::DWIW on encoding,
1278	about three times faster on decoding, and over forty times faster	1511	about five times faster on decoding, and over thirty to seventy times
1279	than JSON, even with pretty-printing and key sorting. It also compares	1512	faster than JSON's pure perl implementation. It also compares favourably
1280	favourably to Storable for small amounts of data.	1513	to Storable for small amounts of data.
1281		1514
1282	Using a longer test string (roughly 18KB, generated from Yahoo! Locals	1515	Using a longer test string (roughly 18KB, generated from Yahoo! Locals
1283	search API (L<http://dist.schmorp.de/misc/json/long.json>)).	1516	search API (L<http://dist.schmorp.de/misc/json/long.json>)).
1284		1517
1285	module \| encode \| decode \|	1518	module \| encode \| decode \|
1286	-----------\|------------\|------------\|	1519	--------------\|------------\|------------\|
1287	JSON 1.x \| 55.260 \| 34.971 \|	1520	JSON::DWIW/DS \| 1647.927 \| 2673.916 \|
1288	JSON::DWIW \| 825.228 \| 1082.513 \|	1521	JSON::DWIW/FJ \| 1630.249 \| 2596.128 \|
1289	JSON::PC \| 3571.444 \| 2394.829 \|
1290	JSON::PP \| 210.987 \| 32.574 \|	1522	JSON::PP \| 400.640 \| 62.311 \|
1291	JSON::Syck \| 552.551 \| 787.544 \|	1523	JSON::Syck \| 1481.040 \| 1524.869 \|
1292	JSON::XS \| 5780.463 \| 4854.519 \|	1524	JSON::XS \| 20661.596 \| 9541.183 \|
1293	JSON::XS/2 \| 3869.998 \| 4798.975 \|	1525	JSON::XS/2 \| 10683.403 \| 9416.938 \|
1294	JSON::XS/3 \| 5862.880 \| 4798.975 \|	1526	JSON::XS/3 \| 20661.596 \| 9400.054 \|
1295	Storable \| 4445.002 \| 5235.027 \|	1527	Storable \| 19765.806 \| 10000.725 \|
1296	-----------+------------+------------+	1528	--------------+------------+------------+
1297		1529
1298	Again, JSON::XS leads by far (except for Storable which non-surprisingly	1530	Again, JSON::XS leads by far (except for Storable which non-surprisingly
1299	decodes faster).	1531	decodes a bit faster).
1300		1532
1301	On large strings containing lots of high Unicode characters, some modules	1533	On large strings containing lots of high Unicode characters, some modules
1302	(such as JSON::PC) seem to decode faster than JSON::XS, but the result	1534	(such as JSON::PC) seem to decode faster than JSON::XS, but the result
1303	will be broken due to missing (or wrong) Unicode handling. Others refuse	1535	will be broken due to missing (or wrong) Unicode handling. Others refuse
1304	to decode or encode properly, so it was impossible to prepare a fair	1536	to decode or encode properly, so it was impossible to prepare a fair
…		…
1340	information you might want to make sure that exceptions thrown by JSON::XS	1572	information you might want to make sure that exceptions thrown by JSON::XS
1341	will not end up in front of untrusted eyes.	1573	will not end up in front of untrusted eyes.
1342		1574
1343	If you are using JSON::XS to return packets to consumption	1575	If you are using JSON::XS to return packets to consumption
1344	by JavaScript scripts in a browser you should have a look at	1576	by JavaScript scripts in a browser you should have a look at
1345	L<http://jpsykes.com/47/practical-csrf-and-json-security> to see whether	1577	L<http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/> to
1346	you are vulnerable to some common attack vectors (which really are browser	1578	see whether you are vulnerable to some common attack vectors (which really
1347	design bugs, but it is still you who will have to deal with it, as major	1579	are browser design bugs, but it is still you who will have to deal with
1348	browser developers care only for features, not about getting security	1580	it, as major browser developers care only for features, not about getting
1349	right).	1581	security right).
1350		1582
1351		1583
		1584	=head2 "OLD" VS. "NEW" JSON (RFC4627 VS. RFC7159)
		1585
		1586	JSON originally required JSON texts to represent an array or object -
		1587	scalar values were explicitly not allowed. This has changed, and versions
		1588	of JSON::XS beginning with C<4.0> reflect this by allowing scalar values
		1589	by default.
		1590
		1591	One reason why one might not want this is that this removes a fundamental
		1592	property of JSON texts, namely that they are self-delimited and
		1593	self-contained, or in other words, you could take any number of "old"
		1594	JSON texts and paste them together, and the result would be unambiguously
		1595	parseable:
		1596
		1597	[1,3]{"k":5}[][null] # four JSON texts, without doubt
		1598
		1599	By allowing scalars, this property is lost: in the following example, is
		1600	this one JSON text (the number 12) or two JSON texts (the numbers 1 and
		1601	2):
		1602
		1603	12 # could be 12, or 1 and 2
		1604
		1605	Another lost property of "old" JSON is that no lookahead is required to
		1606	know the end of a JSON text, i.e. the JSON text definitely ended at the
		1607	last C<]> or C<}> character, there was no need to read extra characters.
		1608
		1609	For example, a viable network protocol with "old" JSON was to simply
		1610	exchange JSON texts without delimiter. For "new" JSON, you have to use a
		1611	suitable delimiter (such as a newline) after every JSON text or ensure you
		1612	never encode/decode scalar values.
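The delimiter approach described above can be sketched as follows. This is a minimal illustration, not part of the module's documentation: it assumes a newline as delimiter, and the names C<@messages> and C<$stream> are made up for the example. It relies on the fact that non-pretty output contains no literal newlines, because newlines inside strings are escaped.

```perl
use strict;
use warnings;
use JSON::XS ();

# allow_nonref is enabled explicitly so the sketch does not depend on
# the default of the installed version
my $coder = JSON::XS->new->utf8->allow_nonref (1);

# hypothetical messages, including scalars that "old" JSON would reject
my @messages = (12, [1, 3], { k => 5 }, "two\nlines");

# sender: one JSON text per line
my $stream = join "", map { $coder->encode ($_) . "\n" } @messages;

# receiver: split on the delimiter first, then decode each text on its own
my @received = map { $coder->decode ($_) } split /\n/, $stream;
```

With the delimiter in place, the number 12 can no longer be confused with the numbers 1 and 2, since each line holds exactly one JSON text.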
		1613
		1614	Most protocols do work by only transferring arrays or objects, and the
		1615	easiest way to avoid problems with the "new" JSON definition is to
		1616	explicitly disallow scalar values in your encoder and decoder:
		1617
		1618	$json_coder = JSON::XS->new->allow_nonref (0)
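As a rough sketch of what this setting does in practice (variable names are illustrative only): arrays and objects are still accepted, while scalar values make both C<encode> and C<decode> croak, which is caught with C<eval> here.

```perl
use strict;
use warnings;
use JSON::XS ();

my $coder = JSON::XS->new->utf8->allow_nonref (0);

# arrays and objects work as usual
my $text = $coder->encode ([1, 3]);

# scalars are refused in both directions: the coder croaks,
# so each eval yields a false value
my $encoded_scalar = eval { $coder->encode (12);   1 };
my $decoded_scalar = eval { $coder->decode ("12"); 1 };
```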
		1619
		1620	This is a somewhat unhappy situation, and the blame can fully be put on
		1621	JSON's inventor, Douglas Crockford, who unilaterally changed the format
		1622	in 2006 without consulting the IETF, forcing the IETF to either fork the
		1623	format or go with it (as I was told, the IETF wasn't amused).
		1624
		1625
		1626	=head1 RELATIONSHIP WITH I-JSON
		1627
		1628	JSON is a somewhat sloppily-defined format - it carries around obvious
		1629	JavaScript baggage, such as not really defining number range, probably
		1630	because JavaScript only has one type of numbers: IEEE 64 bit floats
		1631	("binary64").
		1632
		1633	For this reason, RFC7493 defines "Internet JSON", which is a restricted
		1634	subset of JSON that is supposedly more interoperable on the internet.
		1635
		1636	While C<JSON::XS> does not offer specific support for I-JSON, it of course
		1637	accepts valid I-JSON and by default implements some of the limitations
		1638	of I-JSON, such as parsing numbers as perl numbers, which are usually a
		1639	superset of binary64 numbers.
		1640
		1641	To generate I-JSON, follow these rules:
		1642
		1643	=over
		1644
		1645	=item * always generate UTF-8
		1646
		1647	I-JSON must be encoded in UTF-8, the default for C<encode_json>.
		1648
		1649	=item * numbers should be within IEEE 754 binary64 range
		1650
		1651	Basically all existing perl installations use binary64 to represent
		1652	floating point numbers, so all you need to do is to avoid large integers.
		1653
		1654	=item * objects must not have duplicate keys
		1655
		1656	This is trivially done, as C<JSON::XS> does not allow duplicate keys.
		1657
		1658	=item * do not generate scalar JSON texts, use C<< ->allow_nonref (0) >>
		1659
		1660	I-JSON strongly requests you to only encode arrays and objects into JSON.
		1661
		1662	=item * times should be strings in ISO 8601 format
		1663
		1664	There are a myriad of modules on CPAN dealing with ISO 8601 - search for
		1665	C<ISO8601> on CPAN and use one.
		1666
		1667	=item * encode binary data as base64
		1668
		1669	While it's tempting to just dump binary data as a string (and let
		1670	C<JSON::XS> do the escaping), for I-JSON, it's I<recommended> to encode
		1671	binary data as base64.
		1672
		1673	=back
		1674
		1675	There are some other considerations - read RFC7493 for the details if
		1676	interested.
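The rules listed above could be combined roughly as follows. This is only a sketch under assumptions: the record layout and field names are invented, C<POSIX::strftime> is just one of many ways to produce an ISO 8601 string, and C<MIME::Base64> is used for the base64 step. The C<canonical> setting is not an I-JSON requirement; it merely makes the output reproducible.

```perl
use strict;
use warnings;
use JSON::XS ();
use MIME::Base64 qw(encode_base64);
use POSIX qw(strftime);

# UTF-8 output, no scalar JSON texts
my $coder = JSON::XS->new->utf8->canonical->allow_nonref (0);

# hypothetical record, field names are made up for this example
my $record = {
   count   => 42,                                          # small integer, fits binary64
   created => strftime ("%Y-%m-%dT%H:%M:%SZ", gmtime 0),   # ISO 8601 string, UTC
   payload => encode_base64 ("\x00\xff raw bytes", ""),    # "" suppresses line breaks
};

my $json = $coder->encode ($record);
```

Duplicate keys need no extra care here, as a Perl hash cannot contain them in the first place.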
		1677
		1678
		1679	=head1 INTEROPERABILITY WITH OTHER MODULES
		1680
		1681	C<JSON::XS> uses the L<Types::Serialiser> module to provide boolean
		1682	constants. That means that the JSON true and false values will be
		1683	compatible to true and false values of other modules that do the same,
		1684	such as L<JSON::PP> and L<CBOR::XS>.
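A short sketch of what this compatibility means, assuming a JSON::XS version that already uses Types::Serialiser (3.0 or later): booleans decoded by C<JSON::XS> are recognised by the predicates of L<Types::Serialiser>, and the constants from that module encode as JSON booleans.

```perl
use strict;
use warnings;
use JSON::XS ();
use Types::Serialiser ();

my $coder = JSON::XS->new->utf8;

my $decoded = $coder->decode ('[true,false]');

# decoded booleans are recognised by Types::Serialiser's own predicates
my $is_bool  = Types::Serialiser::is_bool  ($decoded->[0]);
my $is_false = Types::Serialiser::is_false ($decoded->[1]);

# and the constants from Types::Serialiser encode as JSON booleans
my $encoded = $coder->encode ([$Types::Serialiser::true, $Types::Serialiser::false]);
```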
		1685
		1686
		1687	=head1 INTEROPERABILITY WITH OTHER JSON DECODERS
		1688
		1689	As long as you only serialise data that can be directly expressed in JSON,
		1690	C<JSON::XS> is incapable of generating invalid JSON output (modulo bugs,
		1691	but C<JSON::XS> has found more bugs in the official JSON testsuite (1)
		1692	than the official JSON testsuite has found in C<JSON::XS> (0)).
		1693
		1694	When you have trouble decoding JSON generated by this module using other
		1695	decoders, then it is very likely that you have an encoding mismatch or the
		1696	other decoder is broken.
		1697
		1698	When decoding, C<JSON::XS> is strict by default and will likely catch all
		1699	errors. There are currently two settings that change this: C<relaxed>
		1700	makes C<JSON::XS> accept (but not generate) some non-standard extensions,
		1701	and C<allow_tags> will allow you to encode and decode Perl objects, at the
		1702	cost of not outputting valid JSON anymore.
		1703
		1704	=head2 TAGGED VALUE SYNTAX AND STANDARD JSON EN/DECODERS
		1705
		1706	When you use C<allow_tags> to use the extended (and also nonstandard and
		1707	invalid) JSON syntax for serialised objects, and you still want to decode
		1708	the generated text with a standard JSON decoder, you can run a regex
		1709	to replace the tagged syntax by standard JSON arrays (it only works for
		1710	"normal" package names without comma, newlines or single colons). First,
		1711	the readable Perl version:
		1712
		1713	# if your FREEZE methods return no values, you need this replace first:
		1714	$json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[\s*\]/[$1]/gx;
		1715
		1716	# this works for non-empty constructor arg lists:
		1717	$json =~ s/\( \s* (" (?: [^\\":,]+|\\.|::)* ") \s* \) \s* \[/[$1,/gx;
		1718
		1719	And here is a less readable version that is easy to adapt to other
		1720	languages:
		1721
		1722	$json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/[$1,/g;
		1723
		1724	Here is an ECMAScript version (same regex):
		1725
		1726	json = json.replace (/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/g, "[$1,");
		1727
		1728	Since this syntax converts to standard JSON arrays, it might be hard to
		1729	distinguish serialised objects from normal arrays. You can prepend a
		1730	"magic number" as first array element to reduce chances of a collision:
		1731
		1732	$json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["XU1peReLzT4ggEllLanBYq4G9VzliwKF",$1,/g;
		1733
		1734	And after decoding the JSON text, you could walk the data
		1735	structure looking for arrays with a first element of
		1736	C<XU1peReLzT4ggEllLanBYq4G9VzliwKF>.
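Walking the decoded structure might look like the sketch below. It is an illustration under assumptions, not the module's own method: the sample text and the class name C<My::Date> are invented, the helper C<find_tagged> is hypothetical, and the core module JSON::PP merely stands in for "some standard decoder".

```perl
use strict;
use warnings;
use JSON::PP ();   # core module, standing in for another decoder

my $MAGIC = "XU1peReLzT4ggEllLanBYq4G9VzliwKF";

# hypothetical sample of the tagged syntax, as produced with allow_tags
my $json = '{"when":("My::Date")["2018-11-15"],"list":[1,2]}';

# rewrite ("class")[args...] into ["magic","class",args...]
$json =~ s/\(\s*("([^\\":,]+|\\.|::)*")\s*\)\s*\[/["$MAGIC",$1,/g;

my $data = JSON::PP->new->decode ($json);

# recursively collect [class, args...] for every array that starts
# with the magic string
sub find_tagged {
   my ($node, $found) = @_;

   if (ref $node eq "ARRAY") {
      if (@$node >= 2 && defined $node->[0] && !ref $node->[0] && $node->[0] eq $MAGIC) {
         my (undef, $class, @args) = @$node;
         push @$found, [$class, @args];
      }
      find_tagged ($_, $found) for @$node;
   } elsif (ref $node eq "HASH") {
      find_tagged ($_, $found) for values %$node;
   }

   $found
}

my $tagged = find_tagged ($data, []);
```

What to do with each collected class name and argument list (for example, calling a constructor) is left to the application; blindly instantiating classes named in untrusted input would be unwise.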
		1737
		1738	The same approach can be used to create the tagged format with another
		1739	encoder. First, you create an array with the magic string as first member,
		1740	the classname as second, and constructor arguments last, encode it as part
		1741	of your JSON structure, and then:
		1742
		1743	$json =~ s/\[\s*"XU1peReLzT4ggEllLanBYq4G9VzliwKF"\s*,\s*("([^\\":,]+|\\.|::)*")\s*,/($1)[/g;
		1744
		1745	Again, this has some limitations - the magic string must not be encoded
		1746	with character escapes, and the constructor arguments must be non-empty.
		1747
		1748
1352	=head1 THREADS	1749	=head1 (I-)THREADS
1353		1750
1354	This module is I<not> guaranteed to be thread safe and there are no	1751	This module is I<not> guaranteed to be ithread (or MULTIPLICITY-) safe
1355	plans to change this until Perl gets thread support (as opposed to the	1752	and there are no plans to change this. Note that perl's builtin so-called
1356	horribly slow so-called "threads" which are simply slow and bloated	1753	threads/ithreads are officially deprecated and should not be used.
1357	process simulations - use fork, it's I<much> faster, cheaper, better).
1358		1754
1359	(It might actually work, but you have been warned).	1755
		1756	=head1 THE PERILS OF SETLOCALE
		1757
		1758	Sometimes people avoid the Perl locale support and directly call the
		1759	system's setlocale function with C<LC_ALL>.
		1760
		1761	This breaks both perl and modules such as JSON::XS, as stringification of
		1762	numbers no longer works correctly (e.g. C<$x = 0.1; print "$x"+1> might
		1763	print C<1>, and JSON::XS might output illegal JSON as JSON::XS relies on
		1764	perl to stringify numbers).
		1765
		1766	The solution is simple: don't call C<setlocale>, or use it for only those
		1767	categories you need, such as C<LC_MESSAGES> or C<LC_CTYPE>.
		1768
		1769	If you need C<LC_NUMERIC>, you should enable it only around the code that
		1770	actually needs it (avoiding stringification of numbers), and restore it
		1771	afterwards.
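One possible way to scope C<LC_NUMERIC> as described is sketched here, assuming the locale is changed through C<POSIX::setlocale>. The helper name C<with_numeric_locale> and the locale C<de_DE.UTF-8> are hypothetical, and the locale may not be installed on a given system.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# run $code with LC_NUMERIC switched to $locale, then restore the old setting
sub with_numeric_locale {
   my ($locale, $code) = @_;

   my $saved = setlocale (LC_NUMERIC);   # one-argument form only queries
   setlocale (LC_NUMERIC, $locale);      # returns undef if the locale is unknown

   my @result = eval { $code->() };
   my $error  = $@;

   # always restore before returning or rethrowing
   setlocale (LC_NUMERIC, $saved) if defined $saved;
   die $error if $error;

   @result
}

# locale-dependent work goes into the callback; number stringification
# and JSON encoding stay outside of it
my ($formatted) = with_numeric_locale ("de_DE.UTF-8", sub {
   sprintf "%d items", 3
});
```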
		1772
		1773
		1774	=head1 SOME HISTORY
		1775
		1776	At the time this module was created there already were a number of JSON
		1777	modules available on CPAN, so what was the reason to write yet another
		1778	JSON module? While it seems there are many JSON modules, none of them
		1779	correctly handled all corner cases, and in most cases their maintainers
		1780	were unresponsive, gone missing, or not listening to bug reports for other
		1781	reasons.
		1782
		1783	Beginning with version 2.0 of the JSON module, when both JSON and
		1784	JSON::XS are installed, then JSON will fall back on JSON::XS (this can be
		1785	overridden) with no overhead due to emulation (by inheriting constructor
		1786	and methods). If JSON::XS is not available, it will fall back to the
		1787	compatible JSON::PP module as backend, so using JSON instead of JSON::XS
		1788	gives you a portable JSON API that can be fast when you need it and
		1789	doesn't require a C compiler when that is a problem.
		1790
		1791	Somewhere around version 3, this module was forked into
		1792	C<Cpanel::JSON::XS>, because its maintainer had serious trouble
		1793	understanding JSON and insisted on a fork with many bugs "fixed" that
		1794	weren't actually bugs, while spreading FUD about this module without
		1795	actually giving any details on his accusations. You be the judge, but
		1796	in my personal opinion, if you want quality, you will stay away from
		1797	dangerous forks like that.
1360		1798
1361		1799
1362	=head1 BUGS	1800	=head1 BUGS
1363		1801
1364	While the goal of this module is to be correct, that unfortunately does	1802	While the goal of this module is to be correct, that unfortunately does
…		…
1368	Please refrain from using rt.cpan.org or any other bug reporting	1806	Please refrain from using rt.cpan.org or any other bug reporting
1369	service. I put the contact address into my modules for a reason.	1807	service. I put the contact address into my modules for a reason.
1370		1808
1371	=cut	1809	=cut
1372		1810
1373	our $true = do { bless \(my $dummy = 1), "JSON::XS::Boolean" };	1811	BEGIN {
1374	our $false = do { bless \(my $dummy = 0), "JSON::XS::Boolean" };	1812	*true = \$Types::Serialiser::true;
		1813	*true = \&Types::Serialiser::true;
		1814	*false = \$Types::Serialiser::false;
		1815	*false = \&Types::Serialiser::false;
		1816	*is_bool = \&Types::Serialiser::is_bool;
1375		1817
1376	sub true() { $true }	1818	*JSON::XS::Boolean:: = *Types::Serialiser::Boolean::;
1377	sub false() { $false }
1378
1379	sub is_bool($) {
1380	UNIVERSAL::isa $_[0], "JSON::XS::Boolean"
1381	# or UNIVERSAL::isa $_[0], "JSON::Literal"
1382	}	1819	}
1383		1820
1384	XSLoader::load "JSON::XS", $VERSION;	1821	XSLoader::load "JSON::XS", $VERSION;
1385
1386	package JSON::XS::Boolean;
1387
1388	use overload
1389	"0+" => sub { ${$_[0]} },
1390	"++" => sub { $_[0] = ${$_[0]} + 1 },
1391	"--" => sub { $_[0] = ${$_[0]} - 1 },
1392	fallback => 1;
1393
1394	1;
1395		1822
1396	=head1 SEE ALSO	1823	=head1 SEE ALSO
1397		1824
1398	The F<json_xs> command line utility for quick experiments.	1825	The F<json_xs> command line utility for quick experiments.
1399		1826
…		…
1402	Marc Lehmann <schmorp@schmorp.de>	1829	Marc Lehmann <schmorp@schmorp.de>
1403	http://home.schmorp.de/	1830	http://home.schmorp.de/
1404		1831
1405	=cut	1832	=cut
1406		1833
		1834	1
		1835
