[ViewVC] Diff of: cvs/CBOR-XS/XS.pm

Comparing CBOR-XS/XS.pm (file contents):
Revision 1.68 by root, Wed Jul 17 09:37:16 2019 UTC vs.
Revision 1.87 by root, Mon Dec 19 20:31:33 2022 UTC

…		…
64		64
65	package CBOR::XS;	65	package CBOR::XS;
66		66
67	use common::sense;	67	use common::sense;
68		68
69	our $VERSION = 1.71;	69	our $VERSION = 1.86;
70	our @ISA = qw(Exporter);	70	our @ISA = qw(Exporter);
71		71
72	our @EXPORT = qw(encode_cbor decode_cbor);	72	our @EXPORT = qw(encode_cbor decode_cbor);
73		73
74	use Exporter;	74	use Exporter;
…		…
121	but configures the coder object to be safe to use with untrusted	121	but configures the coder object to be safe to use with untrusted
122	data. Currently, this is equivalent to:	122	data. Currently, this is equivalent to:
123		123
124	my $cbor = CBOR::XS	124	my $cbor = CBOR::XS
125	->new	125	->new
		126	->validate_utf8
126	->forbid_objects	127	->forbid_objects
127	->filter (\&CBOR::XS::safe_filter)	128	->filter (\&CBOR::XS::safe_filter)
128	->max_size (1e8);	129	->max_size (1e8);
129		130
130	But is more future proof (it is better to crash because of a change than	131	But is more future proof (it is better to crash because of a change than
…		…
133	=cut	134	=cut
134		135
135	sub new_safe {	136	sub new_safe {
136	CBOR::XS	137	CBOR::XS
137	->new	138	->new
		139	->validate_utf8
138	->forbid_objects	140	->forbid_objects
139	->filter (\&CBOR::XS::safe_filter)	141	->filter (\&CBOR::XS::safe_filter)
140	->max_size (1e8)	142	->max_size (1e8)
141	}	143	}
142		144
…		…
215	(L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the	217	(L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the
216	resulting data structure might be unusable.	218	resulting data structure might be unusable.
217		219
218	Detecting shared values incurs a runtime overhead when values are encoded	220	Detecting shared values incurs a runtime overhead when values are encoded
219	that have a reference counter larger than one, and might unnecessarily	221	that have a reference counter larger than one, and might unnecessarily
220	increase the encoded size, as potentially shared values are encode as	222	increase the encoded size, as potentially shared values are encoded as
221	shareable whether or not they are actually shared.	223	shareable whether or not they are actually shared.
222		224
223	At the moment, only targets of references can be shared (e.g. scalars,	225	At the moment, only targets of references can be shared (e.g. scalars,
224	arrays or hashes pointed to by a reference). Weirder constructs, such as	226	arrays or hashes pointed to by a reference). Weirder constructs, such as
225	an array with multiple "copies" of the I<same> string, which are hard but	227	an array with multiple "copies" of the I<same> string, which are hard but
…		…
330	strings as CBOR byte strings.	332	strings as CBOR byte strings.
331		333
332	This option does not affect C<decode> in any way.	334	This option does not affect C<decode> in any way.
333		335
334	This option has similar advantages and disadvantages as C<text_keys>. In	336	This option has similar advantages and disadvantages as C<text_keys>. In
335	addition, this option effectively removes the ability to encode byte	337	addition, this option effectively removes the ability to automatically
336	strings, which might break some C<FREEZE> and C<TO_CBOR> methods that rely	338	encode byte strings, which might break some C<FREEZE> and C<TO_CBOR>
337	on this, such as bignum encoding, so this option is mainly useful for very	339	methods that rely on this.
338	simple data.	340
		341	A workaround is to use explicit type casts, which are unaffected by this option.
339		342
340	=item $cbor = $cbor->validate_utf8 ([$enable])	343	=item $cbor = $cbor->validate_utf8 ([$enable])
341		344
342	=item $enabled = $cbor->get_validate_utf8	345	=item $enabled = $cbor->get_validate_utf8
343		346
…		…
453	when there is trailing garbage after the CBOR string, it will silently	456	when there is trailing garbage after the CBOR string, it will silently
454	stop parsing there and return the number of characters consumed so far.	457	stop parsing there and return the number of characters consumed so far.
455		458
456	This is useful if your CBOR texts are not delimited by an outer protocol	459	This is useful if your CBOR texts are not delimited by an outer protocol
457	and you need to know where the first CBOR string ends and the next one	460	and you need to know where the first CBOR string ends and the next one
458	starts.	461	starts - CBOR strings are self-delimited, so it is possible to concatenate
		462	CBOR strings without any delimiters or size fields and recover their data.
459		463
460	CBOR::XS->new->decode_prefix ("......")	464	CBOR::XS->new->decode_prefix ("......")
461	=> ("...", 3)	465	=> ("...", 3)
462		466
463	=back	467	=back
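
The self-delimiting property described in the added text can be illustrated with a short sketch. It assumes a CBOR::XS installation and uses only C<decode_prefix> and C<encode_cbor> as documented above; the variable names are made up for illustration, and the loop is one possible way to split a concatenated buffer, not the module's own code.

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new;

# Three independently encoded values, concatenated without any
# delimiters or size fields between them.
my $buf = encode_cbor ([1, 2, 3])
        . encode_cbor ({ a => 1 })
        . encode_cbor ("hello");

my @values;

while (length $buf) {
   # decode_prefix returns the decoded value and the number of
   # octets it consumed from the start of the buffer.
   my ($value, $octets) = $coder->decode_prefix ($buf);

   push @values, $value;

   # Drop the consumed prefix so the next value is at offset 0.
   substr $buf, 0, $octets, "";
}

printf "recovered %d values\n", scalar @values;
```

With untrusted input, a coder created via C<new_safe> would be the more cautious starting point for such a loop.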
…		…
469	Perl data structure in memory at one time, it does allow you to parse a	473	Perl data structure in memory at one time, it does allow you to parse a
470	CBOR stream incrementally, in a way similar to using "decode_prefix" to see	474	CBOR stream incrementally, in a way similar to using "decode_prefix" to see
471	if a full CBOR object is available, but is much more efficient.	475	if a full CBOR object is available, but is much more efficient.
472		476
473	It basically works by parsing as much of a CBOR string as possible - if	477	It basically works by parsing as much of a CBOR string as possible - if
474	the CBOR data is not complete yet, the pasrer will remember where it was,	478	the CBOR data is not complete yet, the parser will remember where it was,
475	to be able to restart when more data has been accumulated. Once enough	479	to be able to restart when more data has been accumulated. Once enough
476	data is available to either decode a complete CBOR value or raise an	480	data is available to either decode a complete CBOR value or raise an
477	error, a real decode will be attempted.	481	error, a real decode will be attempted.
478		482
479	A typical use case would be a network protocol that consists of sending	483	A typical use case would be a network protocol that consists of sending
…		…
631	create such objects.	635	create such objects.
632		636
633	=item Types::Serialiser::true, Types::Serialiser::false, Types::Serialiser::error	637	=item Types::Serialiser::true, Types::Serialiser::false, Types::Serialiser::error
634		638
635	These special values become CBOR true, CBOR false and CBOR undefined	639	These special values become CBOR true, CBOR false and CBOR undefined
636	values, respectively. You can also use C<\1>, C<\0> and C<\undef> directly	640	values, respectively.
637	if you want.
638		641
639	=item other blessed objects	642	=item other blessed objects
640		643
641	Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See	644	Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See
642	L<TAG HANDLING AND EXTENSIONS> for specific classes handled by this	645	L<TAG HANDLING AND EXTENSIONS> for specific classes handled by this
…		…
667	"$x"; # stringified	670	"$x"; # stringified
668	$x .= ""; # another, more awkward way to stringify	671	$x .= ""; # another, more awkward way to stringify
669	print $x; # perl does it for you, too, quite often	672	print $x; # perl does it for you, too, quite often
670		673
671	You can force whether a string is encoded as byte or text string by using	674	You can force whether a string is encoded as byte or text string by using
672	C<utf8::upgrade> and C<utf8::downgrade> (if C<text_strings> is disabled):	675	C<utf8::upgrade> and C<utf8::downgrade> (if C<text_strings> is disabled).
673		676
674	utf8::upgrade $x; # encode $x as text string	677	utf8::upgrade $x; # encode $x as text string
675	utf8::downgrade $x; # encode $x as byte string	678	utf8::downgrade $x; # encode $x as byte string
		679
		680	More options are available, see L<TYPE CASTS>, below, and the C<text_keys>
		681	and C<text_strings> options.
676		682
677	Perl doesn't define what operations up- and downgrade strings, so if the	683	Perl doesn't define what operations up- and downgrade strings, so if the
678	difference between byte and text is important, you should up- or downgrade	684	difference between byte and text is important, you should up- or downgrade
679	your string as late as possible before encoding. You can also force the	685	your string as late as possible before encoding. You can also force the
680	use of CBOR text strings by using C<text_keys> or C<text_strings>.	686	use of CBOR text strings by using C<text_keys> or C<text_strings>.
…		…
695	format will be used. Perls that use formats other than IEEE double to	701	format will be used. Perls that use formats other than IEEE double to
696	represent numerical values are supported, but might suffer loss of	702	represent numerical values are supported, but might suffer loss of
697	precision.	703	precision.
698		704
699	=back	705	=back
		706
		707	=head2 TYPE CASTS
		708
		709	B<EXPERIMENTAL>: As an experimental extension, C<CBOR::XS> allows you to
		710	force specific CBOR types to be used when encoding. That allows you to
		711	encode types not normally accessible (e.g. half floats) as well as force
		712	string types even when C<text_strings> is in effect.
		713
		714	Type forcing is done by calling a special "cast" function which keeps a
		715	copy of the value and returns a new value that can be handed over to any
		716	CBOR encoder function.
		717
		718	The following casts are currently available (all of which are unary
		719	operators, that is, have a prototype of C<$>):
		720
		721	=over
		722
		723	=item CBOR::XS::as_int $value
		724
		725	Forces the value to be encoded as some form of (basic, not bignum) integer
		726	type.
		727
		728	=item CBOR::XS::as_text $value
		729
		730	Forces the value to be encoded as a (UTF-8) text value.
		731
		732	=item CBOR::XS::as_bytes $value
		733
		734	Forces the value to be encoded as a (binary) string value.
		735
		736	Example: encode a perl string as binary even though C<text_strings> is in
		737	effect.
		738
		739	CBOR::XS->new->text_strings->encode ([4, "text", CBOR::XS::as_bytes "bytevalue"]);
		740
		741	=item CBOR::XS::as_bool $value
		742
		743	Converts a Perl boolean (which can be any kind of scalar) into a CBOR
		744	boolean. Strictly the same, but shorter to write, than:
		745
		746	$value ? Types::Serialiser::true : Types::Serialiser::false
		747
		748	=item CBOR::XS::as_float16 $value
		749
		750	Forces half-float (IEEE 754 binary16) encoding of the given value.
		751
		752	=item CBOR::XS::as_float32 $value
		753
		754	Forces single-float (IEEE 754 binary32) encoding of the given value.
		755
		756	=item CBOR::XS::as_float64 $value
		757
		758	Forces double-float (IEEE 754 binary64) encoding of the given value.
		759
		760	=item CBOR::XS::as_cbor $cbor_text
		761
		762	Not a type cast per se, this function forces the argument to be encoded
		763	as-is. This can be used to embed pre-encoded CBOR data.
		764
		765	Note that no checking on the validity of the C<$cbor_text> is done - it's
		766	the caller's responsibility to correctly encode values.
		767
		768	=item CBOR::XS::as_map [key => value...]
		769
		770	Treat the array reference as key value pairs and output a CBOR map. This
		771	allows you to generate CBOR maps with arbitrary key types (or, if you
		772	don't care about semantics, duplicate keys or pairs in a custom order),
		773	which is otherwise hard to do with Perl.
		774
		775	The single argument must be an array reference with an even number of
		776	elements.
		777
		778	Note that only the reference to the array is copied, the array itself is
		779	not. Modifications done to the array before calling an encoding function
		780	will be reflected in the encoded output.
		781
		782	Example: encode a CBOR map with a string and an integer as keys.
		783
		784	encode_cbor CBOR::XS::as_map [string => "value", 5 => "value"]
		785
		786	=back
		787
		788	=cut
		789
		790	sub CBOR::XS::as_cbor ($) { bless [$_[0], 0, undef], CBOR::XS::Tagged:: }
		791	sub CBOR::XS::as_int ($) { bless [$_[0], 1, undef], CBOR::XS::Tagged:: }
		792	sub CBOR::XS::as_bytes ($) { bless [$_[0], 2, undef], CBOR::XS::Tagged:: }
		793	sub CBOR::XS::as_text ($) { bless [$_[0], 3, undef], CBOR::XS::Tagged:: }
		794	sub CBOR::XS::as_float16 ($) { bless [$_[0], 4, undef], CBOR::XS::Tagged:: }
		795	sub CBOR::XS::as_float32 ($) { bless [$_[0], 5, undef], CBOR::XS::Tagged:: }
		796	sub CBOR::XS::as_float64 ($) { bless [$_[0], 6, undef], CBOR::XS::Tagged:: }
		797
		798	sub CBOR::XS::as_bool ($) { $_[0] ? $Types::Serialiser::true : $Types::Serialiser::false }
		799
		800	sub CBOR::XS::as_map ($) {
		801	ARRAY:: eq ref $_[0]
		802	and $#{ $_[0] } & 1
		803	or do { require Carp; Carp::croak ("CBOR::XS::as_map only accepts array references with an even number of elements, caught") };
		804
		805	bless [$_[0], 7, undef], CBOR::XS::Tagged::
		806	}
700		807
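
The argument check in C<as_map> above relies on a compact idiom: C<$#{ $_[0] }> is the index of the last element, so its lowest bit is set exactly when the array holds an even number of elements (including none). The following pure-Perl sketch spells that out; C<has_even_elements> is a hypothetical helper written for illustration and is not part of CBOR::XS.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the validation done in as_map:
# returns 1 for an array reference with an even number of elements,
# 0 for anything else.
sub has_even_elements {
   my ($aref) = @_;

   # $#$aref is the last index: -1 for an empty array, 1 for two
   # elements, 3 for four, and so on. Odd last index <=> even count.
   # The && short-circuits, so non-array arguments are never
   # dereferenced as arrays.
   return ref $aref eq 'ARRAY' && ($#$aref & 1) ? 1 : 0;
}

for my $case ([], [1], [string => "value", 5 => "value"], {}, "text") {
   my $label = ref $case ? ref $case : "plain scalar";
   printf "%-12s => %d\n", $label, has_even_elements ($case);
}
```

An empty array passes the check because -1 has its lowest bit set, which matches the behaviour of the code in the diff.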
701	=head2 OBJECT SERIALISATION	808	=head2 OBJECT SERIALISATION
702		809
703	This module implements both a CBOR-specific and the generic	810	This module implements both a CBOR-specific and the generic
704	L<Types::Serialiser> object serialisation protocol. The following	811	L<Types::Serialiser> object serialisation protocol. The following
…		…
1057		1164
1058		1165
1059	=head1 SECURITY CONSIDERATIONS	1166	=head1 SECURITY CONSIDERATIONS
1060		1167
1061	Tl;dr... if you want to decode or encode CBOR from untrusted sources, you	1168	Tl;dr... if you want to decode or encode CBOR from untrusted sources, you
1062	should start with a coder object created via C<new_safe>:	1169	should start with a coder object created via C<new_safe> (which implements
		1170	the mitigations explained below):
1063		1171
1064	my $coder = CBOR::XS->new_safe;	1172	my $coder = CBOR::XS->new_safe;
1065		1173
1066	my $data = $coder->decode ($cbor_text);	1174	my $data = $coder->decode ($cbor_text);
1067	my $cbor = $coder->encode ($data);	1175	my $cbor = $coder->encode ($data);
…		…
1089	even if all your C<THAW> methods are secure, encoding data structures from	1197	even if all your C<THAW> methods are secure, encoding data structures from
1090	untrusted sources can invoke those and trigger bugs in those.	1198	untrusted sources can invoke those and trigger bugs in those.
1091		1199
1092	So, if you are not sure about the security of all the modules you	1200	So, if you are not sure about the security of all the modules you
1093	have loaded (you shouldn't), you should disable this part using	1201	have loaded (you shouldn't), you should disable this part using
1094	C<forbid_objects>.	1202	C<forbid_objects> or using C<new_safe>.
1095		1203
1096	=item CBOR can be extended with tags that call library code	1204	=item CBOR can be extended with tags that call library code
1097		1205
1098	CBOR can be extended with tags, and C<CBOR::XS> has a registry of	1206	CBOR can be extended with tags, and C<CBOR::XS> has a registry of
1099	conversion functions for many existing tags that can be extended via	1207	conversion functions for many existing tags that can be extended via
1100	third-party modules (see the C<filter> method).	1208	third-party modules (see the C<filter> method).
1101		1209
1102	If you don't trust these, you should configure the "safe" filter function,	1210	If you don't trust these, you should configure the "safe" filter function,
1103	C<CBOR::XS::safe_filter>, which by default only includes conversion	1211	C<CBOR::XS::safe_filter> (C<new_safe> does this), which by default only
1104	functions that are considered "safe" by the author (but again, they can be	1212	includes conversion functions that are considered "safe" by the author
1105	extended by third party modules).	1213	(but again, they can be extended by third party modules).
1106		1214
1107	Depending on your level of paranoia, you can use the "safe" filter:	1215	Depending on your level of paranoia, you can use the "safe" filter:
1108		1216
1109	$cbor->filter (\&CBOR::XS::safe_filter);	1217	$cbor->filter (\&CBOR::XS::safe_filter);
1110		1218
…		…
1125	the size of CBOR data you accept, or make sure that when your resources	1233	the size of CBOR data you accept, or make sure that when your resources
1126	run out, that's just fine (e.g. by using a separate process that can	1234	run out, that's just fine (e.g. by using a separate process that can
1127	crash safely). The size of a CBOR string in octets is usually a good	1235	crash safely). The size of a CBOR string in octets is usually a good
1128	indication of the size of the resources required to decode it into a Perl	1236	indication of the size of the resources required to decode it into a Perl
1129	structure. While CBOR::XS can check the size of the CBOR text (using	1237	structure. While CBOR::XS can check the size of the CBOR text (using
1130	C<max_size>), it might be too late when you already have it in memory, so	1238	C<max_size> - done by C<new_safe>), it might be too late when you already
1131	you might want to check the size before you accept the string.	1239	have it in memory, so you might want to check the size before you accept
		1240	the string.
1132		1241
1133	As for encoding, it is possible to construct data structures that are	1242	As for encoding, it is possible to construct data structures that are
1134	relatively small but result in large CBOR texts (for example by having an	1243	relatively small but result in large CBOR texts (for example by having an
1135	array full of references to the same big data structure, which will all be	1244	array full of references to the same big data structure, which will all be
1136	deep-cloned during encoding by default). This is rarely an actual issue	1245	deep-cloned during encoding by default). This is rarely an actual issue
…		…
1149	method.	1258	method.
1150		1259
1151	=item Resource-starving attacks: CPU en-/decoding complexity	1260	=item Resource-starving attacks: CPU en-/decoding complexity
1152		1261
1153	CBOR::XS will use the L<Math::BigInt>, L<Math::BigFloat> and	1262	CBOR::XS will use the L<Math::BigInt>, L<Math::BigFloat> and
1154	L<Math::BigRat> libraries to represent encode/decode bignums. These can	1263	L<Math::BigRat> libraries to represent encode/decode bignums. These can be
1155	be very slow (as in, centuries of CPU time) and can even crash your	1264	very slow (as in, centuries of CPU time) and can even crash your program
1156	program (and are generally not very trustworthy). See the next section for	1265	(and are generally not very trustworthy). See the next section on bignum
1157	details.	1266	security for details.
1158		1267
1159	=item Data breaches: leaking information in error messages	1268	=item Data breaches: leaking information in error messages
1160		1269
1161	CBOR::XS might leak contents of your Perl data structures in its error	1270	CBOR::XS might leak contents of your Perl data structures in its error
1162	messages, so when you serialise sensitive information you might want to	1271	messages, so when you serialise sensitive information you might want to
…		…
1225	=head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT	1334	=head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
1226		1335
1227	On perls that were built without 64 bit integer support (these are rare	1336	On perls that were built without 64 bit integer support (these are rare
1228	nowadays, even on 32 bit architectures, as all major Perl distributions	1337	nowadays, even on 32 bit architectures, as all major Perl distributions
1229	are built with 64 bit integer support), support for any kind of 64 bit	1338	are built with 64 bit integer support), support for any kind of 64 bit
1230	integer in CBOR is very limited - most likely, these 64 bit values will	1339	value in CBOR is very limited - most likely, these 64 bit values will
1231	be truncated, corrupted, or otherwise not decoded correctly. This also	1340	be truncated, corrupted, or otherwise not decoded correctly. This also
1232	includes string, array and map sizes that are stored as 64 bit integers.	1341	includes string, float, array and map sizes that are stored as 64 bit
		1342	integers.
1233		1343
1234		1344
1235	=head1 THREADS	1345	=head1 THREADS
1236		1346
1237	This module is I<not> guaranteed to be thread safe and there are no	1347	This module is I<not> guaranteed to be thread safe and there are no
