[ViewVC] Contents of: cvs/CBOR-XS/README

NAME
    CBOR::XS - Concise Binary Object Representation (CBOR, RFC7049)

SYNOPSIS
     use CBOR::XS;

     $binary_cbor_data = encode_cbor $perl_value;
     $perl_value       = decode_cbor $binary_cbor_data;

     # OO-interface

     $coder = CBOR::XS->new;
     #TODO

DESCRIPTION
    WARNING! THIS IS A PRE-ALPHA RELEASE! IT WILL CRASH, CORRUPT YOUR DATA
    AND EAT YOUR CHILDREN!

    This module converts Perl data structures to CBOR and vice versa. Its
    primary goal is to be *correct* and its secondary goal is to be *fast*.
    To reach the latter goal it was written in C.

    See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and
    vice versa.

FUNCTIONAL INTERFACE
    The following convenience functions are provided by this module. They
    are exported by default:

    $cbor_data = encode_cbor $perl_scalar
        Converts the given Perl data structure to CBOR representation.
        Croaks on error.

    $perl_scalar = decode_cbor $cbor_data
        The opposite of "encode_cbor": expects a valid CBOR string to parse,
        returning the resulting perl scalar. Croaks on error.
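        A minimal round-trip sketch using only the two exported functions
        described above. The sample data and the truncated input string are
        made up for illustration:

```perl
use strict;
use warnings;
use CBOR::XS;   # exports encode_cbor and decode_cbor by default

# A small nested structure made of a hash reference and an array reference.
my $data = { name => "example", values => [1, 2, 3] };

# Encode to a binary CBOR string, then decode it again.
my $cbor = encode_cbor ($data);
my $copy = decode_cbor ($cbor);

printf "encoded %d octets\n", length $cbor;
print  "values: @{ $copy->{values} }\n";

# Both functions croak on error, so wrap untrusted input in eval.
# This input announces an array of three items but contains only one.
my $ok    = eval { decode_cbor ("\x83\x01"); 1 };
my $error = $@;
print "decode failed: $error" unless $ok;
```

        Using the "eval { ...; 1 }" form avoids confusing a legitimately
        decoded undef with a failure.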

OBJECT-ORIENTED INTERFACE
    The object oriented interface lets you configure your own encoding or
    decoding style, within the limits of supported formats.

    $cbor = new CBOR::XS
        Creates a new CBOR::XS object that can be used to de/encode CBOR
        strings. All boolean flags described below are by default
        *disabled*.

        The mutators for flags all return the CBOR object again and thus
        calls can be chained:

        #TODO my $cbor = CBOR::XS->new->encode ({a => [1,2]});

    $cbor = $cbor->max_depth ([$maximum_nesting_depth])
    $max_depth = $cbor->get_max_depth
        Sets the maximum nesting level (default 512) accepted while encoding
        or decoding. If a higher nesting level is detected in CBOR data or a
        Perl data structure, then the encoder and decoder will stop and
        croak at that point.

        Nesting level is defined as the number of hash- or arrayrefs that
        the encoder needs to traverse to reach a given point or, when
        decoding, the number of CBOR maps or arrays that have been opened
        but not yet closed at a given point in the CBOR data.

        Setting the maximum depth to one disallows any nesting, which
        ensures that the data is only a single hash/map or array.

        If no argument is given, the highest possible setting will be used,
        which is rarely useful.

        Note that nesting is implemented by recursion in C. The default
        value has been chosen to be as large as typical operating systems
        allow without crashing.

        See SECURITY CONSIDERATIONS, below, for more info on why this is
        useful.
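        As a rough sketch of how the limit behaves, assuming the methods
        work as described above (the limit of 2 and the sample structures
        are arbitrary choices for illustration):

```perl
use strict;
use warnings;
use CBOR::XS;

# Allow at most two levels of nesting; the mutator returns the object.
my $cbor = CBOR::XS->new->max_depth (2);
print "limit: ", $cbor->get_max_depth, "\n";

my $flat = [1, 2];        # one level of nesting
my $deep = [[[[1]]]];     # four levels of nesting

# Within the limit: encoding succeeds.
my $ok = eval { $cbor->encode ($flat); 1 };

# Beyond the limit: encode croaks, so eval returns undef and sets $@.
my $too_deep = eval { $cbor->encode ($deep); 1 };
my $error    = $@;

print "flat structure encoded\n"        if $ok;
print "deep structure rejected: $error" unless $too_deep;
```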

    $cbor = $cbor->max_size ([$maximum_string_size])
    $max_size = $cbor->get_max_size
        Set the maximum length a CBOR string may have (in bytes) where
        decoding is being attempted. The default is 0, meaning no limit.
        When "decode" is called on a string that is longer than this many
        bytes, it will not attempt to decode the string but throw an
        exception. This setting has no effect on "encode" (yet).

        If no argument is given, the limit check will be deactivated (same
        as when 0 is specified).

        See SECURITY CONSIDERATIONS, below, for more info on why this is
        useful.
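        A short sketch of the size check, assuming the behaviour described
        above. The limit of 16 octets is deliberately tiny so that the
        effect is visible:

```perl
use strict;
use warnings;
use CBOR::XS;

# Refuse to decode anything longer than 16 octets.
my $cbor = CBOR::XS->new->max_size (16);

my $small = encode_cbor ([1, 2, 3]);       # a handful of octets
my $large = encode_cbor ([(0) x 100]);     # more than 100 octets

# The small string is decoded normally.
my $value = eval { $cbor->decode ($small) };

# The large string is rejected before any decoding is attempted.
my $decoded = eval { $cbor->decode ($large); 1 };
my $error   = $@;

print "small: @$value\n";
print "large rejected: $error" unless $decoded;
```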

    $cbor_data = $cbor->encode ($perl_scalar)
        Converts the given Perl data structure (a scalar value) to its CBOR
        representation.

    $perl_scalar = $cbor->decode ($cbor_data)
        The opposite of "encode": expects CBOR data and tries to parse it,
        returning the resulting simple scalar or reference. Croaks on error.

    ($perl_scalar, $octets) = $cbor->decode_prefix ($cbor_data)
        This works like the "decode" method, but instead of raising an
        exception when there is trailing garbage after the CBOR string, it
        will silently stop parsing there and return the number of octets
        consumed so far.

        This is useful if your CBOR data is not delimited by an outer
        protocol and you need to know where the first CBOR string ends and
        the next one starts.

           CBOR::XS->new->decode_prefix ("......")
           => ("...", 3)
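        The following sketch shows one way to split several concatenated
        CBOR strings with "decode_prefix". The sample items and the loop
        are illustrative assumptions, not part of the module:

```perl
use strict;
use warnings;
use CBOR::XS;

# Build a buffer holding three CBOR strings back to back, with no
# delimiters between them.
my $buffer = "";
$buffer .= encode_cbor ($_) for [1, 2], { a => 1 }, "tail";

my $cbor = CBOR::XS->new;
my @items;

# Decode one item at a time and cut the consumed octets off the front.
while (length $buffer) {
    my ($value, $octets) = $cbor->decode_prefix ($buffer);
    push @items, $value;
    substr $buffer, 0, $octets, "";
}

printf "decoded %d items\n", scalar @items;
```

        If the buffer ends in the middle of an item, "decode_prefix" croaks
        just like "decode", so a real reader would wrap the call in eval
        and wait for more data.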

MAPPING
    This section describes how CBOR::XS maps Perl values to CBOR values and
    vice versa. These mappings are designed to "do the right thing" in most
    circumstances automatically, preserving round-tripping characteristics
    (what you put in comes out as something equivalent).

    For the more enlightened: note that in the following descriptions,
    lowercase *perl* refers to the Perl interpreter, while uppercase *Perl*
    refers to the abstract Perl language itself.

  CBOR -> PERL
    True, False
        These CBOR values become "CBOR::XS::true" and "CBOR::XS::false",
        respectively. They are overloaded to act almost exactly like the
        numbers 1 and 0. You can check whether a scalar is a CBOR boolean by
        using the "CBOR::XS::is_bool" function.

    Null, Undefined
        CBOR Null and Undefined values become "undef" in Perl (in the
        future, Undefined may raise an exception).
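        A small sketch of these mappings, decoding hand-written CBOR octets.
        The octet values are taken from RFC 7049 (0xf4 is False, 0xf5 is
        True, 0xf6 is Null). Only the boolean-context behaviour is relied
        on here, since the exact objects returned are an implementation
        detail:

```perl
use strict;
use warnings;
use CBOR::XS;

my $true  = decode_cbor ("\xf5");   # CBOR True
my $false = decode_cbor ("\xf4");   # CBOR False
my $null  = decode_cbor ("\xf6");   # CBOR Null

# The boolean values behave like 1 and 0 in conditions.
print "true is true\n"   if $true;
print "false is false\n" unless $false;

# Null arrives as a plain undef.
print "null is undef\n"  unless defined $null;
```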

  PERL -> CBOR
    The mapping from Perl to CBOR is slightly more difficult, as Perl is a
    truly typeless language, so we can only guess which CBOR type is meant
    by a Perl value.

    hash references
        Perl hash references become CBOR maps. As there is no inherent
        ordering in hash keys (or CBOR maps), they will usually be encoded
        in a pseudo-random order.

    array references
        Perl array references become CBOR arrays.

    other references
        Other unblessed references are generally not allowed and will cause
        an exception to be thrown, except for references to the integers 0
        and 1, which get turned into "False" and "True" in CBOR.

    CBOR::XS::true, CBOR::XS::false
        These special values become CBOR True and CBOR False values,
        respectively. You can also use "\1" and "\0" directly if you want.

    blessed objects
        Blessed objects are not directly representable in CBOR. TODO See the
        "allow_blessed" and "convert_blessed" methods for the various options
        on how to deal with this: basically, you can choose between throwing
        an exception, encoding the reference as if it weren't blessed, or
        providing your own serialiser method.

    simple scalars
        TODO Simple Perl scalars (any scalar that is not a reference) are
        the most difficult objects to encode: CBOR::XS will encode undefined
        scalars as CBOR "Null" values, scalars that have last been used in a
        string context before encoding as CBOR strings, and anything else as
        a number value:

           # dump as number
           encode_cbor [2]                      # yields [2]
           encode_cbor [-3.0e17]                # yields [-3e+17]
           my $value = 5; encode_cbor [$value]  # yields [5]

           # used as string, so dump as string
           print $value;
           encode_cbor [$value]                 # yields ["5"]

           # undef becomes null
           encode_cbor [undef]                  # yields [null]

        You can force the type to be a CBOR string by stringifying it:

           my $x = 3.1; # some variable containing a number
           "$x";        # stringified
           $x .= "";    # another, more awkward way to stringify
           print $x;    # perl does it for you, too, quite often

        You can force the type to be a CBOR number by numifying it:

           my $x = "3"; # some variable containing a string
           $x += 0;     # numify it, ensuring it will be dumped as a number
           $x *= 1;     # same thing, the choice is yours.

        You cannot currently force the type in other, less obscure, ways.
        Tell me if you need this capability (but don't forget to explain why
        it's needed :).

        Note that numerical precision has the same meaning as under Perl (so
        binary to decimal conversion follows the same rules as in Perl,
        which can differ from other languages). Also, your perl interpreter
        might expose extensions to the floating point numbers of your
        platform, such as infinities or NaNs - these cannot be represented
        in CBOR, and it is an error to pass those in.
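        To see which CBOR type a scalar ends up as, one can inspect the
        first octet of the encoded data. The sketch below assumes the RFC
        7049 layout, where the top three bits of that octet hold the major
        type (0 is an unsigned integer, 2 and 3 are strings, 7 covers
        floating point values). The "major_type" helper is hypothetical
        and not part of CBOR::XS:

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical helper: extract the major type from the first octet.
sub major_type { ord ($_[0]) >> 5 }

my $float  = 3.1;     # a number
my $string = 3.1;
$string .= "";        # appending "" turns it into a string

my $int = "3";        # starts out as a string
$int += 0;            # adding 0 turns it into a number

my $float_type  = major_type (encode_cbor ($float));
my $string_type = major_type (encode_cbor ($string));
my $int_type    = major_type (encode_cbor ($int));

print "float:  major type $float_type\n";
print "string: major type $string_type\n";
print "int:    major type $int_type\n";
```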

  MAGIC HEADER
    There is no reliable way to distinguish arbitrary CBOR data from other
    formats programmatically. To make this easier, the CBOR specification
    has a special "magic string" that can be prepended to any CBOR string
    without changing its meaning.

    This string is available as $CBOR::XS::MAGIC. This module does not
    prepend this string to the CBOR data it generates, but it will ignore it
    if present, so users can prepend this string as a "file type" indicator
    as required.
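
    A minimal sketch of using the magic string as a file type indicator,
    assuming the behaviour described above. The variable names are made up
    for illustration:

```perl
use strict;
use warnings;
use CBOR::XS;

# Prepend the magic string by hand; the module does not do this itself.
my $payload = $CBOR::XS::MAGIC . encode_cbor ([1, 2, 3]);

# A reader can sniff the prefix to guess that this is CBOR data.
my $looks_like_cbor = index ($payload, $CBOR::XS::MAGIC) == 0;

# The decoder ignores the magic string and returns the actual value.
my $data = decode_cbor ($payload);

print "looks like CBOR\n" if $looks_like_cbor;
print "data: @$data\n";
```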

  CBOR and JSON
    TODO

SECURITY CONSIDERATIONS
    When you are using CBOR in a protocol, talking to untrusted potentially
    hostile creatures requires relatively few measures.

    First of all, your CBOR decoder should be secure, that is, should not
    have any buffer overflows. Obviously, this module should ensure that and
    I am trying hard to make that true, but you never know.

    Second, you need to avoid resource-starving attacks. That means you
    should limit the size of CBOR data you accept, or make sure that when
    your resources run out, that's just fine (e.g. by using a separate
    process that can crash safely). The size of a CBOR string in octets is
    usually a good indication of the size of the resources required to
    decode it into a Perl structure. While CBOR::XS can check the size of
    the CBOR text, it might be too late when you already have it in memory,
    so you might want to check the size before you accept the string.

    Third, CBOR::XS recurses using the C stack when decoding objects and
    arrays. The C stack is a limited resource: for instance, on my amd64
    machine with 8MB of stack size I can decode around 180k nested arrays
    but only 14k nested CBOR objects (due to perl itself recursing deeply on
    croak to free the temporary). If that is exceeded, the program crashes.
    To be conservative, the default nesting limit is set to 512. If your
    process has a smaller stack, you should adjust this setting accordingly
    with the "max_depth" method.

    Something else could bomb you, too, that I forgot to think of. In that
    case, you get to keep the pieces. I am always open to hints, though...

    Also keep in mind that CBOR::XS might leak contents of your Perl data
    structures in its error messages, so when you serialise sensitive
    information you might want to make sure that exceptions thrown by
    CBOR::XS will not end up in front of untrusted eyes.
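
    The points above can be combined into a small wrapper. This is only a
    sketch under stated assumptions: the limits, the "decode_untrusted"
    name and the generic error strings are all hypothetical, and suitable
    values depend on your protocol and stack size:

```perl
use strict;
use warnings;
use CBOR::XS;

use constant MAX_OCTETS => 64 * 1024;   # assumed size limit
use constant MAX_DEPTH  => 32;          # assumed nesting limit

my $decoder = CBOR::XS->new
                      ->max_size  (MAX_OCTETS)
                      ->max_depth (MAX_DEPTH);

# Returns ($value, undef) on success and (undef, $reason) on failure.
sub decode_untrusted {
    my ($octets) = @_;

    # Check the size early, ideally before the whole string is read.
    return (undef, "message too large")
        if length ($octets) > MAX_OCTETS;

    my $value;
    my $ok = eval { $value = $decoder->decode ($octets); 1 };

    unless ($ok) {
        # Keep the detailed message in the local log only, since it
        # might contain parts of the data.
        warn "CBOR decode failed: $@";
        return (undef, "malformed message");
    }

    return ($value, undef);
}

my ($value, $reason) = decode_untrusted (encode_cbor ([1, 2, 3]));
print defined $reason ? "rejected: $reason\n" : "accepted: @$value\n";
```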

CBOR IMPLEMENTATION NOTES
    This section contains some random implementation notes. They do not
    describe guaranteed behaviour, but merely the behaviour as it is
    implemented right now.

    64 bit integers are only properly decoded when Perl was built with 64
    bit support.

    Strings and arrays are encoded with a definite length. Hashes as well,
    unless they are tied (or otherwise magical).
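
    The definite-length encoding can be observed by dumping the encoded
    octets as hex. In this sketch, 0x83 is the RFC 7049 header for an
    array of exactly three items, followed by one octet per small integer:

```perl
use strict;
use warnings;
use CBOR::XS;

# Hex dump of a three-element array of small integers.
my $hex = unpack "H*", encode_cbor ([1, 2, 3]);

print "$hex\n";   # prints 83010203
```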

    Only the double data type is supported for NV data types - when Perl
    uses long double to represent floating point values, they might not be
    encoded properly. Half precision types are accepted, but not encoded.

    Strict mode and canonical mode are not implemented.

THREADS
    This module is *not* guaranteed to be thread safe and there are no plans
    to change this until Perl gets thread support (as opposed to the
    horribly slow so-called "threads" which are simply slow and bloated
    process simulations - use fork, it's *much* faster, cheaper, better).

    (It might actually work, but you have been warned).

BUGS
    While the goal of this module is to be correct, that unfortunately does
    not mean it's bug-free, only that I think its design is bug-free. If you
    keep reporting bugs they will be fixed swiftly, though.

    Please refrain from using rt.cpan.org or any other bug reporting
    service. I put the contact address into my modules for a reason.

SEE ALSO
    The JSON and JSON::XS modules that do similar, but human-readable,
    serialisation.

AUTHOR
     Marc Lehmann <schmorp@schmorp.de>
     http://home.schmorp.de/

Revision:	1.3
Committed:	Sat Oct 26 11:08:34 2013 UTC (10 years, 8 months ago) by root
Branch:	MAIN
CVS Tags:	rel-0_02
Changes since 1.2:	+14 -2 lines
Log Message:	0.02