Comparing CBOR-XS/README (file contents):
Revision 1.9 by root, Fri Nov 22 16:18:59 2013 UTC vs.
Revision 1.10 by root, Thu Nov 28 16:09:04 2013 UTC

21 # data was decoded 21 # data was decoded
22 substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string 22 substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
23 } 23 }
24 24
25DESCRIPTION 25DESCRIPTION
26 WARNING! This module is very new, and not very well tested (that's up to
27 you to do). Furthermore, details of the implementation might change
28 freely before version 1.0. And lastly, most extensions depend on an IANA
29 assignment, and until that assignment is official, this implementation
30 is not interoperable with other implementations (even future versions of
31 this module) until the assignment is done.
32
33 You are still invited to try out CBOR, and this module.
34
35 This module converts Perl data structures to the Concise Binary Object 26 This module converts Perl data structures to the Concise Binary Object
36 Representation (CBOR) and vice versa. CBOR is a fast binary 27 Representation (CBOR) and vice versa. CBOR is a fast binary
37 serialisation format that aims to use a superset of the JSON data model, 28 serialisation format that aims to use an (almost) superset of the JSON
38 i.e. when you can represent something in JSON, you should be able to 29 data model, i.e. when you can represent something useful in JSON, you
39 represent it in CBOR. 30 should be able to represent it in CBOR.
40 31
41 In short, CBOR is a faster and very compact binary alternative to JSON, 32 In short, CBOR is a faster and quite compact binary alternative to JSON,
42 with the added ability of supporting serialisation of Perl objects. 33 with the added ability of supporting serialisation of Perl objects.
43 (JSON often compresses better than CBOR though, so if you plan to 34 (JSON often compresses better than CBOR though, so if you plan to
44 compress the data later you might want to compare both formats first). 35 compress the data later and speed is less important you might want to
36 compare both formats first).
45 37
46 To give you a general idea about speed, with texts in the megabyte 38 To give you a general idea about speed, with texts in the megabyte
47 range, "CBOR::XS" usually encodes roughly twice as fast as Storable or 39 range, "CBOR::XS" usually encodes roughly twice as fast as Storable or
48 JSON::XS and decodes about 15%-30% faster than those. The shorter the 40 JSON::XS and decodes about 15%-30% faster than those. The shorter the
49 data, the worse Storable performs in comparison. 41 data, the worse Storable performs in comparison.
50 42
51 As for compactness, "CBOR::XS" encoded data structures are usually about 43 Regarding compactness, "CBOR::XS"-encoded data structures are usually
52 20% smaller than the same data encoded as (compact) JSON or Storable. 44 about 20% smaller than the same data encoded as (compact) JSON or
45 Storable.
53 46
54 In addition to the core CBOR data format, this module implements a 47 In addition to the core CBOR data format, this module implements a
55 number of extensions, to support cyclic and self-referencing data 48 number of extensions, to support cyclic and shared data structures (see
56 structures (see "allow_sharing"), string deduplication (see 49 "allow_sharing"), string deduplication (see "pack_strings") and scalar
57 "allow_stringref") and scalar references (always enabled). 50 references (always enabled).
58 51
59 The primary goal of this module is to be *correct* and the secondary 52 The primary goal of this module is to be *correct* and the secondary
60 goal is to be *fast*. To reach the latter goal it was written in C. 53 goal is to be *fast*. To reach the latter goal it was written in C.
61 54
62 See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and 55 See MAPPING, below, on how CBOR::XS maps perl values to CBOR values and
147 same object, such as an array, is referenced multiple times), but 140 same object, such as an array, is referenced multiple times), but
148 instead will emit a reference to the earlier value. 141 instead will emit a reference to the earlier value.
149 142
150 This means that such values will only be encoded once, and will not 143 This means that such values will only be encoded once, and will not
151 result in a deep cloning of the value on decode, in decoders 144 result in a deep cloning of the value on decode, in decoders
152 supporting the value sharing extension. 145 supporting the value sharing extension. This also makes it possible
146 to encode cyclic data structures.
153 147
154 It is recommended to leave it off unless you know your communication 148 It is recommended to leave it off unless you know your communication
155 partner supports the value sharing extensions to CBOR 149 partner supports the value sharing extensions to CBOR
156 (http://cbor.schmorp.de/value-sharing). 150 (<http://cbor.schmorp.de/value-sharing>), as without decoder
151 support, the resulting data structure might be unusable.
157 152
158 Detecting shared values incurs a runtime overhead when values are 153 Detecting shared values incurs a runtime overhead when values are
159 encoded that have a reference counter larger than one, and might 154 encoded that have a reference counter larger than one, and might
160 unnecessarily increase the encoded size, as potentially shared 155 unnecessarily increase the encoded size, as potentially shared
161 values are encoded as sharable whether or not they are actually 156 values are encoded as sharable whether or not they are actually
163 158
164 At the moment, only targets of references can be shared (e.g. 159 At the moment, only targets of references can be shared (e.g.
165 scalars, arrays or hashes pointed to by a reference). Weirder 160 scalars, arrays or hashes pointed to by a reference). Weirder
166 constructs, such as an array with multiple "copies" of the *same* 161 constructs, such as an array with multiple "copies" of the *same*
167 string, which are hard but not impossible to create in Perl, are not 162 string, which are hard but not impossible to create in Perl, are not
168 supported (this is the same as for Storable). 163 supported (this is the same as with Storable).
169 164
170 If $enable is false (the default), then "encode" will encode 165 If $enable is false (the default), then "encode" will encode shared
171 exception when it encounters anything it cannot encode as CBOR. 166 data structures repeatedly, unsharing them in the process. Cyclic
167 data structures cannot be encoded in this mode.
172 168
173 This option does not affect "decode" in any way - shared values and 169 This option does not affect "decode" in any way - shared values and
174 references will always be decoded properly if present. 170 references will always be decoded properly if present.
175 171
176 $cbor = $cbor->allow_stringref ([$enable]) 172 $cbor = $cbor->pack_strings ([$enable])
177 $enabled = $cbor->get_allow_stringref 173 $enabled = $cbor->get_pack_strings
178 If $enable is true (or missing), then "encode" will try not to 174 If $enable is true (or missing), then "encode" will try not to
179 encode the same string twice, but will instead encode a reference to 175 encode the same string twice, but will instead encode a reference to
180 the string instead. Depending on your data format. this can save a 176 the string instead. Depending on your data format, this can save a
181 lot of space, but also results in a very large runtime overhead 177 lot of space, but also results in a very large runtime overhead
182 (expect encoding times to be 2-4 times as high as without). 178 (expect encoding times to be 2-4 times as high as without).
183 179
184 It is recommended to leave it off unless you know your 180 It is recommended to leave it off unless you know your
185 communications partner supports the stringref extension to CBOR 181 communications partner supports the stringref extension to CBOR
186 (http://cbor.schmorp.de/stringref). 182 (<http://cbor.schmorp.de/stringref>), as without decoder support,
183 the resulting data structure might not be usable.
187 184
188 If $enable is false (the default), then "encode" will encode 185 If $enable is false (the default), then "encode" will encode strings
189 exception when it encounters anything it cannot encode as CBOR. 186 the standard CBOR way.
190 187
191 This option does not affect "decode" in any way - string references 188 This option does not affect "decode" in any way - string references
192 will always be decoded properly if present. 189 will always be decoded properly if present.
193 190
194 $cbor = $cbor->filter ([$cb->($tag, $value)]) 191 $cbor = $cbor->filter ([$cb->($tag, $value)])
218 it must be a code reference that is called with tag and value, and 215 it must be a code reference that is called with tag and value, and
219 is responsible for decoding the value. If no entry exists, it 216 is responsible for decoding the value. If no entry exists, it
220 returns no values. 217 returns no values.
221 218
222 Example: decode all tags not handled internally into 219 Example: decode all tags not handled internally into
223 CBOR::XS::Tagged objects, with no other special handling (useful 220 "CBOR::XS::Tagged" objects, with no other special handling (useful
224 when working with potentially "unsafe" CBOR data). 221 when working with potentially "unsafe" CBOR data).
225 222
226 CBOR::XS->new->filter (sub { })->decode ($cbor_data); 223 CBOR::XS->new->filter (sub { })->decode ($cbor_data);
227 224
228 Example: provide a global filter for tag 1347375694, converting the 225 Example: provide a global filter for tag 1347375694, converting the
269 integers 266 integers
270 CBOR integers become (numeric) perl scalars. On perls without 64 bit 267 CBOR integers become (numeric) perl scalars. On perls without 64 bit
271 support, 64 bit integers will be truncated or otherwise corrupted. 268 support, 64 bit integers will be truncated or otherwise corrupted.
272 269
273 byte strings 270 byte strings
274 Byte strings will become octet strings in Perl (the byte values 271 Byte strings will become octet strings in Perl (the Byte values
275 0..255 will simply become characters of the same value in Perl). 272 0..255 will simply become characters of the same value in Perl).
276 273
277 UTF-8 strings 274 UTF-8 strings
278 UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be 275 UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
279 decoded into proper Unicode code points. At the moment, the validity 276 decoded into proper Unicode code points. At the moment, the validity
297 294
298 tagged values 295 tagged values
299 Tagged items consist of a numeric tag and another CBOR value. 296 Tagged items consist of a numeric tag and another CBOR value.
300 297
301 See "TAG HANDLING AND EXTENSIONS" and the description of "->filter" 298 See "TAG HANDLING AND EXTENSIONS" and the description of "->filter"
302 for details. 299 for details on which tags are handled how.
303 300
304 anything else 301 anything else
305 Anything else (e.g. unsupported simple values) will raise a decoding 302 Anything else (e.g. unsupported simple values) will raise a decoding
306 error. 303 error.
307 304
308 PERL -> CBOR 305 PERL -> CBOR
309 The mapping from Perl to CBOR is slightly more difficult, as Perl is a 306 The mapping from Perl to CBOR is slightly more difficult, as Perl is a
310 truly typeless language, so we can only guess which CBOR type is meant 307 typeless language. That means this module can only guess which CBOR type
311 by a Perl value. 308 is meant by a perl value.
312 309
313 hash references 310 hash references
314 Perl hash references become CBOR maps. As there is no inherent 311 Perl hash references become CBOR maps. As there is no inherent
315 ordering in hash keys (or CBOR maps), they will usually be encoded 312 ordering in hash keys (or CBOR maps), they will usually be encoded
316 in a pseudo-random order. 313 in a pseudo-random order. This order can be different each time a
314 hash is encoded.
317 315
318 Currently, tied hashes will use the indefinite-length format, while 316 Currently, tied hashes will use the indefinite-length format, while
319 normal hashes will use the fixed-length format. 317 normal hashes will use the fixed-length format.
320 318
321 array references 319 array references
322 Perl array references become fixed-length CBOR arrays. 320 Perl array references become fixed-length CBOR arrays.
323 321
324 other references 322 other references
325 Other unblessed references are generally not allowed and will cause 323 Other unblessed references will be represented using the indirection
326 an exception to be thrown, except for references to the integers 0 324 tag extension (tag value 22098,
327 and 1, which get turned into false and true in CBOR. 325 <http://cbor.schmorp.de/indirection>). CBOR decoders are guaranteed
326 to be able to decode these values somehow, by either "doing the
327 right thing", decoding into a generic tagged object, simply ignoring
328 the tag, or something else.
328 329
329 CBOR::XS::Tagged objects 330 CBOR::XS::Tagged objects
330 Objects of this type must be arrays consisting of a single "[tag, 331 Objects of this type must be arrays consisting of a single "[tag,
331 value]" pair. The (numerical) tag will be encoded as a CBOR tag, the 332 value]" pair. The (numerical) tag will be encoded as a CBOR tag, the
332 value will be encoded as appropriate for the value. You cna use 333 value will be encoded as appropriate for the value. You must use
333 "CBOR::XS::tag" to create such objects. 334 "CBOR::XS::tag" to create such objects.
334 335
335 Types::Serialiser::true, Types::Serialiser::false, 336 Types::Serialiser::true, Types::Serialiser::false,
336 Types::Serialiser::error 337 Types::Serialiser::error
337 These special values become CBOR true, CBOR false and CBOR undefined 338 These special values become CBOR true, CBOR false and CBOR undefined
353 # dump as number 354 # dump as number
354 encode_cbor [2] # yields [2] 355 encode_cbor [2] # yields [2]
355 encode_cbor [-3.0e17] # yields [-3e+17] 356 encode_cbor [-3.0e17] # yields [-3e+17]
356 my $value = 5; encode_cbor [$value] # yields [5] 357 my $value = 5; encode_cbor [$value] # yields [5]
357 358
358 # used as string, so dump as string 359 # used as string, so dump as string (either byte or text)
359 print $value; 360 print $value;
360 encode_cbor [$value] # yields ["5"] 361 encode_cbor [$value] # yields ["5"]
361 362
362 # undef becomes null 363 # undef becomes null
363 encode_cbor [undef] # yields [null] 364 encode_cbor [undef] # yields [null]
366 367
367 my $x = 3.1; # some variable containing a number 368 my $x = 3.1; # some variable containing a number
368 "$x"; # stringified 369 "$x"; # stringified
369 $x .= ""; # another, more awkward way to stringify 370 $x .= ""; # another, more awkward way to stringify
370 print $x; # perl does it for you, too, quite often 371 print $x; # perl does it for you, too, quite often
372
373 You can force whether a string is encoded as byte or text string by
374 using "utf8::upgrade" and "utf8::downgrade":
375
376 utf8::upgrade $x; # encode $x as text string
377 utf8::downgrade $x; # encode $x as byte string
378
379 Perl doesn't define what operations up- and downgrade strings, so if
380 the difference between byte and text is important, you should up- or
381 downgrade your string as late as possible before encoding.
371 382
372 You can force the type to be a CBOR number by numifying it: 383 You can force the type to be a CBOR number by numifying it:
373 384
374 my $x = "3"; # some variable containing a string 385 my $x = "3"; # some variable containing a string
375 $x += 0; # numify it, ensuring it will be dumped as a number 386 $x += 0; # numify it, ensuring it will be dumped as a number
439 450
440 sub URI::TO_CBOR { 451 sub URI::TO_CBOR {
441 my ($self) = @_; 452 my ($self) = @_;
442 my $uri = "$self"; # stringify uri 453 my $uri = "$self"; # stringify uri
443 utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string 454 utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string
444 CBOR::XS::tagged 32, "$_[0]" 455 CBOR::XS::tag 32, $uri
445 } 456 }
446 457
447 This will encode URIs as a UTF-8 string with tag 32, which indicates an 458 This will encode URIs as a UTF-8 string with tag 32, which indicates an
448 URI. 459 URI.
449 460
568 579
569 ENFORCED TAGS 580 ENFORCED TAGS
570 These tags are always handled when decoding, and their handling cannot 581 These tags are always handled when decoding, and their handling cannot
571 be overridden by the user. 582 be overridden by the user.
572 583
573 <unassigned> (perl-object, <http://cbor.schmorp.de/perl-object>) 584 26 (perl-object, <http://cbor.schmorp.de/perl-object>)
574 These tags are automatically created (and decoded) for serialisable 585 These tags are automatically created (and decoded) for serialisable
575 objects using the "FREEZE/THAW" methods (the Types::Serialiser object 586 objects using the "FREEZE/THAW" methods (the Types::Serialiser object
576 serialisation protocol). See "OBJECT SERIALISATION" for details. 587 serialisation protocol). See "OBJECT SERIALISATION" for details.
577 588
578 <unassigned>, <unassigned> (sharable, sharedref, 589 28, 29 (sharable, sharedref, <http://cbor.schmorp.de/value-sharing>)
579 <http://cbor.schmorp.de/value-sharing>)
580 These tags are automatically decoded when encountered, resulting in 590 These tags are automatically decoded when encountered, resulting in
581 shared values in the decoded object. They are only encoded, however, 591 shared values in the decoded object. They are only encoded, however,
582 when "allow_sharing" is enabled. 592 when "allow_sharing" is enabled.
583 593
584 <unassigned>, <unassigned> (stringref-namespace, stringref, 594 256, 25 (stringref-namespace, stringref,
585 <http://cbor.schmorp.de/stringref>) 595 <http://cbor.schmorp.de/stringref>)
586 These tags are automatically decoded when encountered. They are only 596 These tags are automatically decoded when encountered. They are only
587 encoded, however, when "allow_stringref" is enabled. 597 encoded, however, when "pack_strings" is enabled.
588 598
589 22098 (indirection, <http://cbor.schmorp.de/indirection>) 599 22098 (indirection, <http://cbor.schmorp.de/indirection>)
590 This tag is automatically generated when a reference is encountered 600 This tag is automatically generated when a reference is encountered
591 (with the exception of hash and array references). It is converted to 601 (with the exception of hash and array references). It is converted to
592 a reference when decoding. 602 a reference when decoding.