--- CBOR-XS/XS.pm 2013/10/27 20:40:25 1.6
+++ CBOR-XS/XS.pm 2013/10/29 00:20:26 1.11
@@ -28,9 +28,14 @@

 =head1 DESCRIPTION

-WARNING! THIS IS A PRE-ALPHA RELEASE! IT WILL CRASH, CORRUPT YOUR DATA
-AND EAT YOUR CHILDREN! (Actually, apart from being untested and a bit
-feature-limited, it might already be useful).
+WARNING! This module is very new, and not very well tested (that's up to
+you to do). Furthermore, details of the implementation might change freely
+before version 1.0. And lastly, the object serialisation protocol depends
+on a pending IANA assignment, and until that assignment is official, this
+implementation is not interoperable with other implementations (even
+future versions of this module).
+
+You are still invited to try out CBOR, and this module.

 This module converts Perl data structures to the Concise Binary Object
 Representation (CBOR) and vice versa. CBOR is a fast binary serialisation
@@ -38,8 +43,10 @@
 can represent something in JSON, you should be able to represent it in
 CBOR.

-This makes it a faster and more compact binary alternative to JSON, with
-the added ability of supporting serialising of perl objects.
+In short, CBOR is a faster and very compact binary alternative to JSON,
+with the added ability of supporting serialisation of Perl objects. (JSON
+often compresses better than CBOR though, so if you plan to compress the
+data later you might want to compare both formats first).

 The primary goal of this module is to be I<correct> and the secondary goal
 is to be I<fast>. To reach the latter goal it was written in C.
@@ -53,7 +60,7 @@

 use common::sense;

-our $VERSION = 0.03;
+our $VERSION = 0.05;
 our @ISA = qw(Exporter);

 our @EXPORT = qw(encode_cbor decode_cbor);
@@ -223,16 +230,9 @@

 =item CBOR tag 256 (perl object)

-The tag value C<256> (TODO: pending iana registration) will be used to
-deserialise a Perl object.
-
-TODO For this to work, the class must be loaded and must have a
-C method.
-The decoder will then call the C method
-with the constructor arguments provided by the C method (see
-below).
-
-The C method must return a single value that will then be used
-as the deserialised value.
+The tag value C<256> (TODO: pending iana registration) will be used
+to deserialise a Perl object serialised with C<FREEZE>. See
+L<OBJECT SERIALISATION>, below, for details.

 =item CBOR tag 55799 (magic header)

@@ -294,11 +294,10 @@
 values, respectively. You can also use C<\1>, C<\0> and C<\undef> directly
 if you want.

-=item blessed objects
+=item other blessed objects

-Other blessed objects currently need to have a C method. It
-will be called on every object that is being serialised, and must return
-something that can be encoded in CBOR.
+Other blessed objects are serialised via C<TO_CBOR> or C<FREEZE>. See
+L<OBJECT SERIALISATION>, below, for details.

 =item simple scalars

@@ -346,8 +345,105 @@

 =back

+=head2 OBJECT SERIALISATION
+
+This module knows two ways to serialise a Perl object: The CBOR-specific
+way, and the generic way.
+
+Whenever the encoder encounters a Perl object that it cannot serialise
+directly (most of them), it will first look up the C<TO_CBOR> method on
+it.
+
+If it has a C<TO_CBOR> method, it will call it with the object as only
+argument, and expects exactly one return value, which it will then
+substitute and encode in the place of the object.
+
+Otherwise, it will look up the C<FREEZE> method. If it exists, it will
+call it with the object as first argument, and the constant string C<CBOR>
+as the second argument, to distinguish it from other serialisers.
+
+The C<FREEZE> method can return any number of values (i.e. zero or
+more). These will be encoded as a CBOR perl object, together with the
+classname.
+
+If an object supports neither C<TO_CBOR> nor C<FREEZE>, encoding will fail
+with an error.
+
+Objects encoded via C<TO_CBOR> cannot be automatically decoded, but
+objects encoded via C<FREEZE> can be decoded using the following protocol:
+
+When an encoded CBOR perl object is encountered by the decoder, it will
+look up the C<THAW> method, by using the stored classname, and will fail
+if the method cannot be found.
+
+After the lookup it will call the C<THAW> method with the stored classname
+as first argument, the constant string C<CBOR> as second argument, and all
+values returned by C<FREEZE> as remaining arguments.
+
+=head4 EXAMPLES
+
+Here is an example C<TO_CBOR> method:
+
+   sub My::Object::TO_CBOR {
+      my ($obj) = @_;
+
+      ["this is a serialised My::Object object", $obj->{id}]
+   }
+
+When a C<My::Object> is encoded to CBOR, it will instead encode a simple
+array with two members: a string, and the "object id". Decoding this CBOR
+string will yield a normal perl array reference in place of the object.
+
+A more useful and practical example would be a serialisation method for
+the URI module. CBOR has a custom tag value for URIs, namely 32:
+
+   sub URI::TO_CBOR {
+      my ($self) = @_;
+      my $uri = "$self"; # stringify uri
+      utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string
+      CBOR::XS::tagged 32, $uri
+   }
+
+This will encode URIs as a UTF-8 string with tag 32, which indicates a
+URI.
+
+Decoding such a URI will not (currently) give you a URI object, but
+instead a CBOR::XS::Tagged object with tag number 32 and the string -
+exactly what was returned by C<TO_CBOR>.
+
+To serialise an object so it can automatically be deserialised, you need
+to use C<FREEZE> and C<THAW>. To take the URI module as an example, this
+would be a possible implementation:
+
+   sub URI::FREEZE {
+      my ($self, $serialiser) = @_;
+      "$self" # encode url string
+   }
+
+   sub URI::THAW {
+      my ($class, $serialiser, $uri) = @_;
+
+      $class->new ($uri)
+   }
+
+Unlike C<TO_CBOR>, multiple values can be returned by C<FREEZE>.
+For example, a C<FREEZE> method that returns "type", "id" and "variant" values
+would cause an invocation of C<THAW> with 5 arguments:
+
+   sub My::Object::FREEZE {
+      my ($self, $serialiser) = @_;
+
+      ($self->{type}, $self->{id}, $self->{variant})
+   }
+
+   sub My::Object::THAW {
+      my ($class, $serialiser, $type, $id, $variant) = @_;
+
+      $class->new (type => $type, id => $id, variant => $variant)
+   }
+

-=head2 MAGIC HEADER
+=head1 MAGIC HEADER

 There is no way to distinguish CBOR from other formats
 programmatically. To make it easier to distinguish CBOR from other
@@ -360,7 +456,7 @@
 required.


-=head2 CBOR and JSON
+=head1 CBOR and JSON

 CBOR is supposed to implement a superset of the JSON data model, and is,
 with some coercion, able to represent all JSON texts (something that other