--- CBOR-XS/doc/value-sharing.pod 2013/11/26 09:43:39 1.1
+++ CBOR-XS/doc/value-sharing.pod 2013/11/28 10:39:06 1.3
@@ -1,12 +1,12 @@
 =head1 REGISTRATION INFORMATION
 
-  Tag             (shareable)
+  Tag             28 (shareable)
   Data Item       multiple
   Semantics       mark value as (potentially) shared
   Reference       http://cbor.schmorp.de/value-sharing
   Contact         Marc A. Lehmann
 
-  Tag             (sharedref)
+  Tag             29 (sharedref)
   Data Item       unsigned integer
   Semantics       reference nth marked value
   Reference       http://cbor.schmorp.de/value-sharing
@@ -36,14 +36,18 @@
 pointer, all references to the value should point to the same memory
 object).
 
+Space saving is a side effect of encoding data structures with large
+shared substructures using this extension - the CBOR representation will
+then have space requirements similar to those of the original data structure.
+
 =head1 DESCRIPTION
 
 To share values, the first occurrence of the value must be explicitly
-tagged with the shareable tag (value ).
+tagged with the shareable tag (value C<28>).
 
 Subsequent occurrences can then be encoded by encoding the index of a
 previously marked value tagged with the sharedref tag (value
-). That is, index 0 refers to the first value marked as
+C<29>). That is, index 0 refers to the first value marked as
 shareable in the CBOR stream, index 1 to the second and so on.
 
 There is no requirement to actually refer to a value marked as
@@ -72,9 +76,79 @@
 aware and the encoder to be under control of the sequence in which data
 items are encoded into the CBOR stream. This means these tags cannot be
 implemented on top of every generic CBOR encoder/decoder (which might
-reorder entries in a map); they need to be integrated into their works.
+reorder entries in a map); they typically need to be integrated into their
+works.
 
 =head1 EXAMPLES
 
-
+=head2 A simple shared array
+
+The following Perl fragment creates an array reference with three entries,
+all of which are array references themselves. All of them contain the same
+data, but the first two actually reference the same (shared) data
+structure:
+
+   $data = [ ([]) x 2, [] ];
+
+This is another way to create it:
+
+   my $shared = [];
+   $data = [ $shared, $shared, [] ];
+
+The shared aspect means that setting an element of the first nested
+arrayref also makes it visible inside the second nested arrayref, as it is
+the same array:
+
+   $data->[0][0] = "test";
+   # results in: [["test"],["test"],[]]
+
+A standard CBOR en-/decoder will encode three separate arrays, which
+will decode into three separate arrays again. So when the original data
+structure is encoded and then decoded, the arrays will be unshared:
+
+   $unshared = decode_cbor encode_cbor [ ([]) x 2, [] ];
+   $unshared->[0][0] = "test";
+   # results in: [["test"],[],[]]
+
+The CBOR encoding might be:
+
+   83       # array(3)
+      80    # array(0)
+      80    # array(0)
+      80    # array(0)
+
+The value-sharing extension allows an encoder to flag the two arrays and
+keep the shared arrays actually shared. For example, the CBOR::XS encoder,
+when configured to use the value sharing extension, will emit this CBOR
+value:
+
+   [28([]), 29(0), []]
+
+   83            # array(3)
+      d8 1c      # tag(28)
+         80      # array(0)
+      d8 1d      # tag(29)
+         00      # unsigned(0)
+      80         # array(0)
+
+When decoding it, the first two array references will point to the same array.
+
+=head2 A cyclic data structure
+
+The following cyclic Perl data structure references itself from within
+itself. Here a decoder will see a reference to the shared value I<before>
+it has completely decoded the shared value:
+
+   $data = [];
+   $data->[0] = $data; # make the first array element refer to the array
+
+This data structure is not representable in standard CBOR. Using the value
+sharing extension, it can be encoded as follows:
+
+   28([29(0)])
+
+   d8 1c         # tag(28)
+      81         # array(1)
+         d8 1d   # tag(29)
+            00   # unsigned(0)
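
The decoding behaviour described in the two examples above can be sketched in plain Perl. This is a minimal illustration, not the CBOR::XS implementation: C<decode_item> and C<decode_hex> are hypothetical helpers that understand only the small CBOR subset used in the examples (small unsigned integers, short arrays and one-byte tag numbers). The point it tries to show is that a value tagged 28 has to be registered I<before> its contents are decoded, otherwise the tag 29 reference inside the cyclic example could not be resolved.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper (not part of CBOR::XS): decodes one data item.
# $bytes: arrayref of byte values, $pos: ref to the read cursor,
# $shared: table of values marked with tag 28, $mark: true directly after tag 28.
sub decode_item {
    my ($bytes, $pos, $shared, $mark) = @_;
    my $byte  = $bytes->[ $$pos++ ];
    my $major = $byte >> 5;                       # major type
    my $info  = $byte & 0x1f;                     # additional information
    $info = $bytes->[ $$pos++ ] if $info == 24;   # one-byte argument follows

    if ($major == 0) {                            # unsigned integer
        push @$shared, $info if $mark;
        return $info;
    }
    if ($major == 4) {                            # array with $info elements
        my $array = [];
        push @$shared, $array if $mark;           # register before decoding contents
        push @$array, decode_item($bytes, $pos, $shared) for 1 .. $info;
        return $array;
    }
    if ($major == 6 && $info == 28) {             # shareable: mark the next item
        return decode_item($bytes, $pos, $shared, 1);
    }
    if ($major == 6 && $info == 29) {             # sharedref: look up nth marked value
        my $index = decode_item($bytes, $pos, $shared);
        return $shared->[$index];
    }
    die "unsupported data item in this sketch\n";
}

# Hypothetical convenience wrapper: takes a string of hex bytes.
sub decode_hex {
    my @bytes = map { hex($_) } split ' ', shift;
    my $pos   = 0;
    return decode_item(\@bytes, \$pos, []);
}

my $simple = decode_hex("83 d8 1c 80 d8 1d 00 80");   # [28([]), 29(0), []]
print "first two shared: ",
    (refaddr($simple->[0]) == refaddr($simple->[1]) ? "yes" : "no"), "\n";

my $cyclic = decode_hex("d8 1c 81 d8 1d 00");         # 28([29(0)])
print "refers to itself: ",
    (refaddr($cyclic->[0]) == refaddr($cyclic) ? "yes" : "no"), "\n";
```

Registering the empty array in the table first and filling it afterwards is what lets the cyclic case work; a decoder that only records a value once it is complete would find nothing at index 0 when it meets the inner sharedref.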