ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/CBOR-XS/doc/value-sharing.pod
Revision: 1.4
Committed: Fri Nov 25 06:14:14 2016 UTC (7 years, 6 months ago) by root
Branch: MAIN
Changes since 1.3: +1 -1 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 REGISTRATION INFORMATION
2    
3 root 1.2 Tag 28 (shareable)
4 root 1.1 Data Item multiple
5     Semantics mark value as (potentially) shared
6     Reference http://cbor.schmorp.de/value-sharing
7     Contact Marc A. Lehmann <cbor@schmorp.de>
8    
9 root 1.2 Tag 29 (sharedref)
10 root 1.1 Data Item unsigned integer
11     Semantics reference nth marked value
12     Reference http://cbor.schmorp.de/value-sharing
13     Contact Marc A. Lehmann <cbor@schmorp.de>
14    
15     =head1 SUMMARY
16    
17     These two tags can be used to implement shared value support in CBOR.
18    
19     =head1 RATIONALE
20    
21     Many serialisable data structures can contain values that are
22     shared. For example, in Perl, you could have an array with two hash
23     references pointing to the same object.
24    
25     When serialising these data structures to CBOR, these values either
26     become unshared (duplicated), or, when the structure contains cycles,
27     they are not serialisable into CBOR at all.
28    
29     This extension implements explicit shared value support - encoders need
30     to explicitly mark values as potentially shared and can later refer to
31     them.
32    
33     This extension is not meant to save space in the CBOR representation by
34     encoding duplicated values only once - the shared values are supposed
35     to refer to the same value after decoding (e.g. when implemented as a
36     pointer, all references to the value should point to the same memory
37     object).
38    
39 root 1.2 Space saving is a side effect of encoding data structures with large
40     shared substructures using this extension - the CBOR representation will
41     then have similar space requirements as the original dats structure.
42    
43 root 1.1 =head1 DESCRIPTION
44    
45     To share values, the first occurrence of the value must be explicitly
46 root 1.2 tagged with the shareable tag (value C<28>).
47 root 1.1
48     Subsequent occurrences can then be encoded by encoding the index
49     of a previously marked value tagged with the sharedref tag (value
50 root 1.2 C<29>). That is, index 0 refers to the first value marked as
51 root 1.1 shareable in the CBOR stream, index 1 to the second and so on.
52    
53     There is no requirement to actually refer to a value marked as
54     shareable - encoders can mark any value they want without ever
55     referring to them.
56    
57     Implementors are advised that, to be able to encode cyclic structures,
58     it must be possible to refer to a value before it is completely
59     decoded. For example, during decoding of a map, some entries can
60     refer to the map being decoded. Thus an implementation cannot decode
61     a shareable value and then record it for later references - it has to
62     record the reference before decoding the value.
63    
64     This can be handled in a variety of ways. For example, in Perl, values
65     not explicitly referenced (hash, array, scalar ref) can not (normally)
66     be shared, so it only has to handle explicit references. Other shared
67     values will usually become unshared, for effienciy reasons.
68    
69     Implementations that do not support sharing can duplicate the values
70     after decoding, or they can use fix-up lists to fix shared references
71     after decoding.
72    
73     =head2 IMPLEMENTATION NOTE
74    
75     The semantics of shareable/sharedref tags require the decoder to be
76     aware and the encoder to be under control of the sequence in which data
77     items are encoded into the CBOR stream. This means these tags cannot be
78     implemented on top of every generic CBOR encoder/decoder (which might
79 root 1.2 reorder entries in a map); they typically need to be integrated into their
80     works.
81 root 1.1
82     =head1 EXAMPLES
83    
84 root 1.2 =head2 A simple shared array
85    
86     The following Perl fragment creates an array reference with three entries,
87     all of which array references themselves. All of them contain the same
88     data, but the first two actually reference the same (shared) data
89     structure:
90    
91     $data = [ ([]) x 2, [] ];
92    
93     This is another way to create it:
94    
95     my $shared = [];
96     $data = [ $shared, $shared, [] ];
97    
98     The shared aspect means that setting an element of the first nested
99     arrayref also makes it visible inside the second nested arrayref, as it is
100     the same array:
101    
102     $data->[0][0] = "test";
103     # results in: [["test"],["test"],[]]
104    
105     A standard CBOR en-/decoder will encode three separate arrays, which
106     will decode into three separate arrays again. So when the original data
107     structure is en- and then decoded, the arrays will be unshared:
108    
109     $unshared = decode_cbor encode_cbor [ ([]) x 2, [] ];
110     $unshared->[0][0] = "test";
111     # results in: [["test"],[],[]]
112    
113     The CBOR encoding might be:
114    
115     83 # array(3)
116     80 # array(0)
117     80 # array(0)
118     80 # array(0)
119    
120     The value-sharing extension allows an encoder to flag the two arrays and
121     keep the shared arrays actually shared. For example, the CBOR::XS encoder,
122     when configured to use the value sharing extension, will emit this CBOR
123     value:
124    
125     [28([]), 29(0), []]
126    
127     83 # array(3)
128     d8 1c # tag(28)
129     80 # array(0)
130     d8 1d # tag(29)
131     00 # unsigned(0)
132     80 # array(0)
133    
134     When decoding it, the first two array references will point to the same array.
135    
136     =head2 A cyclic data structure
137    
138     The following cyclic Perl data structure references itself from within
139     itself. Here a decoder will see a reference to the shared value I<before>
140 root 1.3 it has completely decoded the shared value:
141 root 1.2
142     $data = [];
143 root 1.4 $data->[0] = $data; # make the first array element refer to the array
144 root 1.2
145     This data structure is not representable in standard CBOR. Using the value
146     sharing extension, it can be encoded as follows:
147    
148     28([29(0)])
149    
150     d8 1c # tag(28)
151     81 # array(1)
152     d8 1d # tag(29)
153     00 # unsigned(0)
154 root 1.1