=head1 REGISTRATION INFORMATION

   Tag             28 (shareable)
   Data Item       multiple
   Semantics       mark value as (potentially) shared
   Reference       http://cbor.schmorp.de/value-sharing
   Contact         Marc A. Lehmann <cbor@schmorp.de>

   Tag             29 (sharedref)
   Data Item       unsigned integer
   Semantics       reference nth marked value
   Reference       http://cbor.schmorp.de/value-sharing
   Contact         Marc A. Lehmann <cbor@schmorp.de>

=head1 SUMMARY

These two tags can be used to implement shared value support in CBOR.

=head1 RATIONALE

Many serialisable data structures can contain values that are
shared. For example, in Perl, you could have an array with two hash
references pointing to the same object.

When serialising these data structures to CBOR, these values either
become unshared (duplicated), or, when the structure contains cycles,
they are not serialisable into CBOR at all.

This extension implements explicit shared value support - encoders need
to explicitly mark values as potentially shared and can later refer to
them.

This extension is not meant to save space in the CBOR representation by
encoding duplicated values only once - the shared values are supposed
to refer to the same value after decoding (e.g. when implemented as a
pointer, all references to the value should point to the same memory
object).

Space saving is a side effect of encoding data structures with large
shared substructures using this extension - the CBOR representation will
then have space requirements similar to those of the original data
structure.

=head1 DESCRIPTION

To share values, the first occurrence of the value must be explicitly
tagged with the shareable tag (value C<28>).

Subsequent occurrences can then be encoded by encoding the index
of a previously marked value tagged with the sharedref tag (value
C<29>). That is, index 0 refers to the first value marked as
shareable in the CBOR stream, index 1 to the second and so on.

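The index assignment described above can be sketched in Perl. This is
a minimal illustration under stated assumptions, not the way any
particular encoder works: the helper names C<count_refs>, C<to_diag>
and C<diag> are made up for this example, only array references and
plain strings are handled, and the output is CBOR diagnostic notation
rather than bytes. The sketch marks an array as shareable only when it
is reached more than once, which is a choice of this example (encoders
are free to mark more values).

   use strict;
   use warnings;
   use Scalar::Util qw(refaddr);

   # pass 1: count how often each array is reached, without
   # descending into an array a second time (so cycles terminate)
   sub count_refs {
      my ($value, $count) = @_;
      return unless ref $value eq 'ARRAY';
      return if $count->{ refaddr($value) }++;
      count_refs($_, $count) for @$value;
   }

   # pass 2: emit diagnostic notation, assigning indices in stream order
   sub to_diag {
      my ($value, $count, $index) = @_;
      return qq("$value") unless ref $value eq 'ARRAY';

      my $addr = refaddr($value);
      return "29($index->{$addr})" if exists $index->{$addr};

      my $shared = $count->{$addr} > 1;
      if ($shared) {
         # the index is assigned before the elements are emitted,
         # so elements can refer back to the enclosing array
         my $next = keys %$index;
         $index->{$addr} = $next;
      }

      my $body = "[" . join(", ", map { to_diag($_, $count, $index) } @$value) . "]";
      return $shared ? "28($body)" : $body;
   }

   sub diag {
      my ($data) = @_;
      count_refs($data, \my %count);
      return to_diag($data, \%count, {});
   }

   my $shared = [];
   print diag([ $shared, $shared, [] ]), "\n";   # [28([]), 29(0), []]

   my $cycle = [];
   $cycle->[0] = $cycle;
   print diag($cycle), "\n";                     # 28([29(0)])

Because an index is handed out at the moment the shareable tag is
emitted, the numbering follows the order of the tags in the stream,
including for nested shareable values.
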
There is no requirement to actually refer to a value marked as
shareable - encoders can mark any value they want without ever
referring to it.

Implementors are advised that, to support cyclic structures,
it must be possible to refer to a value before it is completely
decoded. For example, during decoding of a map, some entries can
refer to the map being decoded. Thus an implementation cannot decode
a shareable value and then record it for later references - it has to
record the reference before decoding the value.

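A decoder that records a shareable value before descending into it
might look like the following sketch. It is deliberately tiny and makes
several assumptions: it only understands unsigned integers and arrays
whose value or length is below 24, plus the two-byte tag forms C<d8 1c>
and C<d8 1d>, and the function name C<decode_item> is made up for this
illustration.

   use strict;
   use warnings;

   # $bytes and $pos are references to the input string and the read
   # offset, $shareable is the list of values marked so far, $mark is
   # true when the item being decoded is directly enclosed in tag 28
   sub decode_item {
      my ($bytes, $pos, $shareable, $mark) = @_;
      die "truncated input" if $$pos >= length $$bytes;
      my $byte = ord substr($$bytes, $$pos++, 1);

      if ($byte == 0xd8) {                      # tag, number in next byte
         die "truncated input" if $$pos >= length $$bytes;
         my $tag = ord substr($$bytes, $$pos++, 1);
         return decode_item($bytes, $pos, $shareable, 1) if $tag == 28;
         die "unsupported tag $tag" unless $tag == 29;

         my $index = decode_item($bytes, $pos, $shareable, 0);
         die "invalid sharedref" if ref $index or $index >= @$shareable;
         return $shareable->[$index];
      }

      my ($major, $info) = ($byte >> 5, $byte & 0x1f);
      die "unsupported encoding" if $info > 23;

      if ($major == 0) {                        # unsigned integer
         push @$shareable, $info if $mark;
         return $info;
      }

      if ($major == 4) {                        # array of $info items
         my $array = [];
         # record the array BEFORE decoding its elements, so that
         # a sharedref inside the array can already refer to it
         push @$shareable, $array if $mark;
         push @$array, decode_item($bytes, $pos, $shareable, 0) for 1 .. $info;
         return $array;
      }

      die "unsupported major type $major";
   }

   my $bytes = pack "H*", "d81c81d81d00";       # 28([29(0)])
   my $pos   = 0;
   my $data  = decode_item(\$bytes, \$pos, [], 0);
   print "cyclic\n" if $data->[0] == $data;     # element 0 is the array itself

Note that in Perl a cyclic result like this is not freed automatically
by reference counting; the caller has to break or weaken the cycle when
the structure is no longer needed.
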
This can be handled in a variety of ways. For example, in Perl, values
not explicitly referenced (hash, array, scalar ref) cannot (normally)
be shared, so a Perl implementation only has to handle explicit
references. Other shared values will usually become unshared, for
efficiency reasons.

Implementations that do not support sharing can duplicate the values
after decoding, or they can use fix-up lists to fix shared references
after decoding.

=head2 IMPLEMENTATION NOTE

The semantics of shareable/sharedref tags require the decoder to be
aware of, and the encoder to be in control of, the sequence in which data
items are encoded into the CBOR stream. This means these tags cannot be
implemented on top of every generic CBOR encoder/decoder (which might
reorder entries in a map); they typically need to be integrated into the
encoder/decoder itself.

=head1 EXAMPLES

=head2 A simple shared array

The following Perl fragment creates an array reference with three entries,
all of which are array references themselves. All of them contain the same
data, but the first two actually reference the same (shared) data
structure:

   $data = [ ([]) x 2, [] ];

This is another way to create it:

   my $shared = [];
   $data = [ $shared, $shared, [] ];

The shared aspect means that setting an element of the first nested
arrayref also makes it visible inside the second nested arrayref, as it is
the same array:

   $data->[0][0] = "test";
   # results in: [["test"],["test"],[]]

A standard CBOR en-/decoder will encode three separate arrays, which
will decode into three separate arrays again. So when the original data
structure is encoded and then decoded, the arrays will be unshared:

   $unshared = decode_cbor encode_cbor [ ([]) x 2, [] ];
   $unshared->[0][0] = "test";
   # results in: [["test"],[],[]]

The CBOR encoding might be:

   83      # array(3)
      80   # array(0)
      80   # array(0)
      80   # array(0)

The value-sharing extension allows an encoder to flag the two arrays and
keep the shared arrays actually shared. For example, the CBOR::XS encoder,
when configured to use the value sharing extension, will emit this CBOR
value:

   [28([]), 29(0), []]

   83         # array(3)
      d8 1c   # tag(28)
         80   # array(0)
      d8 1d   # tag(29)
         00   # unsigned(0)
      80      # array(0)

When decoding it, the first two array references will point to the same array.

=head2 A cyclic data structure

The following cyclic Perl data structure references itself from within
itself. Here a decoder will see a reference to the shared value I<before>
it has completely decoded the shared value:

   $data = [];
   $data->[0] = $data; # make the first array element refer to the array

This data structure is not representable in standard CBOR. Using the value
sharing extension, it can be encoded as follows:

   28([29(0)])

   d8 1c         # tag(28)
      81         # array(1)
         d8 1d   # tag(29)
            00   # unsigned(0)

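The byte sequences in both examples can be reproduced by hand. In CBOR,
the first byte of a data item holds the major type in its upper three
bits and additional information in its lower five bits; the value 24 in
the lower bits means that the actual argument follows in the next byte.
The sketch below builds both encodings with C<pack>. The helper names
C<head>, C<uint>, C<array> and C<tag> are made up for this illustration,
and only arguments below 256 are handled.

   use strict;
   use warnings;

   # initial byte(s) of a data item: major type (0..7), argument (0..255)
   sub head {
      my ($major, $arg) = @_;
      return $arg < 24
         ? pack("C",  $major << 5 | $arg)
         : pack("CC", $major << 5 | 24, $arg);
   }

   sub uint  { head(0, $_[0]) }                     # unsigned integer
   sub array { head(4, scalar @_) . join "", @_ }   # array of encoded items
   sub tag   { head(6, $_[0]) . $_[1] }             # tag number, tagged item

   my $shared = array(tag(28, array()), tag(29, uint(0)), array());
   my $cyclic = tag(28, array(tag(29, uint(0))));

   print unpack("H*", $shared), "\n";   # 83d81c80d81d0080
   print unpack("H*", $cyclic), "\n";   # d81c81d81d00

Tags 28 and 29 are both above 23, which is why each of them occupies two
bytes (C<d8> followed by the tag number), while the short arrays and the
index 0 fit into a single byte.
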