1 | =head1 REGISTRATION INFORMATION |
1 | =head1 REGISTRATION INFORMATION |
2 | |
2 | |
3 | Tag <unassigned> (shareable) |
3 | Tag 28 (shareable) |
4 | Data Item multiple |
4 | Data Item multiple |
5 | Semantics mark value as (potentially) shared |
5 | Semantics mark value as (potentially) shared |
6 | Reference http://cbor.schmorp.de/value-sharing |
6 | Reference http://cbor.schmorp.de/value-sharing |
7 | Contact Marc A. Lehmann <cbor@schmorp.de> |
7 | Contact Marc A. Lehmann <cbor@schmorp.de> |
8 | |
8 | |
9 | Tag <unassigned> (sharedref) |
9 | Tag 29 (sharedref) |
10 | Data Item unsigned integer |
10 | Data Item unsigned integer |
11 | Semantics reference nth marked value |
11 | Semantics reference nth marked value |
12 | Reference http://cbor.schmorp.de/value-sharing |
12 | Reference http://cbor.schmorp.de/value-sharing |
13 | Contact Marc A. Lehmann <cbor@schmorp.de> |
13 | Contact Marc A. Lehmann <cbor@schmorp.de> |
14 | |
14 | |
… | |
… | |
34 | encoding duplicated values only once - the shared values are supposed |
34 | encoding duplicated values only once - the shared values are supposed |
35 | to refer to the same value after decoding (e.g. when implemented as a |
35 | to refer to the same value after decoding (e.g. when implemented as a |
36 | pointer, all references to the value should point to the same memory |
36 | pointer, all references to the value should point to the same memory |
37 | object). |
37 | object). |
38 | |
38 | |
|
|
39 | Space saving is a side effect of encoding data structures with large |
|
|
40 | shared substructures using this extension - the CBOR representation will |
|
|
41 | then have similar space requirements as the original data structure. |
|
|
42 | |
39 | =head1 DESCRIPTION |
43 | =head1 DESCRIPTION |
40 | |
44 | |
41 | To share values, the first occurrence of the value must be explicitly |
45 | To share values, the first occurrence of the value must be explicitly |
42 | tagged with the shareable tag (value <unassigned>). |
46 | tagged with the shareable tag (value C<28>). |
43 | |
47 | |
44 | Subsequent occurrences can then be encoded by encoding the index |
48 | Subsequent occurrences can then be encoded by encoding the index |
45 | of a previously marked value tagged with the sharedref tag (value |
49 | of a previously marked value tagged with the sharedref tag (value |
46 | <unassigned>). That is, index 0 refers to the first value marked as |
50 | C<29>). That is, index 0 refers to the first value marked as |
47 | shareable in the CBOR stream, index 1 to the second and so on. |
51 | shareable in the CBOR stream, index 1 to the second and so on. |
48 | |
52 | |
49 | There is no requirement to actually refer to a value marked as |
53 | There is no requirement to actually refer to a value marked as |
50 | shareable - encoders can mark any value they want without ever |
54 | shareable - encoders can mark any value they want without ever |
51 | referring to them. |
55 | referring to them. |
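The marking rules above can be sketched from the encoder's side. The following Perl fragment is only an illustration under stated assumptions: the helper names C<count_refs>, C<to_diag> and C<share_diag> are hypothetical and are not part of CBOR::XS, only nested array references with text-string leaves are handled, and the output is CBOR diagnostic notation rather than bytes. It marks a value as shareable only when it is reachable more than once, and hands out indices in the order in which the shareable tags are emitted:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Pass 1: count how often each array reference is reachable.
sub count_refs {
    my ($value, $count) = @_;
    return unless ref($value) eq 'ARRAY';
    # Already visited: count the extra occurrence, but do not descend again
    # (this also stops the walk on cyclic structures).
    return if $count->{ refaddr($value) }++;
    count_refs($_, $count) for @$value;
}

# Pass 2: emit diagnostic notation, tagging the first occurrence of a
# multiply-referenced array with 28 and later occurrences with 29(index).
sub to_diag {
    my ($value, $count, $index) = @_;
    return qq("$value") unless ref($value) eq 'ARRAY';   # leaves: text strings only

    my $addr = refaddr($value);
    return "29($index->{$addr})" if exists $index->{$addr};

    my $shared = $count->{$addr} > 1;
    if ($shared) {
        # The index is assigned when the tag is emitted, i.e. BEFORE the
        # contents are encoded, so a value can refer to itself.
        my $next = scalar keys %$index;
        $index->{$addr} = $next;
    }
    my $body = '[' . join(', ', map { to_diag($_, $count, $index) } @$value) . ']';
    return $shared ? "28($body)" : $body;
}

# Hypothetical convenience wrapper around the two passes.
sub share_diag {
    my ($data) = @_;
    my (%count, %index);
    count_refs($data, \%count);
    return to_diag($data, \%count, \%index);
}

my $shared = [];
my $diag_shared = share_diag([ $shared, $shared, [] ]);
print "$diag_shared\n";    # [28([]), 29(0), []]

my $cycle = [];
$cycle->[0] = $cycle;
my $diag_cycle = share_diag($cycle);
print "$diag_cycle\n";     # 28([29(0)])
```

Counting first is a design choice of this sketch, not a requirement: as stated above, an encoder may also mark values that are never referred to.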
… | |
… | |
70 | |
74 | |
71 | The semantics of shareable/sharedref tags require the decoder to be |
75 | The semantics of shareable/sharedref tags require the decoder to be |
72 | aware of, and the encoder to be in control of, the sequence in which data |
76 | aware of, and the encoder to be in control of, the sequence in which data |
73 | items are encoded into the CBOR stream. This means these tags cannot be |
77 | items are encoded into the CBOR stream. This means these tags cannot be |
74 | implemented on top of every generic CBOR encoder/decoder (which might |
78 | implemented on top of every generic CBOR encoder/decoder (which might |
75 | reorder entries in a map); they need to be integrated into their works. |
79 | reorder entries in a map); they typically need to be integrated into their |
|
|
80 | works. |
76 | |
81 | |
77 | =head1 EXAMPLES |
82 | =head1 EXAMPLES |
78 | |
83 | |
79 | <TBD> |
84 | =head2 A simple shared array |
80 | |
85 | |
|
|
86 | The following Perl fragment creates an array reference with three entries, |
|
|
87 | all of which are array references themselves. All of them contain the same |
|
|
88 | data, but the first two actually reference the same (shared) data |
|
|
89 | structure: |
|
|
90 | |
|
|
91 | $data = [ ([]) x 2, [] ]; |
|
|
92 | |
|
|
93 | This is another way to create it: |
|
|
94 | |
|
|
95 | my $shared = []; |
|
|
96 | $data = [ $shared, $shared, [] ]; |
|
|
97 | |
|
|
98 | The shared aspect means that setting an element of the first nested |
|
|
99 | arrayref also makes it visible inside the second nested arrayref, as it is |
|
|
100 | the same array: |
|
|
101 | |
|
|
102 | $data->[0][0] = "test"; |
|
|
103 | # results in: [["test"],["test"],[]] |
|
|
104 | |
|
|
105 | A standard CBOR en-/decoder will encode three separate arrays, which |
|
|
106 | will decode into three separate arrays again. So when the original data |
|
|
107 | structure is en- and then decoded, the arrays will be unshared: |
|
|
108 | |
|
|
109 | $unshared = decode_cbor encode_cbor [ ([]) x 2, [] ]; |
|
|
110 | $unshared->[0][0] = "test"; |
|
|
111 | # results in: [["test"],[],[]] |
|
|
112 | |
|
|
113 | The CBOR encoding might be: |
|
|
114 | |
|
|
115 | 83 # array(3) |
|
|
116 | 80 # array(0) |
|
|
117 | 80 # array(0) |
|
|
118 | 80 # array(0) |
|
|
119 | |
|
|
120 | The value-sharing extension allows an encoder to flag the two arrays and |
|
|
121 | keep the shared arrays actually shared. For example, the CBOR::XS encoder, |
|
|
122 | when configured to use the value sharing extension, will emit this CBOR |
|
|
123 | value: |
|
|
124 | |
|
|
125 | [28([]), 29(0), []] |
|
|
126 | |
|
|
127 | 83 # array(3) |
|
|
128 | d8 1c # tag(28) |
|
|
129 | 80 # array(0) |
|
|
130 | d8 1d # tag(29) |
|
|
131 | 00 # unsigned(0) |
|
|
132 | 80 # array(0) |
|
|
133 | |
|
|
134 | When decoding it, the first two array references will point to the same array. |
|
|
135 | |
|
|
136 | =head2 A cyclic data structure |
|
|
137 | |
|
|
138 | The following cyclic Perl data structure references itself from within |
|
|
139 | itself. Here a decoder will see a reference to the shared value I<before> |
|
|
140 | it has completely decoded the shared value: |
|
|
141 | |
|
|
142 | $data = []; |
|
|
143 | $data->[0] = $data; # make the first array element refer to the array |
|
|
144 | |
|
|
145 | This data structure is not representable in standard CBOR. Using the value |
|
|
146 | sharing extension, it can be encoded as follows: |
|
|
147 | |
|
|
148 | 28([29(0)]) |
|
|
149 | |
|
|
150 | d8 1c # tag(28) |
|
|
151 | 81 # array(1) |
|
|
152 | d8 1d # tag(29) |
|
|
153 | 00 # unsigned(0) |
|
|
154 | |
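Both examples can be checked with a small decoder. The following Perl fragment is a minimal sketch, assuming only the byte patterns used in the examples: small unsigned integers, short arrays and tags with a one-byte tag number. The names C<decode_item> and C<decode_hex> are hypothetical and are not the CBOR::XS API. The point it illustrates is that a shareable value has to be entered into the table I<before> its contents are decoded, otherwise the cyclic example could not resolve its own index:

```perl
use strict;
use warnings;

# Decode one data item from @$bytes, starting at position $$pos.
# $shareables collects marked values in stream order; $mark is true when
# the item being decoded was wrapped in tag 28 (shareable).
sub decode_item {
    my ($bytes, $pos, $shareables, $mark) = @_;
    my $byte  = $bytes->[ $$pos++ ];
    my $major = $byte >> 5;
    my $info  = $byte & 0x1f;

    if ($major == 0 && $info < 24) {            # small unsigned integer
        push @$shareables, $info if $mark;
        return $info;
    }
    if ($major == 4 && $info < 24) {            # short array
        my $array = [];
        push @$shareables, $array if $mark;     # register BEFORE decoding elements
        push @$array, decode_item($bytes, $pos, $shareables) for 1 .. $info;
        return $array;
    }
    if ($byte == 0xd8) {                        # tag with a one-byte tag number
        my $tag = $bytes->[ $$pos++ ];
        return decode_item($bytes, $pos, $shareables, 1) if $tag == 28;
        if ($tag == 29) {
            my $index = decode_item($bytes, $pos, $shareables);
            die "sharedref index $index out of range" unless $index < @$shareables;
            return $shareables->[$index];       # the very same value, not a copy
        }
    }
    die sprintf "unsupported initial byte 0x%02x", $byte;
}

# Hypothetical helper: decode a hex dump such as the ones shown above.
sub decode_hex {
    my ($hex) = @_;
    my @bytes = map { hex($_) } $hex =~ /([0-9a-f]{2})/gi;
    my $pos = 0;
    return decode_item(\@bytes, \$pos, []);
}

# [28([]), 29(0), []]
my $shared_demo = decode_hex('83 d81c 80 d81d 00 80');
$shared_demo->[0][0] = "test";
print scalar(@{ $shared_demo->[1] }), "\n";     # 1: the second entry sees the change

# 28([29(0)])
my $cyclic_demo = decode_hex('d81c 81 d81d 00');
print $cyclic_demo->[0] == $cyclic_demo ? "cyclic\n" : "not cyclic\n";   # cyclic
```

A decoder that first built the array and only then registered it would fail on the second input with an out-of-range index, which is why the registration happens at the point where the array is created.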