… | |
… | |
6 | protocol which is used to authenticate tunnels and send encrypted data |
6 | protocol which is used to authenticate tunnels and send encrypted data |
7 | packets. This protocol is described in more detail in the second part of this |
7 | packets. This protocol is described in more detail in the second part of this |
8 | document. |
8 | document. |
9 | |
9 | |
10 | The first part of this document describes the transport protocols which |
10 | The first part of this document describes the transport protocols which |
11 | are used by GVPE to send it's data packets over the network. |
11 | are used by GVPE to send its data packets over the network. |
12 | |
12 | |
13 | =head1 PART 1: Transport protocols |
13 | =head1 PART 1: Transport protocols |
14 | |
14 | |
15 | GVPE offers a wide range of transport protocols that can be used to |
15 | GVPE offers a wide range of transport protocols that can be used to |
16 | interchange data between nodes. Protocols differ in their overhead, speed, |
16 | interchange data between nodes. Protocols differ in their overhead, speed, |
… | |
… | |
54 | It should be used if RAW IP is not available. |
54 | It should be used if RAW IP is not available. |
55 | |
55 | |
56 | =head2 TCP |
56 | =head2 TCP |
57 | |
57 | |
58 | This protocol is a very bad choice, as it not only has high overhead (more |
58 | This protocol is a very bad choice, as it not only has high overhead (more |
59 | than 60 bytes), but the transport also retries on it's own, which leads |
59 | than 60 bytes), but the transport also retries on its own, which leads |
60 | to congestion when the link has moderate packet loss (as both the TCP |
60 | to congestion when the link has moderate packet loss (as both the TCP |
61 | transport and the tunneled traffic will retry, increasing congestion more |
61 | transport and the tunneled traffic will retry, increasing congestion more |
62 | and more). It also has high latency and is quite inefficient. |
62 | and more). It also has high latency and is quite inefficient. |
63 | |
63 | |
64 | It's only useful when tunneling through firewalls that block better |
64 | It's only useful when tunneling through firewalls that block better |
… | |
… | |
142 | |
142 | |
143 | RAND is a sequence of fully random bytes, used to increase the entropy of |
143 | RAND is a sequence of fully random bytes, used to increase the entropy of |
144 | the data for encryption purposes. |
144 | the data for encryption purposes. |
145 | |
145 | |
146 | SEQNO is a 32-bit sequence number. It is negotiated at every connection |
146 | SEQNO is a 32-bit sequence number. It is negotiated at every connection |
147 | initialization and starts at some random 31 bit value. VPE currently uses |
147 | initialization and starts at some random 31 bit value. GVPE currently uses |
148 | a sliding window of 512 packets/sequence numbers to detect reordering, |
148 | a sliding window of 512 packets/sequence numbers to detect reordering, |
149 | duplication and replay attacks. |
149 | duplication and replay attacks. |
150 | |
150 | |
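The sliding-window check described above can be sketched as follows. This is a minimal illustration, not GVPE's actual code: the class name `ReplayWindow`, its method names, and the bitmap representation are assumptions made for the example. Only the window size of 512 and the 32-bit sequence number come from the text.

```python
class ReplayWindow:
    """Illustrative sliding-window replay detector (hypothetical, not GVPE's code)."""

    def __init__(self, first_seqno, size=512):
        self.size = size
        self.mask = (1 << size) - 1
        # Treat everything before the negotiated start value as already seen.
        self.latest = (first_seqno - 1) & 0xFFFFFFFF
        self.bitmap = self.mask  # bit i set => (latest - i) was already seen

    def accept(self, seqno):
        """Return True if seqno is new and inside the window, else False."""
        ahead = (seqno - self.latest) & 0xFFFFFFFF  # 32-bit modular distance
        if ahead == 0:
            return False  # exact duplicate of the newest packet
        if ahead < 0x80000000:
            # Packet is newer: slide the window forward.
            if ahead >= self.size:
                self.bitmap = 1
            else:
                self.bitmap = ((self.bitmap << ahead) | 1) & self.mask
            self.latest = seqno
            return True
        behind = (self.latest - seqno) & 0xFFFFFFFF
        if behind >= self.size:
            return False  # too old, outside the window
        if self.bitmap & (1 << behind):
            return False  # duplicate or replay
        self.bitmap |= 1 << behind  # reordered but new
        return True
```

A reordered packet that is still inside the window is accepted once; a second copy of it, or any packet older than the window, is rejected.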
|
|
151 | The encryption is done on RAND+SEQNO+DATA in CBC mode with zero IV (or, |
|
|
152 | equivalently, the IV is RAND+SEQNO, encrypted with the block cipher, |
|
|
153 | unless RAND size is decreased or increased over the default value). |
|
|
154 | |
|
|
155 | The random prefix itself is generated by using AES in CTR mode with a |
|
|
156 | random key and starting value, which should make them unpredictable even |
|
|
157 | before encrypting them again. The sequence number additionally ensures |
|
|
158 | that the IV is unique. |
|
|
159 | |
151 | =head2 The authentication protocol |
160 | =head2 The authentication/key exchange protocol |
152 | |
161 | |
153 | Before nodes can exchange packets, they need to establish authenticity of |
162 | Before nodes can exchange packets, they need to establish authenticity of |
154 | the other side and a key. Every node has a private RSA key and the public |
163 | the other side and a key. Every node has a private RSA key and the public |
155 | RSA keys of all other nodes. |
164 | RSA keys of all other nodes. |
156 | |
165 | |
157 | A host establishes a simplex connection by sending the other node an RSA |
166 | When a node wants to establish a connection to another node, it sends an |
158 | encrypted challenge containing a random challenge (consisting of the |
167 | RSA-OAEP-encrypted challenge and an ECDH (curve25519) key. The other node |
159 | encryption and authentication keys to use when sending packets, more |
168 | replies with its own ECDH key and a HKDF of the challenge and both ECDH |
160 | random data and PKCS1_OAEP padding) and a random 16 byte "challenge-id" |
169 | keys to prove its identity. |
161 | (used to detect duplicate auth packets). The destination node will respond |
|
|
162 | by replying with an (unencrypted) hash of the decrypted challenge, which |
|
|
163 | will authenticate that node. The destination node will also set the |
|
|
164 | outgoing encryption parameters as given in the packet. |
|
|
165 | |
170 | |
166 | When the source node receives a correct auth reply (by verifying the |
171 | The remote node engages in exactly the same protocol. When both nodes |
167 | hash and the id, which will expire after 120 seconds), it will start to |
172 | have exchanged their challenge and verified the response, they calculate a |
168 | accept data packets from the destination node. |
173 | cipher key and a HMAC key and start exchanging data packets. |
169 | |
174 | |
170 | This means that a node can only initiate a simplex connection, telling the |
175 | In detail, the challenge consists of: |
171 | other side the key it has to use when it sends packets. The challenge |
|
|
172 | reply is only used to set the current IP address of the other side and |
|
|
173 | protocol parameters. |
|
|
174 | |
176 | |
175 | This protocol is completely symmetric, so to be able to send packets the |
177 | RSA-OAEP (SEQNO MAC CIPHER SALT EXTRA-AUTH) ECDH1 |
176 | destination node must send a challenge in the exact same way as already |
178 | |
177 | described (so, in essence, two simplex connections are created per node |
179 | That is, it encrypts (with the public key of the remote node) an initial |
178 | pair). |
180 | sequence number for data packets, key material for the HMAC key, key |
|
|
181 | material for the cipher key, a salt used by the HKDF (as shown later) and |
|
|
182 | some extra random bytes that are unused except for authentication. It also |
|
|
183 | sends the public key of a curve25519 exchange. |
|
|
184 | |
|
|
185 | The remote node decrypts the RSA data, generates its own ECDH key (ECDH2), |
|
|
186 | and replies with: |
|
|
187 | |
|
|
188 | HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE) ECDH2 |
|
|
189 | |
|
|
190 | That is, it extracts from the decrypted RSA challenge, using its ECDH |
|
|
191 | key as salt, and then expands using the requesting node's ECDH1 key. The |
|
|
192 | resulting hash is returned as a proof that the node could decrypt the RSA |
|
|
193 | challenge data, together with the ECDH key. |
|
|
194 | |
|
|
195 | After both nodes have done this to each other, they calculate the shared |
|
|
196 | ECDH secret, cipher and HMAC keys for the session (each node generates two |
|
|
197 | cipher and HMAC keys, one for sending and one for receiving). |
|
|
198 | |
|
|
199 | The HMAC key for sending is generated as follows: |
|
|
200 | |
|
|
201 | HMAC_KEY = HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, HMAC_MD_SIZE) |
|
|
202 | |
|
|
203 | It extracts from MAC and ECDH_SECRET using the I<remote> SALT, then |
|
|
204 | expands using a static info string. |
|
|
205 | |
|
|
206 | The cipher key is generated in the same way, except using the CIPHER part |
|
|
207 | of the original challenge. |
|
|
208 | |
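The two HKDF constructions in this section (the auth reply and the session keys) can be sketched with the Python standard library. This is a hedged illustration only: the digest (SHA-256), the function names `auth_proof` and `session_key`, and reading "MAC ECDH_SECRET" as plain byte concatenation are assumptions made for the example, not statements about GVPE's implementation. HKDF itself follows RFC 5869.

```python
import hmac

HASH = "sha256"  # assumption: the actual digest is not stated in this section


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, input key material)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch PRK into `length` output bytes."""
    digest_size = hmac.new(prk, b"", HASH).digest_size
    if length > 255 * digest_size:
        raise ValueError("requested output too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def auth_proof(rsa_challenge: bytes, ecdh1: bytes, ecdh2: bytes, size: int) -> bytes:
    """HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE)."""
    return hkdf_expand(hkdf_extract(ecdh2, rsa_challenge), ecdh1, size)


def session_key(remote_salt: bytes, key_material: bytes,
                ecdh_secret: bytes, info: bytes, size: int) -> bytes:
    """HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, size).

    Pass the MAC part of the challenge for the HMAC key and the CIPHER
    part for the cipher key; `info` is the static info string.
    """
    return hkdf_expand(hkdf_extract(remote_salt, key_material + ecdh_secret), info, size)
```

When checking a received auth reply against a locally computed `auth_proof`, a constant-time comparison such as `hmac.compare_digest` is the usual choice.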
|
|
209 | The result of this process is to authenticate each node to the other |
|
|
210 | node, while exchanging keys using both RSA and ECDH, the latter providing |
|
|
211 | perfect forward secrecy. |
|
|
212 | |
|
|
213 | The protocol has been overdesigned where this was possible without |
|
|
214 | increasing implementation complexity, in an attempt to protect against |
|
|
215 | implementation or protocol failures. For example, if the ECDH challenge |
|
|
216 | was found to be flawed, perfect forward secrecy would be lost, but the |
|
|
217 | data would likely still be protected. Likewise, standard algorithms and |
|
|
218 | implementations are used where possible. |
179 | |
219 | |
180 | =head2 Retrying |
220 | =head2 Retrying |
181 | |
221 | |
182 | When there is no response to an auth request, the node will send auth |
222 | When there is no response to an auth request, the node will send auth |
183 | requests in bursts with an exponential back-off. After some time it will |
223 | requests in bursts with an exponential back-off. After some time it will |
… | |
… | |
230 | |
270 | |
231 | =item Failing all that, the packet will be dropped. |
271 | =item Failing all that, the packet will be dropped. |
232 | |
272 | |
233 | =back |
273 | =back |
234 | |
274 | |
235 | A host can usually declare itself unreachable directly by setting it's |
275 | A host can usually declare itself unreachable directly by setting its |
236 | port number(s) to zero. It can declare other hosts as unreachable by using |
276 | port number(s) to zero. It can declare other hosts as unreachable by using |
237 | a config-file that disables all protocols for these other hosts. Another |
277 | a config-file that disables all protocols for these other hosts. Another |
238 | option is to disable all protocols on that host in the other config files. |
278 | option is to disable all protocols on that host in the other config files. |
239 | |
279 | |
240 | If two hosts cannot connect to each other because their IP address(es) |
280 | If two hosts cannot connect to each other because their IP address(es) |
… | |
… | |
246 | connection to the other peer, which is usually possible even when both |
286 | connection to the other peer, which is usually possible even when both |
247 | hosts are behind a NAT gateway. |
287 | hosts are behind a NAT gateway. |
248 | |
288 | |
249 | Routing via other nodes works because the SRCDST field is not encrypted, |
289 | Routing via other nodes works because the SRCDST field is not encrypted, |
250 | so the router can just forward the packet to the destination host. Since |
290 | so the router can just forward the packet to the destination host. Since |
251 | each host uses it's own private key, the router will not be able to |
291 | each host uses its own private key, the router will not be able to |
252 | decrypt or encrypt packets, it will just act as a simple router and |
292 | decrypt or encrypt packets, it will just act as a simple router and |
253 | protocol translator. |
293 | protocol translator. |
254 | |
294 | |
255 | |
295 | |