1 | =head1 The GNU-VPE Protocol |
1 | =head1 The GNU-VPE Protocols |
|
|
2 | |
|
|
3 | =head1 Overview |
|
|
4 | |
|
|
5 | GVPE can make use of a number of protocols. One of them is the GNU VPE |
|
|
6 | protocol which is used to authenticate tunnels and send encrypted data |
|
|
7 | packets. This protocol is described in more detail the second part of this |
|
|
8 | document. |
|
|
9 | |
|
|
10 | The first part of this document describes the transport protocols which |
|
|
11 | are used by GVPE to send its data packets over the network. |
|
|
12 | |
|
|
13 | =head1 PART 1: Transport protocols |
|
|
14 | |
|
|
15 | GVPE offers a wide range of transport protocols that can be used to |
|
|
16 | interchange data between nodes. Protocols differ in their overhead, speed, |
|
|
17 | reliability, and robustness. |
|
|
18 | |
|
|
19 | The following sections describe each transport protocol in more |
|
|
20 | detail. They are sorted by overhead/efficiency, the most efficient |
|
|
21 | transport is listed first: |
|
|
22 | |
|
|
23 | =head2 RAW IP |
|
|
24 | |
|
|
25 | This protocol is the best choice, performance-wise, as the minimum |
|
|
26 | overhead per packet is only 38 bytes. |
|
|
27 | |
|
|
28 | It works by sending the VPN payload using raw IP frames (using the |
|
|
29 | protocol set by C<ip-proto>). |
|
|
30 | |
|
|
31 | Using raw IP frames has the drawback that many firewalls block "unknown" |
|
|
32 | protocols, so this transport only works if you have full IP connectivity |
|
|
33 | between nodes. |
|
|
34 | |
|
|
35 | =head2 ICMP |
|
|
36 | |
|
|
37 | This protocol offers very low overhead (minimum 42 bytes), and can |
|
|
38 | sometimes tunnel through firewalls when other protocols can not. |
|
|
39 | |
|
|
40 | It works by prepending an ICMP header with type C<icmp-type> and a code |
|
|
41 | of C<255>. The default C<icmp-type> is C<echo-reply>, so the resulting |
|
|
42 | packets look like echo replies, which looks rather strange to network |
|
|
43 | administrators. |
|
|
44 | |
|
|
45 | This transport should only be used if other transports (i.e. raw IP) are |
|
|
46 | not available or undesirable (due to their overhead). |
|
|
47 | |
|
|
48 | =head2 UDP |
|
|
49 | |
|
|
50 | This is a good general choice for the transport protocol as UDP packets |
|
|
51 | tunnel well through most firewalls and routers, and the overhead per |
|
|
52 | packet is moderate (minimum 58 bytes). |
|
|
53 | |
|
|
54 | It should be used if RAW IP is not available. |
|
|
55 | |
|
|
56 | =head2 TCP |
|
|
57 | |
|
|
58 | This protocol is a very bad choice, as it not only has high overhead (more |
|
|
59 | than 60 bytes), but the transport also retries on its own, which leads |
|
|
60 | to congestion when the link has moderate packet loss (as both the TCP |
|
|
61 | transport and the tunneled traffic will retry, increasing congestion more |
|
|
62 | and more). It also has high latency and is quite inefficient. |
|
|
63 | |
|
|
64 | It's only useful when tunneling through firewalls that block better |
|
|
65 | protocols. If a node doesn't have direct internet access but a HTTP proxy |
|
|
66 | that supports the CONNECT method it can be used to tunnel through a web |
|
|
67 | proxy. For this to work, the C<tcp-port> should be C<443> (C<https>), as |
|
|
68 | most proxies do not allow connections to other ports. |
|
|
69 | |
|
|
70 | It is an abuse of the usage a proxy was designed for, so make sure you are |
|
|
71 | allowed to use it for GVPE. |
|
|
72 | |
|
|
73 | This protocol also has server and client sides. If the C<tcp-port> is |
|
|
74 | set to zero, other nodes cannot connect to this node directly. If the |
|
|
75 | C<tcp-port> is non-zero, the node can act both as a client as well as a |
|
|
76 | server. |
|
|
77 | |
|
|
78 | =head2 DNS |
|
|
79 | |
|
|
80 | B<WARNING:> Parsing and generating DNS packets is rather tricky. The code |
|
|
81 | almost certainly contains buffer overflows and other, likely exploitable, |
|
|
82 | bugs. You have been warned. |
|
|
83 | |
|
|
84 | This is the worst choice of transport protocol with respect to overhead |
|
|
85 | (overhead can be 2-3 times higher than the transferred data), and latency |
|
|
86 | (which can be many seconds). Some DNS servers might not be prepared to |
|
|
87 | handle the traffic and drop or corrupt packets. The client also has to |
|
|
88 | constantly poll the server for data, so the client will constantly create |
|
|
89 | traffic even if it doesn't need to transport packets. |
|
|
90 | |
|
|
91 | In addition, the same problems as the TCP transport also plague this |
|
|
92 | protocol. |
|
|
93 | |
|
|
94 | Its only use is to tunnel through firewalls that do not allow direct |
|
|
95 | internet access. Similar to using a HTTP proxy (as the TCP transport |
|
|
96 | does), it uses a local DNS server/forwarder (given by the C<dns-forw-host> |
|
|
97 | configuration value) as a proxy to send and receive data as a client, |
|
|
98 | and an C<NS> record pointing to the GVPE server (as given by the |
|
|
99 | C<dns-hostname> directive). |
|
|
100 | |
|
|
101 | The only good side of this protocol is that it can tunnel through most |
|
|
102 | firewalls mostly undetected, iff the local DNS server/forwarder is sane |
|
|
103 | (which is true for most routers, wireless LAN gateways and nameservers). |
|
|
104 | |
|
|
105 | Fine-tuning needs to be done by editing C<src/vpn_dns.C> directly. |
|
|
106 | |
|
|
107 | =head1 PART 2: The GNU VPE protocol |
|
|
108 | |
|
|
109 | This section, unfortunately, is not yet finished, although the protocol |
|
|
110 | is stable (until bugs in the cryptography are found, which will likely |
|
|
111 | completely change the following description). Nevertheless, it should give |
|
|
112 | you some overview over the protocol. |
2 | |
113 | |
3 | =head2 Anatomy of a VPN packet |
114 | =head2 Anatomy of a VPN packet |
4 | |
115 | |
5 | The exact layout and field lengths of a VPN packet is determined at |
116 | The exact layout and field lengths of a VPN packet is determined at |
6 | compiletime and doesn't change. The same structure is used for all |
117 | compile time and doesn't change. The same structure is used for all |
7 | protocols, be it rawip or tcp. |
118 | transport protocols, be it RAWIP or TCP. |
8 | |
119 | |
9 | +------+------+--------+------+ |
120 | +------+------+--------+------+ |
10 | | HMAC | TYPE | SRCDST | DATA | |
121 | | HMAC | TYPE | SRCDST | DATA | |
11 | +------+------+--------+------+ |
122 | +------+------+--------+------+ |
12 | |
123 | |
13 | The HMAC field is present in all packets, even if not used (e.g. in auth |
124 | The HMAC field is present in all packets, even if not used (e.g. in auth |
14 | request packets), in which case it is set to all zeroes. The checksum |
125 | request packets), in which case it is set to all zeroes. The checksum |
15 | itself is over the TYPE, SRCDST and DATA fields in all cases. |
126 | itself is calculated over the TYPE, SRCDST and DATA fields in all cases. |
16 | |
127 | |
17 | The TYPE field is a single byte and determines the purpose of the packet |
128 | The TYPE field is a single byte and determines the purpose of the packet |
18 | (e.g. RESET, COMPRESSED/UNCOMPRESSED DATA, PING, AUTH REQUEST/RESPONSE, |
129 | (e.g. RESET, COMPRESSED/UNCOMPRESSED DATA, PING, AUTH REQUEST/RESPONSE, |
19 | CONNECT REQUEST/INFO etc.). |
130 | CONNECT REQUEST/INFO etc.). |
20 | |
131 | |
21 | SRCDST is a three byte field which contains the source and destination |
132 | SRCDST is a three byte field which contains the source and destination |
22 | node ids (12 bits each). The protocol does not yet scale well beyond 30+ |
133 | node IDs (12 bits each). |
23 | hosts, since all hosts connect to each other on startup. But if restarts |
|
|
24 | are rare or tolerable and most connections are on demand, larger networks |
|
|
25 | are possible. |
|
|
26 | |
134 | |
27 | The DATA portion differs between each packet type, naturally, and is the |
135 | The DATA portion differs between each packet type, naturally, and is the |
28 | only part that can be encrypted. Data packets contain more fields, as |
136 | only part that can be encrypted. Data packets contain more fields, as |
29 | shown: |
137 | shown: |
30 | |
138 | |
… | |
… | |
34 | |
142 | |
35 | RAND is a sequence of fully random bytes, used to increase the entropy of |
143 | RAND is a sequence of fully random bytes, used to increase the entropy of |
36 | the data for encryption purposes. |
144 | the data for encryption purposes. |
37 | |
145 | |
38 | SEQNO is a 32-bit sequence number. It is negotiated at every connection |
146 | SEQNO is a 32-bit sequence number. It is negotiated at every connection |
39 | initialization and starts at some random 31 bit value. VPE currently uses |
147 | initialization and starts at some random 31 bit value. GVPE currently uses |
40 | a sliding window of 512 packets to detect reordering, duplication and |
148 | a sliding window of 512 packets/sequence numbers to detect reordering, |
41 | reply attacks. |
149 | duplication and replay attacks. |
42 | |
150 | |
|
|
151 | The encryption is done on RAND+SEQNO+DATA in CBC mode with zero IV (or, |
|
|
152 | equivalently, the IV is RAND+SEQNO, encrypted with the block cipher, |
|
|
153 | unless RAND size is decreased or increased over the default value). |
|
|
154 | |
|
|
155 | The random prefix itself is generated by using AES in CTR mode with a |
|
|
156 | random key and starting value, which should make them unpredictable even |
|
|
157 | before encrypting them again. The sequence number additionally ensures |
|
|
158 | that the IV is unique. |
|
|
159 | |
43 | =head2 The authentification protocol |
160 | =head2 The authentication/key exchange protocol |
44 | |
161 | |
45 | Before hosts can exchange packets, they need to establish authenticity of |
162 | Before nodes can exchange packets, they need to establish authenticity of |
46 | the other side and a key. Every host has a private RSA key and the public |
163 | the other side and a key. Every node has a private RSA key and the public |
47 | RSA keys of all other hosts. |
164 | RSA keys of all other nodes. |
48 | |
165 | |
49 | A host establishes a simplex connection by sending the other host a |
166 | When a node wants to establish a connection to another node, it sends an |
50 | RSA encrypted challenge containing a random challenge (consisting of |
167 | RSA-OEAP-encrypted challenge and an ECDH (curve25519) key. The other node |
51 | the encryption key to use when sending packets, more random data and |
168 | replies with its own ECDH key and a HKDF of the challenge and both ECDH |
52 | PKCS1_OAEP padding) and a random 16 byte "challenge-id" (used to detect |
169 | keys to prove its identity. |
53 | duplicate auth packets). The destination host will respond by replying |
|
|
54 | with an (unencrypted) RIPEMD160 hash of the decrypted challenge, which |
|
|
55 | will authentify that host. The destination host will also set the outgoing |
|
|
56 | encryption parameters as given in the packet. |
|
|
57 | |
170 | |
58 | When the source host receives a correct auth reply (by verifying the |
171 | The remote node enganges in exactly the same protocol. When both nodes |
59 | hash and the id, which will expire after 120 seconds), it will start to |
172 | have exchanged their challenge and verified the response, they calculate a |
60 | accept data packets from the destination host. |
173 | cipher key and a HMAC key and start exchanging data packets. |
61 | |
174 | |
62 | This means that a host can only initate a simplex connection, telling the |
175 | In detail, the challenge consist of: |
63 | other side the key it has to use when it sends packets. The challenge |
|
|
64 | reply is only used to set the current IP address and protocol parameters. |
|
|
65 | |
176 | |
66 | The protocol here is completely symmetric, so to be able to send packets |
177 | RSA-OAEP (SEQNO MAC CIPHER SALT EXTRA-AUTH) ECDH1 |
67 | the destination host must send a challenge in the exact same way as |
178 | |
68 | already described (so, in essence, two simplex connections are created per |
179 | That is, it encrypts (with the public key of the remote node) an initial |
69 | host pair). |
180 | sequence number for data packets, key material for the HMAC key, key |
|
|
181 | material for the cipher key, a salt used by the HKDF (as shown later) and |
|
|
182 | some extra random bytes that are unused except for authentication. It also |
|
|
183 | sends the public key of a curve25519 exchange. |
|
|
184 | |
|
|
185 | The remote node decrypts the RSA data, generates its own ECDH key (ECDH2), |
|
|
186 | and replies with: |
|
|
187 | |
|
|
188 | HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE) ECDH2 |
|
|
189 | |
|
|
190 | That is, it extracts from the decrypted RSA challenge, using its ECDH |
|
|
191 | key as salt, and then expands using the requesting node's ECDH1 key. The |
|
|
192 | resulting hash is returned as a proof that the node could decrypt the RSA |
|
|
193 | challenge data, together with the ECDH key. |
|
|
194 | |
|
|
195 | After both nodes have done this to each other, they calculate the shared |
|
|
196 | ECDH secret, cipher and HMAC keys for the session (each node generates two |
|
|
197 | cipher and HMAC keys, one for sending and one for receiving). |
|
|
198 | |
|
|
199 | The HMAC key for sending is generated as follow: |
|
|
200 | |
|
|
201 | HMAC_KEY = HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, HMAC_MD_SIZE) |
|
|
202 | |
|
|
203 | It extracts from MAC and ECDH_SECRET using the I<remote> SALT, then |
|
|
204 | expands using a static info string. |
|
|
205 | |
|
|
206 | The cipher key is generated in the same way, except using the CIPHER part |
|
|
207 | of the original challenge. |
|
|
208 | |
|
|
209 | The result of this process is to authenticate each node to the other |
|
|
210 | node, while exchanging keys using both RSA and ECDH, the latter providing |
|
|
211 | perfect forward secrecy. |
|
|
212 | |
|
|
213 | The protocol has been overdesigned where this was possible without |
|
|
214 | increasing implementation complexity, in an attempt to protect against |
|
|
215 | implementation or protocol failures. For example, if the ECDH challenge |
|
|
216 | was found to be flawed, perfect forward secrecy would be lost, but the |
|
|
217 | data would likely still be protected. Likewise, standard algorithms and |
|
|
218 | implementations are used where possible. |
70 | |
219 | |
71 | =head2 Retrying |
220 | =head2 Retrying |
72 | |
221 | |
73 | When there is no response to an auth request, the host will send auth |
222 | When there is no response to an auth request, the node will send auth |
74 | requests in bursts with an exponential backoff. After some time it will |
223 | requests in bursts with an exponential back-off. After some time it will |
75 | resort to PING packets, which are very small (8 byte) and lightweight (no |
224 | resort to PING packets, which are very small (8 bytes + protocol header) |
76 | RSA operations). A host that receives ping requests from an unconnected |
225 | and lightweight (no RSA operations required). A node that receives ping |
77 | peer will respond by trying to create a connection. |
226 | requests from an unconnected peer will respond by trying to create a |
|
|
227 | connection. |
78 | |
228 | |
79 | In addition to the exponential backoff, there is a global rate-limit on |
229 | In addition to the exponential back-off, there is a global rate-limit on |
80 | a per-ip base. It allows long bursts but will limit total packet rate to |
230 | a per-IP base. It allows long bursts but will limit total packet rate to |
81 | something like one control packet every ten seconds, to avoid accidental |
231 | something like one control packet every ten seconds, to avoid accidental |
82 | floods due to protocol problems (like a rsa key file mismatch between two |
232 | floods due to protocol problems (like a RSA key file mismatch between two |
83 | hosts). |
233 | nodes). |
|
|
234 | |
|
|
235 | The intervals between retries are limited by the C<max-retry> |
|
|
236 | configuration value. A node with C<connect> = C<always> will always retry, |
|
|
237 | a node with C<connect> = C<ondemand> will only try (and re-try) to connect |
|
|
238 | as long as there are packets in the queue, usually this limits the retry |
|
|
239 | period to C<max-ttl> seconds. |
|
|
240 | |
|
|
241 | Sending packets over the VPN will reset the retry intervals as well, which |
|
|
242 | means as long as somebody is trying to send packets to a given node, GVPE |
|
|
243 | will try to connect every few seconds. |
84 | |
244 | |
85 | =head2 Routing and Protocol translation |
245 | =head2 Routing and Protocol translation |
86 | |
246 | |
87 | The gvpe routing algorithm is easy: there isn't any routing. GVPE always |
247 | The GVPE routing algorithm is easy: there isn't much routing to speak |
88 | tries to establish direct connections, if the protocol abilities of the |
248 | of: When routing packets to another node, GVPE tries the following |
89 | two hosts allow it. |
249 | options, in order: |
90 | |
250 | |
|
|
251 | =over 4 |
|
|
252 | |
91 | If the two hosts should be able to reach each other (common protocol, ip |
253 | =item If the two nodes should be able to reach each other directly (common |
92 | and port all known), but cannot (network down), then there will be no |
254 | protocol, port known), then GVPE will send the packet directly to the |
93 | connection, point. |
255 | other node. |
94 | |
256 | |
|
|
257 | =item If this isn't possible (e.g. because the node doesn't have a |
|
|
258 | C<hostname> or known port), but the nodes speak a common protocol and a |
|
|
259 | router is available, then GVPE will ask a router to "mediate" between both |
|
|
260 | nodes (see below). |
|
|
261 | |
|
|
262 | =item If a direct connection isn't possible (no common protocols) or |
|
|
263 | forbidden (C<deny-direct>) and there are any routers, then GVPE will try |
|
|
264 | to send packets to the router with the highest priority that is connected |
|
|
265 | already I<and> is able (as specified by the config file) to connect |
|
|
266 | directly to the target node. |
|
|
267 | |
|
|
268 | =item If no such router exists, then GVPE will simply send the packet to |
|
|
269 | the node with the highest priority available. |
|
|
270 | |
|
|
271 | =item Failing all that, the packet will be dropped. |
|
|
272 | |
|
|
273 | =back |
|
|
274 | |
95 | A host can usually declare itself unreachable directly by setting it's |
275 | A host can usually declare itself unreachable directly by setting its |
96 | port number(s) to zero. It can declare other hosts as unreachable by using |
276 | port number(s) to zero. It can declare other hosts as unreachable by using |
97 | a config-file that disables all protocols for these other hosts. |
277 | a config-file that disables all protocols for these other hosts. Another |
|
|
278 | option is to disable all protocols on that host in the other config files. |
98 | |
279 | |
99 | If two hosts cannot connect to each other because their IP address(es) |
280 | If two hosts cannot connect to each other because their IP address(es) |
100 | are not known (such as dialup hosts), one side will send a connection |
281 | are not known (such as dial-up hosts), one side will send a I<mediated> |
101 | request to a router (routers must be configured to act as routers!), which |
282 | connection request to a router (routers must be configured to act as |
102 | will send both the originating and the destination host a connection info |
283 | routers!), which will send both the originating and the destination host |
103 | request with protocol information and IP address of the other host (if |
284 | a connection info request with protocol information and IP address of the |
104 | known). Both hosts will then try to establish a connection to the other |
285 | other host (if known). Both hosts will then try to establish a direct |
105 | peer, which is usually possible even when both hosts are behind a NAT |
286 | connection to the other peer, which is usually possible even when both |
106 | gateway. |
287 | hosts are behind a NAT gateway. |
107 | |
288 | |
108 | If the hosts cannot reach each other because they have no common protocol, |
289 | Routing via other nodes works because the SRCDST field is not encrypted, |
109 | the originator instead use the router with highest priority and matching |
|
|
110 | protocol as peer. Since the SRCDST field is not encrypted, the router host |
|
|
111 | can just forward the packet to the destination host. Since each host uses |
290 | so the router can just forward the packet to the destination host. Since |
112 | it's own private key, the router will not be able to decrypt or encrypt |
291 | each host uses its own private key, the router will not be able to |
113 | packets, it will just act as a simple router and protocol translator. |
292 | decrypt or encrypt packets, it will just act as a simple router and |
|
|
293 | protocol translator. |
114 | |
294 | |
115 | When no router is connected, the host will aggressively try to connect to |
|
|
116 | all routers, and if a router is asked for an unconnected host it will try |
|
|
117 | to ask another router to establish the connection. |
|
|
118 | |
295 | |
119 | ... more not yet written about the details of the routing, please bug me |
|
|
120 | ... |
|
|
121 | |
|
|