1 |
=head1 The GNU-VPE Protocols |
2 |
|
3 |
=head1 Overview |
4 |
|
5 |
GVPE can make use of a number of protocols. One of them is the GNU VPE |
6 |
protocol which is used to authenticate tunnels and send encrypted data |
7 |
packets. This protocol is described in more detail the second part of this |
8 |
document. |
9 |
|
10 |
The first part of this document describes the transport protocols which |
11 |
are used by GVPE to send its data packets over the network. |
12 |
|
13 |
=head1 PART 1: Transport protocols |
14 |
|
15 |
GVPE offers a wide range of transport protocols that can be used to |
16 |
interchange data between nodes. Protocols differ in their overhead, speed, |
17 |
reliability, and robustness. |
18 |
|
19 |
The following sections describe each transport protocol in more |
20 |
detail. They are sorted by overhead/efficiency, the most efficient |
21 |
transport is listed first: |
22 |
|
23 |
=head2 RAW IP |
24 |
|
25 |
This protocol is the best choice, performance-wise, as the minimum |
26 |
overhead per packet is only 38 bytes. |
27 |
|
28 |
It works by sending the VPN payload using raw IP frames (using the |
29 |
protocol set by C<ip-proto>). |
30 |
|
31 |
Using raw IP frames has the drawback that many firewalls block "unknown" |
32 |
protocols, so this transport only works if you have full IP connectivity |
33 |
between nodes. |
34 |
|
35 |
=head2 ICMP |
36 |
|
37 |
This protocol offers very low overhead (minimum 42 bytes), and can |
38 |
sometimes tunnel through firewalls when other protocols can not. |
39 |
|
40 |
It works by prepending an ICMP header with type C<icmp-type> and a code |
41 |
of C<255>. The default C<icmp-type> is C<echo-reply>, so the resulting |
42 |
packets look like echo replies, which looks rather strange to network |
43 |
administrators. |
44 |
|
45 |
This transport should only be used if other transports (i.e. raw IP) are |
46 |
not available or undesirable (due to their overhead). |
47 |
|
48 |
=head2 UDP |
49 |
|
50 |
This is a good general choice for the transport protocol as UDP packets |
51 |
tunnel well through most firewalls and routers, and the overhead per |
52 |
packet is moderate (minimum 58 bytes). |
53 |
|
54 |
It should be used if RAW IP is not available. |
55 |
|
56 |
=head2 TCP |
57 |
|
58 |
This protocol is a very bad choice, as it not only has high overhead (more |
59 |
than 60 bytes), but the transport also retries on its own, which leads |
60 |
to congestion when the link has moderate packet loss (as both the TCP |
61 |
transport and the tunneled traffic will retry, increasing congestion more |
62 |
and more). It also has high latency and is quite inefficient. |
63 |
|
64 |
It's only useful when tunneling through firewalls that block better |
65 |
protocols. If a node doesn't have direct internet access but a HTTP proxy |
66 |
that supports the CONNECT method it can be used to tunnel through a web |
67 |
proxy. For this to work, the C<tcp-port> should be C<443> (C<https>), as |
68 |
most proxies do not allow connections to other ports. |
69 |
|
70 |
It is an abuse of the usage a proxy was designed for, so make sure you are |
71 |
allowed to use it for GVPE. |
72 |
|
73 |
This protocol also has server and client sides. If the C<tcp-port> is |
74 |
set to zero, other nodes cannot connect to this node directly. If the |
75 |
C<tcp-port> is non-zero, the node can act both as a client as well as a |
76 |
server. |
77 |
|
78 |
=head2 DNS |
79 |
|
80 |
B<WARNING:> Parsing and generating DNS packets is rather tricky. The code |
81 |
almost certainly contains buffer overflows and other, likely exploitable, |
82 |
bugs. You have been warned. |
83 |
|
84 |
This is the worst choice of transport protocol with respect to overhead |
85 |
(overhead can be 2-3 times higher than the transferred data), and latency |
86 |
(which can be many seconds). Some DNS servers might not be prepared to |
87 |
handle the traffic and drop or corrupt packets. The client also has to |
88 |
constantly poll the server for data, so the client will constantly create |
89 |
traffic even if it doesn't need to transport packets. |
90 |
|
91 |
In addition, the same problems as the TCP transport also plague this |
92 |
protocol. |
93 |
|
94 |
Its only use is to tunnel through firewalls that do not allow direct |
95 |
internet access. Similar to using a HTTP proxy (as the TCP transport |
96 |
does), it uses a local DNS server/forwarder (given by the C<dns-forw-host> |
97 |
configuration value) as a proxy to send and receive data as a client, |
98 |
and an C<NS> record pointing to the GVPE server (as given by the |
99 |
C<dns-hostname> directive). |
100 |
|
101 |
The only good side of this protocol is that it can tunnel through most |
102 |
firewalls mostly undetected, iff the local DNS server/forwarder is sane |
103 |
(which is true for most routers, wireless LAN gateways and nameservers). |
104 |
|
105 |
Fine-tuning needs to be done by editing C<src/vpn_dns.C> directly. |
106 |
|
107 |
=head1 PART 2: The GNU VPE protocol |
108 |
|
109 |
This section, unfortunately, is not yet finished, although the protocol |
110 |
is stable (until bugs in the cryptography are found, which will likely |
111 |
completely change the following description). Nevertheless, it should give |
112 |
you some overview over the protocol. |
113 |
|
114 |
=head2 Anatomy of a VPN packet |
115 |
|
116 |
The exact layout and field lengths of a VPN packet is determined at |
117 |
compile time and doesn't change. The same structure is used for all |
118 |
transport protocols, be it RAWIP or TCP. |
119 |
|
120 |
+------+------+--------+------+ |
121 |
| HMAC | TYPE | SRCDST | DATA | |
122 |
+------+------+--------+------+ |
123 |
|
124 |
The HMAC field is present in all packets, even if not used (e.g. in auth |
125 |
request packets), in which case it is set to all zeroes. The MAC itself is |
126 |
calculated over the TYPE, SRCDST and DATA fields in all cases. |
127 |
|
128 |
The TYPE field is a single byte and determines the purpose of the packet |
129 |
(e.g. RESET, COMPRESSED/UNCOMPRESSED DATA, PING, AUTH REQUEST/RESPONSE, |
130 |
CONNECT REQUEST/INFO etc.). |
131 |
|
132 |
SRCDST is a three byte field which contains the source and destination |
133 |
node IDs (12 bits each). |
134 |
|
135 |
The DATA portion differs between each packet type, naturally, and is the |
136 |
only part that can be encrypted. Data packets contain more fields, as |
137 |
shown: |
138 |
|
139 |
+------+------+--------+-------+------+ |
140 |
| HMAC | TYPE | SRCDST | SEQNO | DATA | |
141 |
+------+------+--------+-------+------+ |
142 |
|
143 |
SEQNO is a 32-bit sequence number. It is negotiated at every connection |
144 |
initialization and starts at some random 31 bit value. GVPE currently uses |
145 |
a sliding window of 512 packets/sequence numbers to detect reordering, |
146 |
duplication and replay attacks. |
147 |
|
148 |
The encryption is done on SEQNO+DATA in CTR mode with IV generated from |
149 |
the seqno (for AES: seqno || seqno || seqno || (u32)0), which ensures |
150 |
uniqueness for a given key. |
151 |
|
152 |
=head2 The authentication/key exchange protocol |
153 |
|
154 |
Before nodes can exchange packets, they need to establish authenticity of |
155 |
the other side and a key. Every node has a private RSA key and the public |
156 |
RSA keys of all other nodes. |
157 |
|
158 |
When a node wants to establish a connection to another node, it sends an |
159 |
RSA-OEAP-encrypted challenge and an ECDH (curve25519) key. The other node |
160 |
replies with its own ECDH key and a HKDF of the challenge and both ECDH |
161 |
keys to prove its identity. |
162 |
|
163 |
The remote node enganges in exactly the same protocol. When both nodes |
164 |
have exchanged their challenge and verified the response, they calculate a |
165 |
cipher key and a HMAC key and start exchanging data packets. |
166 |
|
167 |
In detail, the challenge consist of: |
168 |
|
169 |
RSA-OAEP (SEQNO MAC CIPHER SALT EXTRA-AUTH) ECDH1 |
170 |
|
171 |
That is, it encrypts (with the public key of the remote node) an initial |
172 |
sequence number for data packets, key material for the HMAC key, key |
173 |
material for the cipher key, a salt used by the HKDF (as shown later) and |
174 |
some extra random bytes that are unused except for authentication. It also |
175 |
sends the public key of a curve25519 exchange. |
176 |
|
177 |
The remote node decrypts the RSA data, generates its own ECDH key (ECDH2), |
178 |
and replies with: |
179 |
|
180 |
HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE) ECDH2 |
181 |
|
182 |
That is, it extracts from the decrypted RSA challenge, using its ECDH |
183 |
key as salt, and then expands using the requesting node's ECDH1 key. The |
184 |
resulting hash is returned as a proof that the node could decrypt the RSA |
185 |
challenge data, together with the ECDH key. |
186 |
|
187 |
After both nodes have done this to each other, they calculate the shared |
188 |
ECDH secret, cipher and HMAC keys for the session (each node generates two |
189 |
cipher and HMAC keys, one for sending and one for receiving). |
190 |
|
191 |
The HMAC key for sending is generated as follow: |
192 |
|
193 |
HMAC_KEY = HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, HMAC_MD_SIZE) |
194 |
|
195 |
It extracts from MAC and ECDH_SECRET using the I<remote> SALT, then |
196 |
expands using a static info string. |
197 |
|
198 |
The cipher key is generated in the same way, except using the CIPHER part |
199 |
of the original challenge. |
200 |
|
201 |
The result of this process is to authenticate each node to the other |
202 |
node, while exchanging keys using both RSA and ECDH, the latter providing |
203 |
perfect forward secrecy. |
204 |
|
205 |
The protocol has been overdesigned where this was possible without |
206 |
increasing implementation complexity, in an attempt to protect against |
207 |
implementation or protocol failures. For example, if the ECDH challenge |
208 |
was found to be flawed, perfect forward secrecy would be lost, but the |
209 |
data would likely still be protected. Likewise, standard algorithms and |
210 |
implementations are used where possible. |
211 |
|
212 |
=head2 Retrying |
213 |
|
214 |
When there is no response to an auth request, the node will send auth |
215 |
requests in bursts with an exponential back-off. After some time it will |
216 |
resort to PING packets, which are very small (8 bytes + protocol header) |
217 |
and lightweight (no RSA operations required). A node that receives ping |
218 |
requests from an unconnected peer will respond by trying to create a |
219 |
connection. |
220 |
|
221 |
In addition to the exponential back-off, there is a global rate-limit on |
222 |
a per-IP base. It allows long bursts but will limit total packet rate to |
223 |
something like one control packet every ten seconds, to avoid accidental |
224 |
floods due to protocol problems (like a RSA key file mismatch between two |
225 |
nodes). |
226 |
|
227 |
The intervals between retries are limited by the C<max-retry> |
228 |
configuration value. A node with C<connect> = C<always> will always retry, |
229 |
a node with C<connect> = C<ondemand> will only try (and re-try) to connect |
230 |
as long as there are packets in the queue, usually this limits the retry |
231 |
period to C<max-ttl> seconds. |
232 |
|
233 |
Sending packets over the VPN will reset the retry intervals as well, which |
234 |
means as long as somebody is trying to send packets to a given node, GVPE |
235 |
will try to connect every few seconds. |
236 |
|
237 |
=head2 Routing and Protocol translation |
238 |
|
239 |
The GVPE routing algorithm is easy: there isn't much routing to speak |
240 |
of: When routing packets to another node, GVPE tries the following |
241 |
options, in order: |
242 |
|
243 |
=over 4 |
244 |
|
245 |
=item If the two nodes should be able to reach each other directly (common |
246 |
protocol, port known), then GVPE will send the packet directly to the |
247 |
other node. |
248 |
|
249 |
=item If this isn't possible (e.g. because the node doesn't have a |
250 |
C<hostname> or known port), but the nodes speak a common protocol and a |
251 |
router is available, then GVPE will ask a router to "mediate" between both |
252 |
nodes (see below). |
253 |
|
254 |
=item If a direct connection isn't possible (no common protocols) or |
255 |
forbidden (C<deny-direct>) and there are any routers, then GVPE will try |
256 |
to send packets to the router with the highest priority that is connected |
257 |
already I<and> is able (as specified by the config file) to connect |
258 |
directly to the target node. |
259 |
|
260 |
=item If no such router exists, then GVPE will simply send the packet to |
261 |
the node with the highest priority available. |
262 |
|
263 |
=item Failing all that, the packet will be dropped. |
264 |
|
265 |
=back |
266 |
|
267 |
A host can usually declare itself unreachable directly by setting its |
268 |
port number(s) to zero. It can declare other hosts as unreachable by using |
269 |
a config-file that disables all protocols for these other hosts. Another |
270 |
option is to disable all protocols on that host in the other config files. |
271 |
|
272 |
If two hosts cannot connect to each other because their IP address(es) |
273 |
are not known (such as dial-up hosts), one side will send a I<mediated> |
274 |
connection request to a router (routers must be configured to act as |
275 |
routers!), which will send both the originating and the destination host |
276 |
a connection info request with protocol information and IP address of the |
277 |
other host (if known). Both hosts will then try to establish a direct |
278 |
connection to the other peer, which is usually possible even when both |
279 |
hosts are behind a NAT gateway. |
280 |
|
281 |
Routing via other nodes works because the SRCDST field is not encrypted, |
282 |
so the router can just forward the packet to the destination host. Since |
283 |
each host uses its own private key, the router will not be able to |
284 |
decrypt or encrypt packets, it will just act as a simple router and |
285 |
protocol translator. |
286 |
|
287 |
|