=head1 The GNU-VPE Protocols

=head1 Overview

GVPE can make use of a number of protocols. One of them is the GNU VPE
protocol, which is used to authenticate tunnels and send encrypted data
packets. This protocol is described in more detail in the second part of
this document.

The first part of this document describes the transport protocols which
are used by GVPE to send its data packets over the network.

=head1 PART 1: Transport protocols

GVPE offers a wide range of transport protocols that can be used to
interchange data between nodes. Protocols differ in their overhead, speed,
reliability, and robustness.

The following sections describe each transport protocol in more
detail. They are sorted by overhead/efficiency, with the most efficient
transport listed first:

=head2 RAW IP

This protocol is the best choice, performance-wise, as the minimum
overhead per packet is only 38 bytes.

It works by sending the VPN payload using raw IP frames (using the
protocol set by C<ip-proto>).

Using raw IP frames has the drawback that many firewalls block "unknown"
protocols, so this transport only works if you have full IP connectivity
between nodes.

=head2 ICMP

This protocol offers very low overhead (minimum 42 bytes), and can
sometimes tunnel through firewalls when other protocols cannot.

It works by prepending an ICMP header with type C<icmp-type> and a code
of C<255>. The default C<icmp-type> is C<echo-reply>, so the resulting
packets look like echo replies, which look rather strange to network
administrators.

This transport should only be used if other transports (i.e. raw IP) are
not available or undesirable (due to their overhead).

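As a rough illustration of the framing described above, the following
Python sketch prepends a minimal ICMP header to an opaque payload. The
four-byte header layout (type, code, checksum) and the helper names
C<internet_checksum> and C<icmp_wrap> are assumptions made for this
example; this document does not spell out the exact header layout. Only
the code value of C<255> and the C<echo-reply> default are taken from the
text.

```python
import struct

ICMP_ECHO_REPLY = 0   # ICMP type number of an echo reply (RFC 792)
GVPE_ICMP_CODE = 255  # code value named in the text above


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement sum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def icmp_wrap(payload: bytes, icmp_type: int = ICMP_ECHO_REPLY) -> bytes:
    """Prepend an assumed 4-byte ICMP header (type, code, checksum)."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, 0)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, checksum) + payload


packet = icmp_wrap(b"opaque VPN payload")
```

A receiver can check such a packet by running the same checksum over the
whole packet, which yields zero when the packet is intact.
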

=head2 UDP

This is a good general choice for the transport protocol as UDP packets
tunnel well through most firewalls and routers, and the overhead per
packet is moderate (minimum 58 bytes).

It should be used if RAW IP is not available.

=head2 TCP

This protocol is a very bad choice, as it not only has high overhead (more
than 60 bytes), but the transport also retries on its own, which leads
to congestion when the link has moderate packet loss (as both the TCP
transport and the tunneled traffic will retry, increasing congestion more
and more). It also has high latency and is quite inefficient.

It's only useful when tunneling through firewalls that block better
protocols. If a node doesn't have direct internet access but has an HTTP
proxy that supports the CONNECT method, it can be used to tunnel through
the web proxy. For this to work, the C<tcp-port> should be C<443>
(C<https>), as most proxies do not allow connections to other ports.

It is an abuse of the usage a proxy was designed for, so make sure you are
allowed to use it for GVPE.

This protocol also has server and client sides. If the C<tcp-port> is
set to zero, other nodes cannot connect to this node directly. If the
C<tcp-port> is non-zero, the node can act both as a client and as a
server.

=head2 DNS

B<WARNING:> Parsing and generating DNS packets is rather tricky. The code
almost certainly contains buffer overflows and other, likely exploitable,
bugs. You have been warned.

This is the worst choice of transport protocol with respect to overhead
(overhead can be 2-3 times higher than the transferred data), and latency
(which can be many seconds). Some DNS servers might not be prepared to
handle the traffic and drop or corrupt packets. The client also has to
constantly poll the server for data, so the client will constantly create
traffic even if it doesn't need to transport packets.

In addition, the same problems as the TCP transport also plague this
protocol.

Its only use is to tunnel through firewalls that do not allow direct
internet access. Similar to using an HTTP proxy (as the TCP transport
does), it uses a local DNS server/forwarder (given by the C<dns-forw-host>
configuration value) as a proxy to send and receive data as a client,
and an C<NS> record pointing to the GVPE server (as given by the
C<dns-hostname> directive).

The only good side of this protocol is that it can tunnel through most
firewalls mostly undetected, iff the local DNS server/forwarder is sane
(which is true for most routers, wireless LAN gateways and nameservers).

Fine-tuning needs to be done by editing C<src/vpn_dns.C> directly.

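To put the minimum per-packet overheads quoted in this part into
perspective, the following Python sketch computes which share of the
transmitted bytes is tunneled payload. The dictionary keys, the function
name and the two payload sizes are arbitrary choices for this example.
The TCP entry uses 60 bytes, which the text gives only as a lower bound,
and DNS is left out because no fixed figure is given for it.

```python
# Minimum per-packet overhead in bytes, as quoted in the sections above.
MIN_OVERHEAD = {"rawip": 38, "icmp": 42, "udp": 58, "tcp": 60}


def payload_fraction(transport: str, payload_len: int) -> float:
    """Share of payload plus minimum overhead that is tunneled payload."""
    return payload_len / (payload_len + MIN_OVERHEAD[transport])


for name in MIN_OVERHEAD:
    small = payload_fraction(name, 64)    # e.g. a short interactive packet
    large = payload_fraction(name, 1400)  # e.g. a bulk-transfer packet
    print(f"{name:6s} 64 B: {small:.1%}   1400 B: {large:.1%}")
```

The difference between transports matters most for small packets, where
the fixed overhead is a large part of every packet sent.
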

=head1 PART 2: The GNU VPE protocol

This section, unfortunately, is not yet finished, although the protocol
is stable (until bugs in the cryptography are found, which will likely
completely change the following description). Nevertheless, it should give
you an overview of the protocol.

=head2 Anatomy of a VPN packet

The exact layout and field lengths of a VPN packet are determined at
compile time and don't change. The same structure is used for all
transport protocols, be it RAW IP or TCP.

   +------+------+--------+------+
   | HMAC | TYPE | SRCDST | DATA |
   +------+------+--------+------+

The HMAC field is present in all packets, even if not used (e.g. in auth
request packets), in which case it is set to all zeroes. The checksum
itself is calculated over the TYPE, SRCDST and DATA fields in all cases.

The TYPE field is a single byte and determines the purpose of the packet
(e.g. RESET, COMPRESSED/UNCOMPRESSED DATA, PING, AUTH REQUEST/RESPONSE,
CONNECT REQUEST/INFO etc.).

SRCDST is a three-byte field which contains the source and destination
node IDs (12 bits each).

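The SRCDST field can be pictured with a small Python sketch that packs
two 12-bit node IDs into three bytes. The bit order used here (source ID
in the upper 12 bits, destination ID in the lower 12 bits, big-endian)
and the function names are assumptions for illustration; the text only
states the field width and the size of each ID.

```python
from typing import Tuple


def pack_srcdst(src: int, dst: int) -> bytes:
    """Pack two 12-bit node IDs into three bytes (assumed bit order)."""
    if not (0 <= src < 4096 and 0 <= dst < 4096):
        raise ValueError("node IDs must fit into 12 bits")
    return ((src << 12) | dst).to_bytes(3, "big")


def unpack_srcdst(field: bytes) -> Tuple[int, int]:
    """Inverse of pack_srcdst: return (src, dst)."""
    if len(field) != 3:
        raise ValueError("SRCDST is exactly three bytes long")
    value = int.from_bytes(field, "big")
    return value >> 12, value & 0xFFF


field = pack_srcdst(5, 4095)
```

With 12 bits per ID, at most 4096 distinct node IDs can be represented
in this field.
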

The DATA portion differs between packet types, naturally, and is the
only part that can be encrypted. Data packets contain more fields, as
shown:

   +------+------+--------+------+-------+------+
   | HMAC | TYPE | SRCDST | RAND | SEQNO | DATA |
   +------+------+--------+------+-------+------+

RAND is a sequence of fully random bytes, used to increase the entropy of
the data for encryption purposes.

SEQNO is a 32-bit sequence number. It is negotiated at every connection
initialization and starts at some random 31-bit value. VPE currently uses
a sliding window of 512 packets/sequence numbers to detect reordering,
duplication and replay attacks.

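The following Python sketch shows one common way to implement such a
sliding window, using a bitmap of recently seen sequence numbers. It is a
generic anti-replay technique and not necessarily how GVPE implements it;
the class and method names are made up for this example, and wrap-around
of the 32-bit counter is ignored for brevity.

```python
class ReplayWindow:
    """Bitmap-based sliding window over sequence numbers."""

    def __init__(self, size: int = 512) -> None:
        self.size = size
        self.highest = -1  # highest sequence number accepted so far
        self.seen = 0      # bit i set => (highest - i) was already accepted

    def accept(self, seqno: int) -> bool:
        """Return True if seqno is new and inside the window, else False."""
        if seqno > self.highest:
            shift = seqno - self.highest
            if shift >= self.size:
                self.seen = 0  # the whole old window has been left behind
            else:
                # Slide forward and drop bits that fall off the far end.
                self.seen = (self.seen << shift) & ((1 << self.size) - 1)
            self.seen |= 1
            self.highest = seqno
            return True
        offset = self.highest - seqno
        if offset >= self.size:
            return False  # too old: outside the window
        if self.seen & (1 << offset):
            return False  # duplicate or replayed packet
        self.seen |= 1 << offset
        return True  # late (reordered) but still acceptable


window = ReplayWindow()
```

A packet that arrives late but within the last 512 sequence numbers is
accepted once; the same sequence number is rejected on any later attempt.
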

The encryption is done on RAND+SEQNO+DATA in CBC mode with zero IV (or,
equivalently, the IV is RAND+SEQNO, encrypted with the block cipher,
unless RAND size is decreased or increased over the default value).

=head2 The authentication/key exchange protocol

Before nodes can exchange packets, they need to establish authenticity of
the other side and a key. Every node has a private RSA key and the public
RSA keys of all other nodes.

When a node wants to establish a connection to another node, it sends an
RSA-OAEP-encrypted challenge and an ECDH key. The other node replies with
its own ECDH key and an HKDF of the challenge and both ECDH keys to prove
its identity.

The remote node engages in exactly the same protocol. When both nodes
have exchanged their challenge and verified the response, they calculate a
cipher key and an HMAC key and start exchanging data packets.

In detail, the challenge consists of:

   RSA-OAEP (SEQNO MAC CIPHER SALT EXTRA-AUTH) ECDH1

That is, it encrypts (with the public key of the remote node) an initial
sequence number for data packets, key material for the HMAC key, key
material for the cipher key, a salt used by the HKDF (as shown later) and
some extra random bytes that are unused except for authentication. It also
sends the public key of a curve25519 exchange.

The remote node decrypts the RSA data, generates its own ECDH key (ECDH2),
and replies with:

   HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE) ECDH2

That is, it extracts from the decrypted RSA challenge, using its ECDH
key as salt, and then expands using the requesting node's ECDH1 key. The
resulting hash is returned as a proof that the node could decrypt the RSA
challenge data, together with the ECDH key.

After both nodes have done this to each other, they calculate the shared
ECDH secrets, cipher and HMAC keys for the session (each
node generates two cipher and HMAC keys, one for sending and one for
receiving).

The HMAC key for sending is generated as follows:

   HMAC_KEY = HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, HMAC_MD_SIZE)

It extracts from MAC and ECDH_SECRET using the I<remote> SALT, then
expands using a static info string.

The cipher key is generated in the same way, except using the CIPHER part
of the original challenge.

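The HMAC key formula above can be sketched in Python with the standard
C<hmac> and C<hashlib> modules, following the HKDF construction from
RFC 5869. Several details are assumptions made for this example: SHA-256
as the digest, plain concatenation for "MAC ECDH_SECRET", the placeholder
info string, and all input values and sizes. The function names are
likewise invented here.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, input keying material)."""
    return hmac.new(salt, ikm, hash_name).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha256") -> bytes:
    """HKDF-Expand (RFC 5869): stretch PRK into `length` output bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) | info | n)
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_hmac_key(remote_salt: bytes, mac: bytes, ecdh_secret: bytes,
                    info: bytes, size: int) -> bytes:
    """Mirror of the HMAC_KEY formula; concatenation of inputs is assumed."""
    prk = hkdf_extract(remote_salt, mac + ecdh_secret)
    return hkdf_expand(prk, info, size)


# Placeholder inputs, purely for illustration.
key = derive_hmac_key(b"\x01" * 24, b"\x02" * 32, b"\x03" * 32,
                      b"example info", 32)
```

Under the same assumptions, the cipher key would be derived by passing the
CIPHER part of the challenge in place of C<mac>.
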

The result of this process is to authenticate each node to the other
node, while exchanging keys using both RSA and ECDH, the latter providing
perfect forward secrecy.

=head2 Retrying

When there is no response to an auth request, the node will send auth
requests in bursts with an exponential back-off. After some time it will
resort to PING packets, which are very small (8 bytes + protocol header)
and lightweight (no RSA operations required). A node that receives ping
requests from an unconnected peer will respond by trying to create a
connection.

In addition to the exponential back-off, there is a global rate-limit on
a per-IP basis. It allows long bursts but will limit total packet rate to
something like one control packet every ten seconds, to avoid accidental
floods due to protocol problems (like an RSA key file mismatch between two
nodes).

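A rate limit that permits bursts but caps the long-term rate is commonly
built as a token bucket. The Python sketch below shows that technique
with one bucket per source address. It is not taken from the GVPE
sources: the class name, the burst size and the explicit C<now> argument
are assumptions, and only the ten-second interval echoes the figure
mentioned above.

```python
from collections import defaultdict


class PerHostRateLimiter:
    """Token bucket per source address: bursts allowed, long-term rate capped."""

    def __init__(self, interval: float = 10.0, burst: int = 20) -> None:
        self.interval = interval  # seconds needed to earn one token
        self.burst = burst        # bucket capacity (assumed value)
        self.tokens = defaultdict(lambda: float(burst))
        self.last_seen = {}

    def allow(self, host: str, now: float) -> bool:
        """Return True if a control packet from `host` may be handled at `now`."""
        elapsed = now - self.last_seen.get(host, now)
        self.last_seen[host] = now
        # Refill according to the time passed, but never above the capacity.
        refill = self.tokens[host] + elapsed / self.interval
        self.tokens[host] = min(float(self.burst), refill)
        if self.tokens[host] >= 1.0:
            self.tokens[host] -= 1.0
            return True
        return False


limiter = PerHostRateLimiter(interval=10.0, burst=3)
burst_results = [limiter.allow("192.0.2.1", 0.0) for _ in range(5)]
```

In this run the first three packets pass as a burst and the remaining two
are refused until enough time has passed to earn a new token.
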

The intervals between retries are limited by the C<max-retry>
configuration value. A node with C<connect> = C<always> will always retry,
whereas a node with C<connect> = C<ondemand> will only try (and re-try) to
connect as long as there are packets in the queue. Usually this limits the
retry period to C<max-ttl> seconds.

Sending packets over the VPN will reset the retry intervals as well, which
means as long as somebody is trying to send packets to a given node, GVPE
will try to connect every few seconds.

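The interplay between the exponential back-off and the C<max-retry> cap
can be sketched as a Python generator. The initial delay and the growth
factor are assumptions for this example, as this document does not state
them; bursts and the reset on outgoing traffic are not modelled.

```python
from itertools import islice
from typing import Iterator


def retry_intervals(initial: float, max_retry: float,
                    factor: float = 2.0) -> Iterator[float]:
    """Yield retry delays that grow by `factor` and are capped at `max_retry`."""
    delay = initial
    while True:
        yield min(delay, max_retry)
        delay = min(delay * factor, max_retry)


# First eight delays for a 1 second start and a 60 second cap.
first_delays = list(islice(retry_intervals(1.0, 60.0), 8))
```

Resetting the retry interval, as happens when packets are sent over the
VPN, corresponds to starting a fresh generator.
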

=head2 Routing and Protocol translation

The GVPE routing algorithm is simple, as there isn't much routing to
speak of. When routing packets to another node, GVPE tries the following
options, in order:

=over 4

=item If the two nodes should be able to reach each other directly (common
protocol, port known), then GVPE will send the packet directly to the
other node.

=item If this isn't possible (e.g. because the node doesn't have a
C<hostname> or known port), but the nodes speak a common protocol and a
router is available, then GVPE will ask a router to "mediate" between both
nodes (see below).

=item If a direct connection isn't possible (no common protocols) or is
forbidden (C<deny-direct>) and there are any routers, then GVPE will try
to send packets to the router with the highest priority that is connected
already I<and> is able (as specified by the config file) to connect
directly to the target node.

=item If no such router exists, then GVPE will simply send the packet to
the node with the highest priority available.

=item Failing all that, the packet will be dropped.

=back

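The decision order in the list above can be written down as a small
Python function. The C<Node> data model, its field names and the returned
labels are hypothetical and exist only for this sketch. Two points are
interpretations rather than facts from the text: the mediating router is
picked by priority, and "available" in the fourth step is read as
"currently connected".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    name: str
    protocols: Set[str]
    address_known: bool = True   # hostname and port are known
    is_router: bool = False
    priority: int = 0
    connected: bool = False
    deny_direct: bool = False
    reachable: Set[str] = field(default_factory=set)  # direct targets per config


def choose_route(me: Node, target: Node,
                 others: List[Node]) -> Tuple[str, Optional[Node]]:
    """Return (action, next hop) following the ordered options above."""
    common = bool(me.protocols & target.protocols)
    direct_allowed = common and not target.deny_direct
    routers = sorted((n for n in others if n.is_router),
                     key=lambda n: n.priority, reverse=True)
    # 1. Common protocol and known address: send directly.
    if direct_allowed and target.address_known:
        return "direct", target
    # 2. Common protocol but unknown address: ask a router to mediate.
    if direct_allowed and routers:
        return "mediate", routers[0]
    # 3. Best connected router that may reach the target directly.
    capable = [r for r in routers if r.connected and target.name in r.reachable]
    if capable:
        return "relay", capable[0]
    # 4. Otherwise the highest-priority node that is currently connected.
    available = [n for n in others if n.connected]
    if available:
        return "relay", max(available, key=lambda n: n.priority)
    # 5. Nothing left to try.
    return "drop", None


a = Node("a", {"udp"})
b = Node("b", {"udp"})
action, hop = choose_route(a, b, [])
```

Here both nodes share UDP and the address of C<b> is known, so the first
option applies and the packet is sent directly.
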

A host can usually declare itself unreachable directly by setting its
port number(s) to zero. It can declare other hosts as unreachable by using
a config file that disables all protocols for these other hosts. Another
option is to disable all protocols on that host in the other config files.

If two hosts cannot connect to each other because their IP address(es)
are not known (such as dial-up hosts), one side will send a I<mediated>
connection request to a router (routers must be configured to act as
routers!), which will send both the originating and the destination host
a connection info request with protocol information and IP address of the
other host (if known). Both hosts will then try to establish a direct
connection to the other peer, which is usually possible even when both
hosts are behind a NAT gateway.

Routing via other nodes works because the SRCDST field is not encrypted,
so the router can just forward the packet to the destination host. Since
each host uses its own private key, the router will not be able to
decrypt or encrypt packets; it will just act as a simple router and
protocol translator.