
=head1 The GNU-VPE Protocols

=head1 Overview

GVPE can make use of a number of protocols. One of them is the GNU VPE
protocol, which is used to authenticate tunnels and send encrypted data
packets. This protocol is described in more detail in the second part of
this document.

The first part of this document describes the transport protocols which
are used by GVPE to send its data packets over the network.

=head1 PART 1: Transport protocols

GVPE offers a wide range of transport protocols that can be used to
interchange data between nodes. Protocols differ in their overhead, speed,
reliability, and robustness.

The following sections describe each transport protocol in more
detail. They are sorted by overhead/efficiency, with the most efficient
transport listed first:

=head2 RAW IP

This protocol is the best choice, performance-wise, as the minimum
overhead per packet is only 38 bytes.

It works by sending the VPN payload using raw IP frames (using the
protocol set by C<ip-proto>).

Using raw IP frames has the drawback that many firewalls block "unknown"
protocols, so this transport only works if you have full IP connectivity
between nodes.

=head2 ICMP

This protocol offers very low overhead (minimum 42 bytes), and can
sometimes tunnel through firewalls when other protocols cannot.

It works by prepending an ICMP header with type C<icmp-type> and a code
of C<255>. The default C<icmp-type> is C<echo-reply>, so the resulting
packets look like echo replies, which may look rather strange to network
administrators.
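
The framing described above can be sketched as follows. This is only an
illustration, not GVPE's code: it assumes a four-byte header (type, code,
checksum), which matches the four-byte difference between the RAW IP and
ICMP overhead figures, and it assumes the standard Internet checksum is
computed over header and payload. The names C<wrap_icmp> and
C<internet_checksum> are made up for this example.

```python
import struct

ICMP_ECHO_REPLY = 0    # standard ICMP type number for "echo reply"
GVPE_ICMP_CODE = 255   # code value stated in the text


def internet_checksum(data: bytes) -> int:
    """Standard one's-complement Internet checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def wrap_icmp(payload: bytes, icmp_type: int = ICMP_ECHO_REPLY) -> bytes:
    """Prepend an assumed 4-byte ICMP header (type, code, checksum)."""
    header = struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, 0)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBH", icmp_type, GVPE_ICMP_CODE, checksum) + payload
```

A receiver can validate such a packet by running C<internet_checksum> over
the whole packet and checking that the result is zero.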

This transport should only be used if other transports (i.e. raw IP) are
not available or undesirable (due to their overhead).

=head2 UDP

This is a good general choice for the transport protocol as UDP packets
tunnel well through most firewalls and routers, and the overhead per
packet is moderate (minimum 58 bytes).

It should be used if RAW IP is not available.
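
To get a feel for what the minimum overhead figures quoted so far mean in
practice, the following sketch computes the fraction of each transmitted
packet that is actual payload. The dictionary keys and the function name
are chosen for this example only, and real overhead can exceed the stated
minimum.

```python
# Minimum per-packet overhead in bytes, as quoted in the text above.
OVERHEAD = {"rawip": 38, "icmp": 42, "udp": 58}


def efficiency(payload_len: int, overhead: int) -> float:
    """Fraction of the bytes on the wire that carry tunneled payload."""
    return payload_len / (payload_len + overhead)


if __name__ == "__main__":
    for name, overhead in OVERHEAD.items():
        for size in (64, 512, 1400):
            print(f"{name:5s} payload={size:4d} "
                  f"efficiency={efficiency(size, overhead):.3f}")
```

The difference between transports matters most for small packets, where
the fixed overhead dominates.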

=head2 TCP

This protocol is a very bad choice, as it not only has high overhead (more
than 60 bytes), but the transport also retries on its own, which leads
to congestion when the link has moderate packet loss (as both the TCP
transport and the tunneled traffic will retry, increasing congestion more
and more). It also has high latency and is quite inefficient.

It's only useful when tunneling through firewalls that block better
protocols. If a node doesn't have direct internet access but can reach
an HTTP proxy that supports the CONNECT method, this transport can be
used to tunnel through that proxy. For this to work, the C<tcp-port>
should be C<443> (C<https>), as most proxies do not allow connections to
other ports.

It is an abuse of the usage a proxy was designed for, so make sure you are
allowed to use it for GVPE.
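
For illustration, a proxy tunnel of this kind is opened with an HTTP
CONNECT request naming the target host and port. The sketch below only
builds such a request; the host name C<vpn.example.net> and the function
name are hypothetical, and the exact request GVPE sends may differ.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Build a minimal HTTP/1.1 CONNECT request for host:port."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    # vpn.example.net is a placeholder for the remote GVPE node.
    print(build_connect_request("vpn.example.net").decode("ascii"))
```

Once the proxy answers with a success status, the connection carries the
tunneled byte stream unchanged.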

This protocol also has server and client sides. If the C<tcp-port> is
set to zero, other nodes cannot connect to this node directly. If the
C<tcp-port> is non-zero, the node can act both as a client and as a
server.

=head2 DNS

B<WARNING:> Parsing and generating DNS packets is rather tricky. The code
almost certainly contains buffer overflows and other, likely exploitable,
bugs. You have been warned.

This is the worst choice of transport protocol with respect to overhead
(overhead can be 2-3 times higher than the transferred data), and latency
(which can be many seconds). Some DNS servers might not be prepared to
handle the traffic and drop or corrupt packets. The client also has to
constantly poll the server for data, so the client will constantly create
traffic even if it doesn't need to transport packets.
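
The following sketch hints at where the overhead comes from: binary data
has to be squeezed into DNS names, which only allow a restricted alphabet
and short labels. It uses Base32 purely as an example; this is an
assumption for illustration and not the encoding GVPE actually uses (see
C<src/vpn_dns.C> for that). The domain C<t.example.net> is a placeholder.

```python
import base64

MAX_LABEL = 63   # DNS limit on the length of a single label
MAX_NAME = 253   # DNS limit on a full name in textual form


def encode_as_dns_name(payload: bytes, domain: str) -> str:
    """Encode payload as Base32 labels in front of the given domain."""
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for a single DNS name")
    return name


if __name__ == "__main__":
    name = encode_as_dns_name(bytes(range(50)), "t.example.net")
    print(len(name), name)
```

Even in this simple scheme 50 payload bytes become a 95-character name,
before the DNS and UDP/IP headers are added.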

In addition, the same problems as the TCP transport also plague this
protocol.

Its only use is to tunnel through firewalls that do not allow direct
internet access. Similar to using an HTTP proxy (as the TCP transport
does), it uses a local DNS server/forwarder (given by the C<dns-forw-host>
configuration value) as a proxy to send and receive data as a client,
and an C<NS> record pointing to the GVPE server (as given by the
C<dns-hostname> directive).

The only good side of this protocol is that it can tunnel through most
firewalls mostly undetected, provided the local DNS server/forwarder is
sane (which is true for most routers, wireless LAN gateways and
nameservers).

Fine-tuning needs to be done by editing C<src/vpn_dns.C> directly.

=head1 PART 2: The GNU VPE protocol

This section, unfortunately, is not yet finished, although the protocol
is stable (until bugs in the cryptography are found, which will likely
completely change the following description). Nevertheless, it should give
you some overview of the protocol.

=head2 Anatomy of a VPN packet

The exact layout and field lengths of a VPN packet are determined at
compile time and do not change. The same structure is used for all
transport protocols, be it RAW IP or TCP.

 +------+------+--------+------+
 | HMAC | TYPE | SRCDST | DATA |
 +------+------+--------+------+

The HMAC field is present in all packets, even if not used (e.g. in auth
request packets), in which case it is set to all zeroes. The checksum
itself is calculated over the TYPE, SRCDST and DATA fields in all cases.
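
A minimal sketch of this framing, using Python's standard C<hmac> module.
The digest (SHA-256) and the 12-byte truncated HMAC length are assumptions
made for the example, since the real values are fixed at compile time; the
function names are likewise invented here.

```python
import hashlib
import hmac

HMAC_LEN = 12  # assumption: the real field length is set at compile time


def frame_packet(hmac_key: bytes, ptype: int, srcdst: bytes, data: bytes) -> bytes:
    """Return HMAC | TYPE | SRCDST | DATA, with the HMAC over the rest."""
    body = bytes([ptype]) + srcdst + data
    tag = hmac.new(hmac_key, body, hashlib.sha256).digest()[:HMAC_LEN]
    return tag + body


def verify_packet(hmac_key: bytes, packet: bytes) -> bool:
    """Recompute the HMAC over TYPE, SRCDST and DATA and compare."""
    tag, body = packet[:HMAC_LEN], packet[HMAC_LEN:]
    expected = hmac.new(hmac_key, body, hashlib.sha256).digest()[:HMAC_LEN]
    return hmac.compare_digest(tag, expected)
```

Because the HMAC covers everything after it, changing any byte of TYPE,
SRCDST or DATA makes verification fail.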

The TYPE field is a single byte and determines the purpose of the packet
(e.g. RESET, COMPRESSED/UNCOMPRESSED DATA, PING, AUTH REQUEST/RESPONSE,
CONNECT REQUEST/INFO etc.).

SRCDST is a three-byte field which contains the source and destination
node IDs (12 bits each).
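
Packing two 12-bit IDs into three bytes can be sketched like this. The
bit layout used here (one byte holding the high four bits of both IDs,
followed by the low byte of each) is an assumption for illustration; the
text above only fixes the total size and the width of each ID.

```python
from typing import Tuple


def pack_srcdst(src: int, dst: int) -> bytes:
    """Pack two 12-bit node IDs into a 3-byte field (assumed layout)."""
    if not (0 <= src < 4096 and 0 <= dst < 4096):
        raise ValueError("node IDs must fit into 12 bits")
    high_nibbles = ((src >> 8) << 4) | (dst >> 8)
    return bytes([high_nibbles, src & 0xFF, dst & 0xFF])


def unpack_srcdst(field: bytes) -> Tuple[int, int]:
    """Inverse of pack_srcdst: return (src, dst)."""
    high_nibbles, src_low, dst_low = field
    src = ((high_nibbles >> 4) << 8) | src_low
    dst = ((high_nibbles & 0x0F) << 8) | dst_low
    return src, dst
```

Twelve bits per ID allow node IDs from 0 to 4095.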

The DATA portion differs between each packet type, naturally, and is the
only part that can be encrypted. Data packets contain more fields, as
shown:

 +------+------+--------+------+-------+------+
 | HMAC | TYPE | SRCDST | RAND | SEQNO | DATA |
 +------+------+--------+------+-------+------+

RAND is a sequence of fully random bytes, used to increase the entropy of
the data for encryption purposes.

SEQNO is a 32-bit sequence number. It is negotiated at every connection
initialization and starts at some random 31-bit value. GVPE currently uses
a sliding window of 512 packets/sequence numbers to detect reordering,
duplication and replay attacks.
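
A generic sliding-window check of this kind can be sketched as below.
This is a textbook anti-replay window, not GVPE's implementation: the
class name is invented, sequence number wrap-around is ignored, and the
bitmask representation is just one convenient choice.

```python
WINDOW = 512  # window size stated in the text


class ReplayWindow:
    """Accept each sequence number at most once within a sliding window."""

    def __init__(self, first_seqno: int):
        self.highest = first_seqno - 1  # highest number accepted so far
        self.seen = 0                   # bit i set: (highest - i) was seen

    def accept(self, seqno: int) -> bool:
        if seqno > self.highest:
            # Packet advances the window: shift history and mark it seen.
            shift = seqno - self.highest
            self.seen = ((self.seen << shift) | 1) & ((1 << WINDOW) - 1)
            self.highest = seqno
            return True
        offset = self.highest - seqno
        if offset >= WINDOW:
            return False                # too old to be tracked
        if (self.seen >> offset) & 1:
            return False                # duplicate or replayed packet
        self.seen |= 1 << offset        # late but new: reordered packet
        return True
```

Reordered packets inside the window are accepted once, while duplicates
and packets older than the window are rejected.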

The encryption is done on RAND+SEQNO+DATA in CBC mode with zero IV (or,
equivalently, the IV is RAND+SEQNO, encrypted with the block cipher,
unless RAND size is decreased or increased over the default value).

The random prefix itself is generated by using AES in CTR mode with a
random key and starting value, which should make it unpredictable even
before it is encrypted again. The sequence number additionally ensures
that the IV is unique.

=head2 The authentication/key exchange protocol

Before nodes can exchange packets, they need to establish authenticity of
the other side and a key. Every node has a private RSA key and the public
RSA keys of all other nodes.

When a node wants to establish a connection to another node, it sends an
RSA-OAEP-encrypted challenge and an ECDH (curve25519) key. The other node
replies with its own ECDH key and an HKDF of the challenge and both ECDH
keys to prove its identity.

The remote node engages in exactly the same protocol. When both nodes
have exchanged their challenges and verified the responses, they calculate
a cipher key and an HMAC key and start exchanging data packets.

In detail, the challenge consists of:

  RSA-OAEP (SEQNO MAC CIPHER SALT EXTRA-AUTH) ECDH1

That is, it encrypts (with the public key of the remote node) an initial
sequence number for data packets, key material for the HMAC key, key
material for the cipher key, a salt used by the HKDF (as shown later) and
some extra random bytes that are unused except for authentication. It also
sends the public key of a curve25519 exchange.

The remote node decrypts the RSA data, generates its own ECDH key (ECDH2),
and replies with:

  HKDF-Expand (HKDF-Extract (ECDH2, RSA), ECDH1, AUTH_DIGEST_SIZE) ECDH2

That is, it extracts from the decrypted RSA challenge, using its ECDH
key as salt, and then expands using the requesting node's ECDH1 key. The
resulting hash is returned as a proof that the node could decrypt the RSA
challenge data, together with the ECDH key.

After both nodes have done this to each other, they calculate the shared
ECDH secret, cipher and HMAC keys for the session (each node generates two
cipher and HMAC keys, one for sending and one for receiving).

The HMAC key for sending is generated as follows:

   HMAC_KEY = HKDF-Expand (HKDF-Extract (REMOTE_SALT, MAC ECDH_SECRET), info, HMAC_MD_SIZE)

It extracts from MAC and ECDH_SECRET using the I<remote> SALT, then
expands using a static info string.

The cipher key is generated in the same way, except using the CIPHER part
of the original challenge.
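
The two HKDF constructions shown above can be sketched with Python's
standard library, following the extract-then-expand scheme of RFC 5869.
Several details are assumptions made for this example only: the hash
function (SHA-256), the values of C<AUTH_DIGEST_SIZE> and C<HMAC_MD_SIZE>,
the info string, and reading "MAC ECDH_SECRET" as plain concatenation.

```python
import hashlib
import hmac

HASH = hashlib.sha256        # assumption: actual digest is a build option
AUTH_DIGEST_SIZE = 20        # assumption, for illustration only
HMAC_MD_SIZE = 32            # assumption, for illustration only
INFO = b"example mac key"    # hypothetical static info string


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: PRK = HMAC(salt, input keying material)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: chain HMAC blocks until 'length' bytes are produced."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def auth_response(rsa_plain: bytes, ecdh1: bytes, ecdh2: bytes) -> bytes:
    """Proof that the responder could decrypt the RSA challenge."""
    return hkdf_expand(hkdf_extract(ecdh2, rsa_plain), ecdh1, AUTH_DIGEST_SIZE)


def derive_hmac_key(remote_salt: bytes, mac: bytes, ecdh_secret: bytes) -> bytes:
    """Sending HMAC key from MAC material, ECDH secret and remote salt."""
    prk = hkdf_extract(remote_salt, mac + ecdh_secret)
    return hkdf_expand(prk, INFO, HMAC_MD_SIZE)
```

The cipher key would be derived the same way, with the CIPHER part of the
challenge in place of MAC and, presumably, a different info string.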

The result of this process is to authenticate each node to the other
node, while exchanging keys using both RSA and ECDH, the latter providing
perfect forward secrecy.

The protocol has been overdesigned where this was possible without
increasing implementation complexity, in an attempt to protect against
implementation or protocol failures. For example, if the ECDH challenge
was found to be flawed, perfect forward secrecy would be lost, but the
data would likely still be protected. Likewise, standard algorithms and
implementations are used where possible.

=head2 Retrying

When there is no response to an auth request, the node will send auth
requests in bursts with an exponential back-off. After some time it will
resort to PING packets, which are very small (8 bytes + protocol header)
and lightweight (no RSA operations required). A node that receives ping
requests from an unconnected peer will respond by trying to create a
connection.

In addition to the exponential back-off, there is a global rate-limit on
a per-IP basis. It allows long bursts but will limit the total packet rate
to something like one control packet every ten seconds, to avoid accidental
floods due to protocol problems (like an RSA key file mismatch between two
nodes).

The intervals between retries are limited by the C<max-retry>
configuration value. A node with C<connect> = C<always> will always
retry. A node with C<connect> = C<ondemand> will only try (and re-try) to
connect as long as there are packets in the queue; usually this limits the
retry period to C<max-ttl> seconds.

Sending packets over the VPN will reset the retry intervals as well, which
means as long as somebody is trying to send packets to a given node, GVPE
will try to connect every few seconds.
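
The capped exponential back-off described in this section can be sketched
as a simple generator. The initial interval and the growth factor are
assumptions for the example; only the cap corresponds to a value named in
the text (C<max-retry>).

```python
from itertools import islice
from typing import Iterator


def retry_intervals(initial: float = 1.0, factor: float = 2.0,
                    max_retry: float = 3600.0) -> Iterator[float]:
    """Yield retry intervals that grow geometrically up to max_retry."""
    interval = initial
    while True:
        yield min(interval, max_retry)
        interval = min(interval * factor, max_retry)


if __name__ == "__main__":
    # First few intervals with a small cap, to make the capping visible.
    print(list(islice(retry_intervals(1, 2, 10), 6)))
```

Resetting the retry intervals, as happens when packets are sent over the
VPN, corresponds to creating a fresh generator.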

=head2 Routing and Protocol translation

The GVPE routing algorithm is easy, as there isn't much routing to speak
of. When routing packets to another node, GVPE tries the following
options, in order:

=over 4

=item If the two nodes should be able to reach each other directly (common
protocol, port known), then GVPE will send the packet directly to the
other node.

=item If this isn't possible (e.g. because the node doesn't have a
C<hostname> or known port), but the nodes speak a common protocol and a
router is available, then GVPE will ask a router to "mediate" between both
nodes (see below).

=item If a direct connection isn't possible (no common protocols) or
forbidden (C<deny-direct>) and there are any routers, then GVPE will try
to send packets to the router with the highest priority that is connected
already I<and> is able (as specified by the config file) to connect
directly to the target node.

=item If no such router exists, then GVPE will simply send the packet to
the node with the highest priority available.

=item Failing all that, the packet will be dropped.

=back
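
The decision order listed above can be sketched as a small function.
The data model is invented for this example (the C<Peer> fields, the
returned action names, and picking the highest-priority router for
mediation are all assumptions), so it illustrates the order of the checks
rather than GVPE's actual data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Peer:
    name: str
    protocols: FrozenSet[str] = frozenset()
    address_known: bool = False      # hostname and port are known
    deny_direct: bool = False        # direct connections are forbidden
    is_router: bool = False
    priority: int = 0
    connected: bool = False
    direct_targets: FrozenSet[str] = frozenset()  # nodes it may reach directly


def choose_route(me: Peer, target: Peer,
                 others: List[Peer]) -> Tuple[str, Optional[str]]:
    """Return (action, next hop) following the order given in the text."""
    common = bool(me.protocols & target.protocols)
    direct_allowed = common and not target.deny_direct
    routers = sorted((p for p in others if p.is_router),
                     key=lambda p: p.priority, reverse=True)

    if direct_allowed and target.address_known:
        return ("direct", target.name)          # option 1
    if direct_allowed and routers:
        return ("mediate", routers[0].name)     # option 2
    for router in routers:                      # option 3
        if router.connected and target.name in router.direct_targets:
            return ("relay", router.name)
    available = sorted((p for p in others if p.connected),
                       key=lambda p: p.priority, reverse=True)
    if available:
        return ("relay", available[0].name)     # option 4
    return ("drop", None)                       # option 5
```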

A host can usually declare itself unreachable directly by setting its
port number(s) to zero. It can declare other hosts as unreachable by using
a config file that disables all protocols for these other hosts. Another
option is to disable all protocols on that host in the other config files.

If two hosts cannot connect to each other because their IP address(es)
are not known (such as dial-up hosts), one side will send a I<mediated>
connection request to a router (routers must be configured to act as
routers!), which will send both the originating and the destination host
a connection info request with protocol information and IP address of the
other host (if known). Both hosts will then try to establish a direct
connection to the other peer, which is usually possible even when both
hosts are behind a NAT gateway.

Routing via other nodes works because the SRCDST field is not encrypted,
so the router can just forward the packet to the destination host. Since
each host uses its own private key, the router will not be able to
decrypt or encrypt packets; it will just act as a simple router and
protocol translator.


Revision:	1.13
Committed:	Sat Apr 26 19:05:56 2014 UTC (10 years, 1 month ago) by root
Branch:	MAIN