This XML document describes the KGS protocol. It is also used to automatically generate the Perl parser for all the messages and structures in the protocol. Adapting it to other languages should be trivial.
Please note that the author of KGS has told me that he will change the protocol in response to my efforts. Unfortunately, he does this just to make it more difficult to reverse-engineer, since his changes are neither required nor useful (they just make the protocol less robust, without adding any value).
If you feel you need to update the visual appearance of this document, feel free to look at doc/doc2html.xsl and improve it.
The current version of this document can always be found here, while the HTML version of it can be found here.
Sorry - I have little time to dissect the protocol, but as far as I can see, there was no deeper need for the protocol change, as the protocol itself didn't change in a significant way. The only significant change was the addition of a linear congruential generator that is xor'ed into the packet length, and some heavy foolery to change received packets. It seems that wms prefers to lock out many of his own users rather than have a few people write their own client. I didn't really expect that from him, but instead expected real changes for the good, as he is claiming all the time.
Well, that is just what he announced earlier, so he just did what he said...
Anything I know about changes in 2.5.x is reflected in this document already. You can log in, chat, log out, but the gamelist is corrupted, and you still cannot watch games.
"Send" means messages sent from the client to the server, while "received" means messages sent by the server to the client.
Everything on the wire is in little-endian format (what a shame).
Primitive types are mostly integers (signed "I<bits>", unsigned "U<bits>"), ascii strings ("username"), or zero-terminated UCS2-Strings ("STRING"). Yes, I know java is supposed to do UTF-16, but no implementation seems to care...
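As an illustration of the primitive types above, here is a minimal Python sketch of little-endian readers. The function names (read_u16, read_i32, read_string) are made up for this example and are not part of the generated Perl parser. Decoding UCS-2 with Python's "utf-16-le" codec is an assumption that holds as long as no surrogate code units appear on the wire.

```python
import struct


def read_u16(buf, offset):
    """Read a "U16": unsigned 16-bit little-endian integer."""
    (value,) = struct.unpack_from("<H", buf, offset)
    return value, offset + 2


def read_i32(buf, offset):
    """Read an "I32": signed 32-bit little-endian integer."""
    (value,) = struct.unpack_from("<i", buf, offset)
    return value, offset + 4


def read_string(buf, offset):
    """Read a zero-terminated UCS-2 string ("STRING").

    Assumption: 16-bit little-endian code units, terminated by 0x0000.
    """
    end = offset
    while True:
        unit = buf[end:end + 2]
        if len(unit) < 2:
            raise ValueError("unterminated string")
        if unit == b"\x00\x00":
            break
        end += 2
    text = bytes(buf[offset:end]).decode("utf-16-le")
    return text, end + 2  # skip the terminator
```

Each reader returns the decoded value together with the offset of the next field, so calls can be chained while walking through a message.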
For the rest, go figure or bug me, Marc Lehmann <pcg@goof.com>
After connecting to the server, a handshake byte is sent. It's the major version number of the protocol the client expects to receive. Versions 3 and 4 are mostly the same, except that version 4 clients expect server messages to be compressed, while version 3 clients do not.
The server sends back its protocol number, which is always 3 in the current protocol. Most of the protocol variation is determined by the server using the client version that is used in the initial login message, not the initial handshake byte.
After the initial handshake, the client sends uncompressed messages, while the server sends back a zlib-compressed stream (rfc1950 and rfc1951).
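Since the server side is one continuous zlib stream rather than one compressed blob per message, a client has to inflate incrementally. A minimal Python sketch of that idea follows. The ServerStream class and its method names are hypothetical, and the socket handling is left out entirely.

```python
import zlib


class ServerStream:
    """Incrementally inflates the server-to-client byte stream."""

    def __init__(self):
        # Default window settings expect a zlib (RFC 1950) wrapper.
        self._inflater = zlib.decompressobj()
        self._pending = bytearray()

    def feed(self, chunk):
        """Feed raw bytes as they arrive from the socket."""
        self._pending += self._inflater.decompress(chunk)

    def take(self, count):
        """Return exactly `count` inflated bytes, or None if not yet available."""
        if len(self._pending) < count:
            return None
        data = bytes(self._pending[:count])
        del self._pending[:count]
        return data
```

A reader loop would call feed() with whatever the socket returns, then take() the header and, once the length is known, the rest of the message.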
All messages have the same header:
The length is the length of the full message including the header.
Beginning with version 2.5.x, a number is xored into the low byte of the length in sent packets only, as given by the following recurrence: rand[0] = 0; rand[i+1] = msg[i].length + (rand[i] * 0x04c2af9b + 0xfffffffb); xorbyte = rand >> 24, all in 32 bit unsigned ISO C arithmetic.
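The recurrence above can be sketched in Python as follows. Two points are assumptions of this sketch, not facts from the protocol: that the xor byte for message i is taken from rand[i] (the state before the update), and that msg[i].length means the true, unscrambled length. The class name SendLengthScrambler is made up.

```python
MASK32 = 0xFFFFFFFF  # emulate 32 bit unsigned C arithmetic


class SendLengthScrambler:
    """Tracks the generator state for client-to-server packets."""

    def __init__(self):
        self.rand = 0  # rand[0] = 0

    def scramble(self, length):
        """Return the length as it would appear on the wire."""
        # Assumption: the xor byte comes from rand[i], before the update.
        xorbyte = (self.rand >> 24) & 0xFF
        wire_length = length ^ xorbyte  # only the low byte changes
        # rand[i+1] = msg[i].length + (rand[i] * 0x04c2af9b + 0xfffffffb)
        self.rand = (length + (self.rand * 0x04C2AF9B + 0xFFFFFFFB)) & MASK32
        return wire_length
```

Because the state depends on every length sent so far, one scrambler instance has to live for the whole connection and see every outgoing message in order.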
If the type is >= 0x4000 this is a message for a specific channel. The channel number is always the next U16.
Beginning with version 2.5.x, a number is added on received messages only. The algorithm is as follows:
msglen < 44: type = typefield
msglen > 44: type = (typefield + rand[i]) % 0x10000
rand[0] = 0
rand[i+1] = username[type % length username] + rand[i] * (type - 0x6cdd)
where username is the user name of the logged-in user. coooool.
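A rough Python sketch of the receive-side rule, written straight from the formulas above. It rests on several guesses that the description does not settle: that the state wraps at 32 bits like the send-side generator, that the state only advances for the long messages, that "type" in the update is the already-decoded type, and that a message of exactly 44 bytes is treated like the longer case. The class name ReceiveTypeDecoder is made up.

```python
class ReceiveTypeDecoder:
    """Undoes the type obfuscation on server-to-client messages."""

    def __init__(self, username):
        self.username = username.encode("ascii")  # logged-in user name
        self.rand = 0  # rand[0] = 0

    def decode(self, typefield, msglen):
        if msglen < 44:
            # Short messages carry the type unchanged.
            # Assumption: the state is not advanced for them.
            return typefield
        msg_type = (typefield + self.rand) % 0x10000
        key = self.username[msg_type % len(self.username)]
        # Assumption: 32 bit unsigned wrap-around, as on the send side.
        self.rand = (key + self.rand * (msg_type - 0x6CDD)) & 0xFFFFFFFF
        return msg_type
```

If decoding goes wrong after a few messages, the guesses listed above are the first things to revisit.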
Apart from the basic types, I need to define some extra types to deal with fixed-point values (based on integer types) or fixed-length strings (either 7-bit-ascii or more limited (A), or UCS-2 based (S)).
The basic user or login name, used throughout the protocol as a handle to the user.
Many strings in the protocol are fixed-width for no good reason (maybe this is one reason for using compression in newer versions, as the packets themselves waste lots of space).
Used in user_record.
A kind of locale specifier. The general format seems to be lowercase language, underscore, uppercase location, e.g. en_US. More fancy specifications don't fit.
Just a simple boolean value. 0 means false, and 1 generally true, but I suggest accepting != 0 as true.
Komi values are multiplied by 2 to make them integer in the protocol. Well, *most* of the time at least...
The game result is also multiplied by two to give it higher resolution. There are also special values for wins by time etc., either in result or in the score* types, or both :)
Score values (used for displaying the score at the end of a game) are multiplied by four for a change (the 0.25 resolution is not used). In game structures it is encoded by dividing by two, though, so watch out!
Time values are multiplied by 1000, giving them millisecond accuracy.
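The scale factors for the fixed-point types above can be summed up in a small Python sketch. The function names are made up for illustration, and the special values (wins by time etc.) are deliberately not handled, since their encoding is not described here.

```python
def decode_komi(raw):
    """Komi is stored multiplied by 2 (most of the time)."""
    return raw / 2


def decode_result(raw):
    """Game results are stored multiplied by 2; special values not handled."""
    return raw / 2


def decode_score(raw):
    """Score values are stored multiplied by 4."""
    return raw / 4


def decode_time(raw):
    """Time values are stored in milliseconds; returns seconds."""
    return raw / 1000
```

For example, a raw komi of 13 decodes to 6.5, and a raw time of 1500 decodes to 1.5 seconds.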
64 bit timeval, milliseconds since posix epoch, e.g. my
($year, $month, $day) = (gmtime $date * 0.001)[5,4,3];
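Note that Perl's gmtime returns the year as an offset from 1900 and the month counted from 0. For comparison, a minimal Python sketch of the same conversion, which yields calendar values directly (decode_timeval is a made-up name):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timeval(milliseconds):
    """Convert a 64 bit timeval (ms since the POSIX epoch) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=milliseconds)
```

The year, month and day are then available as the .year, .month and .day attributes of the result.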
Password is a number calculated as follows (VERY insecure, basically plaintext!): password = 0; for char in characters do password ← password * 1055 + ascii_code (char)
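The password calculation above, as a minimal Python sketch. The function name password_hash is made up, and no truncation to a particular integer width is applied, because the width of the field on the wire is not stated in this part of the description. A caller would have to mask the result to whatever width the login message uses.

```python
def password_hash(password):
    """Fold the characters of an ASCII password into one number."""
    value = 0
    for char in password:
        value = value * 1055 + ord(char)
    # Assumption: no wrap-around here; mask to the field width when packing.
    return value
```

For a single-character password the result is just the character code, which shows how little this hides.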
Baaah... not yet.
Used everywhere a user + flags is required, and even in some places where only a username is required. I see no general rule on when a complete user and when a partial user is required.
This structure is used for challenges as well as in the special TREE "subprotocol". It tightly encodes the game parameters.
Sent to log in, usually the first message sent. The password needs to be set when the guest flag is true. Possible replies: . Followed by:
Request info about a certain user. Possible reply:
Update user info. Message structure is very similar to .
Request user graph data, replied with .
Request a user picture from the server. Results in a or a timeout.
Send a global message. Maybe. Never tried, for obvious reasons :/. Results in a sent to all users.
List the rooms in a specific group/category. Results in a message.
Requests part of the user's game record to be sent. Results in a or maybe a timeout.
Joins the given room. messages for yourself and all users in that room, as well as the initial gamelist, are sent if the room exists. If not, timeout...
A single game record entry, as seen in .
Not all room messages are for rooms only, and rooms need to parse not only these messages. Orthogonality, what for?