NatT protocol
Revision as of 14:11, 19 July 2007
THIS IS NOT FINAL... If you have any suggestions please post them on the discussion page
Introduction
NAT (Network Address Translation) Traversal allows Low ID to Low ID connections. There are different techniques to achieve this, such as STUN (Simple Traversal of UDP through NATs) or TURN (Traversal Using Relay NAT). Because TURN relays all traffic through a third host, it is not suitable for any kind of file sharing application. NatT for eMule therefore uses STUN, as well as an attempt to become a High ID by reusing the listening port for outgoing connections. If the NAT device is a Full Cone, this technique makes the client fully connectible, but it works only in a few cases. The main implementation using STUN can connect even through hardware firewalls; it fails only on Symmetric NATs. There are methods to penetrate these as well, but they require too much overhead and are therefore, like TURN, not suitable for file sharing.
Protocol
The implementation is divided into two main parts: the NAT traversal itself and a reliable streaming connection called User Mode TCP.
NAT Tunneling
To establish a communication tunnel from Alice to Bob through a NAT or a firewall, Alice (A) first sends a call-back request to Carlo (C), which records Alice’s UDP port and IP and relays them together with the request to Bob (B).
At this point Bob knows all he needs, but Alice does not know enough (she does not know his UDP port). Bob now tries to contact Alice directly over UDP, which opens his NAT/FW for messages from Alice. If she is behind a Full Cone NAT, she will get the message and will be able to reply. If she does not answer after a short time, Bob sends his own call-back request to Carlo or Dave (which one depends on the call-back relay schemes described later); the relay does the same as before and tells Alice the UDP port and IP of Bob. Now she can send him messages and he will receive them (this fails only if one of the two is behind a symmetric NAT, see the appendix for an explanation). If this is not the first communication between Alice and Bob, Alice knows Bob’s old UDP port, and before she issues a call-back request she always tries to ping Bob directly. When the port is still valid, this will not only succeed on a Full Cone NAT, but can also save one call-back request on other NAT types and firewalls.
If everything works they now have a working tunnel. To avoid losing it, they exchange ping messages every few seconds for as long as they need the tunnel.
A                       C                 B
Callback Req    ----->      ----->        
     [BLOCKED] <-----------------         ping [OPENED]
               <-----      <-----         Callback Req
ping [OPENED]   ----------------->        
               <-----------------         ping
handshake       ................
The keep-alive ping is sent by one of the two connection participants and contains a flag that determines whether the ping should be answered or not. It also contains additional information used for connection establishment and for obfuscation. The packet may be sent completely empty if no answer is requested and no data is appended.
[uint8] pOPS   // bit Field
               // 4 reserved
               // 3 IdKind
               // 1 req answer
[ID]           // variable length, determined by the ping options above; may be absent
[Obfuscation]  // uint8    obfuSetings
               // hash128  userHash
There are currently 3 ID kinds:
- IdKind == 0 :: means no ID field sent
- IdKind == 1 :: [uint32 4] ed2k ID
- IdKind == 2 :: [hash128 16] Kad ID
- IdKind == 3 :: [hash128 16] User Hash
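As a minimal sketch, the ping options byte and the optional ID field could be handled as shown here. The text gives only the field widths, so the bit positions are an assumption: the answer flag is placed in the lowest bit, IdKind in the next three bits, and the four reserved bits on top. The function names are hypothetical, and the trailing obfuscation part is returned unparsed.

```python
# Length in bytes of the ID field for each IdKind, taken from the list above.
ID_LENGTHS = {0: 0, 1: 4, 2: 16, 3: 16}


def pack_ping_options(id_kind: int, request_answer: bool) -> int:
    """Build the pOPS byte (ASSUMED layout: bit 0 = answer flag, bits 1-3 = IdKind)."""
    if id_kind not in ID_LENGTHS:
        raise ValueError("unknown IdKind")
    return (id_kind << 1) | int(request_answer)


def parse_ping(payload: bytes) -> dict:
    """Split a ping payload into flag, IdKind, ID bytes and the remaining bytes."""
    if not payload:
        # A completely empty ping: no answer requested, no ID, no extra data.
        return {"request_answer": False, "id_kind": 0, "id": b"", "rest": b""}
    pops = payload[0]
    id_kind = (pops >> 1) & 0x07
    if id_kind not in ID_LENGTHS:
        raise ValueError("unknown IdKind")
    id_len = ID_LENGTHS[id_kind]
    if len(payload) < 1 + id_len:
        raise ValueError("truncated ID field")
    return {
        "request_answer": bool(pops & 0x01),
        "id_kind": id_kind,
        "id": payload[1:1 + id_len],
        "rest": payload[1 + id_len:],  # obfuscation part, not parsed here
    }
```

A receiver would answer the ping only when `request_answer` is set.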
Callback Methods
eServer (Lugdunum)
The eserver call-back is supported since eserver 17.15; it requires both clients to be on the same server. The call-back is sent to the server from the client’s UDP socket for client-to-client communication (not from the regular one for client-to-server communication), because the server has to see the UDP port of the socket used for NatT. The packet must be obfuscated the server way, otherwise it will not be accepted. The eserver usually forwards the request from Alice to Bob over his TCP connection with the server; this may be delayed by up to 12 seconds (to send more requests in one TCP frame). If the eserver knows a very recent client UDP port of Bob (he may have issued a request to someone seconds ago), it forwards the call-back request immediately to Bob using UDP. That packet arrives on the client socket but is also obfuscated the server way (not for simplicity, but to allow Bob to verify that the packet really was sent by his server), so Bob has to look at the packet’s source IP and direct it to the server UDP socket processing function. The eserver may choose not to forward the ID of Alice to Bob. Instead it sends Alice a call-back request as if it came from Bob, telling her Bob’s UDP port that way and saving one UDP frame from Bob. Bob therefore has to check whether the requester ID is 0 and, if it is, must not issue his own call-back request to Alice after the ping fails.
The client advertises its support of NatT by setting the flag SRVCAP_NATTRAVERSAL in the server login tag CT_SERVER_FLAGS. The server announces its support through the SRV_TCPFLG_NATTRAVERSAL and SRV_UDPFLG_NATTRAVERSAL flags.
The login packet OP_IDCHANGE is also slightly modified; the “Your port” field was added:
[ID 4]
[flags 4]
[Server_tcp_port 4]             /* because in case of aux ports, UDP messages must still be sent on Server_tcp_port+4 only */
[Your IP 4]                     /* Client IP as seen by eserver */
[Server_obfuscation_TCP_port 4] /* if <>0, the TCP port where eserver listen for obfuscated connections */
[Your port 4]                   /* if eserver gave you a highID : your listening port (migh be different of what was advertized by client in LOGIN frame) */
                                /* if eserver gave you a LOWID : the tcp port of the client as seen by eserver (if NAT : the port allocated by NAT device) */
The seen port is sent in order to support the attempt, described in the introduction, to become a HighID by reusing the listening port. Here the server makes the connection attempt from a different IP than its own; this behaviour is optional.
The UDP callback request packet has the following content:
[uint32 4] Target ID (Bob's ID)
[uint32 4] Requester ID (Alice's ID)
The forwarded packets the server sends over TCP or UDP have the same content:
[uint32 4] Seen IP
[uint16 4] Seen Client UDP Port
[uint32 4] Requester ID (Alice's ID) (may be 0)
[obfuscation 17]  // optional
                 // [uint8 1] obfuscation settings
                 // [hash128 16] userhash
The rest of the client-server protocol is as usual, with one exception: Low ID clients that advertise NatT support receive in the source answer packet not only High ID sources but also Low ID sources that support NatT as well. These are to be flagged as NatT enabled.
Kad (*Unofficial*)
The Kad-based call-back is not an official Kad feature, but due to a smart design it is 100% compatible with any normal Kad node. This callback method is to be used ONLY when no other callback method is available! Support of NatT is announced in Kad by publishing the value 0xffff as TCP port. For Kad this is a valid port and could also have been set by the user by hand, but on Windows this value is usually invalid because, without a registry patch, ports above 5000 cannot be used by applications; it is therefore extremely unlikely that anyone will ever set 0xffff as port. The call-back uses the “file reask ping” (OP_REASKCALLBACKUDP) relayed by the Kad buddy (Carlo or Dave); Carlo passes it to Bob together with the seen IP and UDP port of Alice.
The OP_REASKCALLBACKUDP packet sent to Carlo (Bob's buddy) has the following content:
[hash128 16]      // Target client Kad ID (Bob)
[hash128 16]      // File Hash of the pinged file
// the part below is the unofficial mod part
[uint8 1]         // Mod Opcode (OP_NAT_CALLBACKREQUEST_KAD)
[hash128 16]      // Requesting client Kad ID (Alice)
[uint32 4]        // IP of Dave (Alice's buddy)
[uint16 2]        // UDP Port of Dave (D)
[obfuscation 17]  // optional obfuscation of Alice
                  // [uint8 1 ] obfuscation settings
                  // [hash128 16] userhash
The second [hash128] (File Hash of the pinged file) is filled with 0's so that Bob can distinguish this packet from a normal reask ping. The 0 hash is not valid as a file hash, but Carlo does not look at anything except the "Target client Kad ID", so no problems occur here.
Carlo modifies the packet slightly by replacing the first [hash128] with the IP and UDP port he saw Alice use.
The packet Bob receives looks as follows:
[uint32 4]        // IP of Alice
[uint16 2]        // UDP Port of Alice
[hash128 16]      // File Hash of the pinged file
// the part below is the unofficial mod part (unchanged)
...
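The three roles in the Kad call-back can be sketched as follows: Alice builds the request, Carlo swaps the leading target ID for the address he saw, and Bob recognises the NatT variant by the all-zero file hash. This is only an illustration under stated assumptions: little-endian integers, IPs handled as plain uint32 values, an opcode value that is a placeholder (the real value is not given in this document), and hypothetical function names.

```python
import struct

ZERO_HASH = bytes(16)
# PLACEHOLDER: the real opcode value is not specified in this document.
OP_NAT_CALLBACKREQUEST_KAD = 0x00


def build_kad_callback(target_kad_id: bytes, requester_kad_id: bytes,
                       buddy_ip: int, buddy_port: int,
                       obfuscation: bytes = b"") -> bytes:
    """Alice's request to Carlo: target ID, zero file hash, then the mod part."""
    if len(target_kad_id) != 16 or len(requester_kad_id) != 16:
        raise ValueError("Kad IDs must be 16 bytes")
    return (target_kad_id + ZERO_HASH
            + bytes([OP_NAT_CALLBACKREQUEST_KAD])
            + requester_kad_id
            + struct.pack("<IH", buddy_ip, buddy_port)  # ASSUMED little-endian
            + obfuscation)


def relay_kad_callback(packet: bytes, seen_ip: int, seen_port: int) -> bytes:
    """Carlo's step: replace the first hash128 with Alice's seen IP and UDP port."""
    if len(packet) < 16:
        raise ValueError("packet too short")
    return struct.pack("<IH", seen_ip, seen_port) + packet[16:]


def is_nat_callback(relayed: bytes) -> bool:
    """Bob's check: a zero file hash marks the packet as a NatT call-back."""
    return relayed[6:22] == ZERO_HASH
```

Carlo never has to understand the mod part; he only rewrites the first 16 bytes, which is what keeps the scheme compatible with ordinary Kad buddies.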
XS (Neo & Co)
The XS call-back is a pure mod feature designed to replace the Kad call-back whenever possible. The call-back handling looks similar to the Kad call-back.
The UDP call-back packet Alice sends to Carlo looks as follows:
[hash128 16]     // Target client user hash (Bob)
[uint8 1]        // Mod Opcode (OP_NAT_CALLBACKREQUEST_XS)
[hash128 16]     // Requesting client user hash (Alice)
[uint32 4]       // IP of Dave (Alice's XS buddy)
[uint16 2]       // UDP Port of Dave (D)
[obfuscation 1]  // optional obfuscation of Alice
                 // [uint8 1] obfuscation settings
                 // Hash is not sent here because its already sent above
Carlo modifies the packet slightly by replacing the first [hash128] with the IP and UDP port he saw Alice use.
The packet Bob receives looks as follows:
[uint32 4]       // IP of Alice
[uint16 2]       // UDP Port of Alice
[uint8 1]        // Mod Opcode (OP_NAT_CALLBACKREQUEST_XS)
// rest unchanged like above
...
Please refer to the Additional Features for NatT section for further information on the XS Buddy feature.
User Mode TCP
Due to the lack of reliability in UDP communication, it is necessary to implement a streaming protocol that can handle packet loss and retransmit missing packets. This implementation is basically a self-coded TCP with small modifications; it is called User Mode TCP and includes almost all relevant features of real TCP (such as congestion control).
Connection Begin
To establish a connection, Alice (A) sends a NAT_SYN packet to Bob (B) over UDP. Bob answers this request with a NAT_SYN_ACK; at the moment that packet is received, the connection is considered established.
A                    B
SYN          --->   OnConnect(0)
OnConnect(0) <---   SYN_ACK
OnSend(0)
     [Connection Established]
Both packets have the same content, as follows:
[uint8 1] version
[uint32 4] MAXFRAGSIZE (optional)
Note: The sequence number of sent segments starts with 1 and is incremented by one.
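A small sketch of building and parsing the NAT_SYN / NAT_SYN_ACK payload, with the optional MAXFRAGSIZE detected from the payload length. Little-endian byte order and the function names are assumptions; the opcode and any outer packet header are left out.

```python
import struct
from typing import Optional, Tuple


def build_syn(version: int, max_frag_size: Optional[int] = None) -> bytes:
    """Payload shared by NAT_SYN and NAT_SYN_ACK."""
    payload = struct.pack("<B", version)
    if max_frag_size is not None:
        payload += struct.pack("<I", max_frag_size)  # ASSUMED little-endian
    return payload


def parse_syn(payload: bytes) -> Tuple[int, Optional[int]]:
    """Return (version, MAXFRAGSIZE or None when the optional field is absent)."""
    if len(payload) == 1:
        return payload[0], None
    if len(payload) == 5:
        return payload[0], struct.unpack_from("<I", payload, 1)[0]
    raise ValueError("unexpected SYN payload length")
```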
Communication
All sent data segments carry a unique sequence number indicating their position in the data stream. When Bob (B) receives a NAT_DATA segment from Alice (A), he acknowledges it with a NAT_DATA_ACK. If Alice does not get the acknowledgement she resends the segment. She also resends it if she gets acknowledgements for 3 segments with a higher sequence number while the acknowledgement for the older segment is still missing (fast retransmission).
A                    B
             ...
DATA         --->   OnReceived(0)
OnSend(0)   <---    DATA_ACK
             ...
The NAT_DATA packet has the following format:
[uint32 4] Sequence Nr.
[Data n] (optional)
The NAT_DATA_ACK packet has the following format:
[uint32 4] Sequence Nr.
[uint32 4] Receiving Window Size
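For illustration, the two payloads could be serialised as below. Little-endian byte order and the helper names are assumptions; opcodes and outer headers are not included.

```python
import struct
from typing import Tuple


def build_data(seq: int, data: bytes = b"") -> bytes:
    """NAT_DATA payload: sequence number followed by optional data."""
    return struct.pack("<I", seq) + data


def parse_data(payload: bytes) -> Tuple[int, bytes]:
    if len(payload) < 4:
        raise ValueError("NAT_DATA payload too short")
    (seq,) = struct.unpack_from("<I", payload)
    return seq, payload[4:]


def build_data_ack(seq: int, window: int) -> bytes:
    """NAT_DATA_ACK payload: acknowledged sequence number and receive window."""
    return struct.pack("<II", seq, window)


def parse_data_ack(payload: bytes) -> Tuple[int, int]:
    if len(payload) != 8:
        raise ValueError("NAT_DATA_ACK payload must be 8 bytes")
    seq, window = struct.unpack("<II", payload)
    return seq, window
```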
Lost packets are handled as follows:
A                    B
             ...
DATA1        --->   [LOST]
DATA2        --->   
            <---    DATA_ACK2
DATA3        --->   
            <---    DATA_ACK3
DATA4        --->   
            <---    DATA_ACK4
DATA1        --->              // fast retransmission
            <---    DATA_ACK1
DATA5        --->   
            <---    DATA_ACK5
             ...
             ...
             ...
DATAn        --->   [LOST]
           TimeOut
DATAn        --->              // regular retransmission
            <---    DATA_ACKn
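The two retransmission paths in the diagram can be sketched with a small sender-side bookkeeping class. This is a simplified illustration, not the NatT implementation: the `Sender` class and its methods are hypothetical, the timeout is a fixed value instead of an adaptive one, and "sending" just appends to a list.

```python
FAST_RETRANSMIT_ACKS = 3  # from the text: 3 ACKs for higher sequence numbers


class Sender:
    def __init__(self, timeout: float = 1.0):
        self.timeout = timeout   # fixed here; the real code adapts it
        self.next_seq = 1        # sequence numbers start with 1
        self.unacked = {}        # seq -> [data, last_sent_time, higher_ack_count]
        self.wire = []           # (seq, data) tuples in the order they were sent

    def send(self, data: bytes, now: float) -> int:
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = [data, now, 0]
        self.wire.append((seq, data))
        return seq

    def _resend(self, seq: int, now: float) -> None:
        entry = self.unacked[seq]
        entry[1] = now           # restart the timeout
        entry[2] = 0             # restart the fast-retransmit counter
        self.wire.append((seq, entry[0]))

    def on_ack(self, seq: int, now: float) -> None:
        self.unacked.pop(seq, None)
        # Every ACK counts against all older segments that are still missing.
        for older in sorted(self.unacked):
            if older >= seq:
                break
            self.unacked[older][2] += 1
            if self.unacked[older][2] >= FAST_RETRANSMIT_ACKS:
                self._resend(older, now)   # fast retransmission

    def on_tick(self, now: float) -> None:
        for seq in sorted(self.unacked):
            if now - self.unacked[seq][1] >= self.timeout:
                self._resend(seq, now)     # regular retransmission
```

Replaying the diagram: after DATA1 to DATA4 are sent and ACKs for 2, 3 and 4 arrive, `on_ack` resends DATA1; a segment whose ACK never arrives is resent by `on_tick` once the timeout has elapsed.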
Alice never sends more data segments to Bob than would fit in the last advertised receiving window size of Bob. When she cannot send any data, she sends empty NAT_DATA packets with the Sequence Nr. set to 0. As soon as Bob has some space in his receiving buffer, he answers with a NAT_DATA_ACK with the Sequence Nr. set to 0 and a new window size > 0. Then Alice resumes sending data segments.
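A rough sketch of that decision on Alice's side. It assumes the window size is counted in bytes, which the text does not state explicitly, and uses hypothetical function names.

```python
from typing import Tuple


def choose_segment(next_seq: int, segment: bytes,
                   bytes_in_flight: int, peer_window: int) -> Tuple[int, bytes]:
    """Return (sequence number, data) for the next NAT_DATA packet.

    If the segment does not fit into Bob's advertised window (ASSUMED to be
    in bytes), an empty probe with sequence number 0 is returned instead.
    """
    if bytes_in_flight + len(segment) <= peer_window:
        return next_seq, segment
    return 0, b""


def may_resume(ack_seq: int, window: int) -> bool:
    """True when a NAT_DATA_ACK answers a probe and reopens the window."""
    return ack_seq == 0 and window > 0
```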
Alice uses congestion control, modelled on normal TCP implementations, to determine the timeout for data segments as well as how many segments she can send before an acknowledgement arrives. For further details please look into the NatT source code.
Connection End
To close a connection in the usual way, Alice (A) sends a NAT_FIN packet to Bob (B) and he replies with a NAT_FIN_ACK; at the moment that packet is received, the connection is considered closed.
A                    B
FIN          --->   OnClose(0)
OnClose(0)  <---    FIN_ACK
     [Connection Closed]
Both packets have no content.
To terminate an erroneous connection, Alice (A) sends a NAT_RST packet and considers the connection closed. Bob (B) does not answer this packet and also considers the connection closed at this point.
A                    B
OnClose(err)
RST          --->   OnClose(err)
     [Connection Terminated]
The packet may contain a uint32 error code that is passed to the OnClose function.
[uint32 4] ErrorCode
Additional Features for NatT
These features are recommended to be implemented together with NatT; they bring important additional functionality that is essential for NatT to work well.
Neo XS
Neo Source Exchange is an improved tag-based source exchange protocol; it saves some overhead and offers high flexibility and extensibility. It sends the IDHybrid, TCP port and user hash followed by a tag list with optional additional information such as server IP and port, Kad buddy ID, IP and port, NatT support flags, XS buddy IP and port, and obfuscation flags. This data is packed into so-called Nano Tags (see the Appendix for the Nano Tag specification).
The Source Packet is built the following way:
[hash128 16]   // file hash
[uint16 2]     // source count
// Source entries 1:
[uint32 4]     // HybridID
[uint16 2]     // TCP Port
[hash128 16]   // User Hash
[uint8 1] len  // Tag list length (in bytes *NOT* tag count)
if len == 0xff
  [uint16 2]   // long tag list length
[Nano Tag 1]
...
[Nano Tag m]
// Source entries 2:
...
// Source entries n:
...
Note: The parsing function reads the Nano Tag content length from the ID/SIZE field and is therefore able to skip unknown tags. Because the tag list length is sent instead of the tag count, it is also possible to skip totally erroneous tag lists.
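The benefit of sending the tag list length in bytes can be seen in a small parser sketch: each source entry can be stepped over without looking inside its tags. Little-endian integers and the function names are assumptions, and the tag bytes are returned raw rather than decoded.

```python
import struct


def build_source(hybrid_id: int, tcp_port: int, user_hash: bytes,
                 tags: bytes = b"") -> bytes:
    """One source entry; lengths of 0xff and above use the long form."""
    if len(user_hash) != 16:
        raise ValueError("user hash must be 16 bytes")
    if len(tags) < 0xFF:
        prefix = bytes([len(tags)])
    else:
        prefix = b"\xff" + struct.pack("<H", len(tags))
    return struct.pack("<IH", hybrid_id, tcp_port) + user_hash + prefix + tags


def parse_sources(packet: bytes):
    """Return (file_hash, [(hybrid_id, tcp_port, user_hash, raw_tags), ...])."""
    file_hash = packet[:16]
    (count,) = struct.unpack_from("<H", packet, 16)
    pos = 18
    sources = []
    for _ in range(count):
        hybrid_id, tcp_port = struct.unpack_from("<IH", packet, pos)
        user_hash = packet[pos + 6:pos + 22]
        pos += 22
        tag_len = packet[pos]
        pos += 1
        if tag_len == 0xFF:                      # long tag list length follows
            (tag_len,) = struct.unpack_from("<H", packet, pos)
            pos += 2
        raw_tags = packet[pos:pos + tag_len]
        if len(raw_tags) != tag_len:
            raise ValueError("truncated tag list")
        pos += tag_len                           # skip the list as one block
        sources.append((hybrid_id, tcp_port, user_hash, raw_tags))
    return file_hash, sources
```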
XS Callback
The XS call-back is a buddy system that works without Kad. It is used for NatT Low2Low call-backs as well as for normal Low2High call-backs, and it is preferred over the Kad call-back system. A High ID client may have more than one Low ID buddy (currently 3). A Low ID client asks every XS call-back enabled client in its list, one by one, whether it wants to become its buddy, until it gets a positive answer. To do so it sends an empty XS_BUDDY_REQ packet to the candidate. The remote client answers with XS_BUDDY_ANSWER, which contains a [uint8] result of 1 or 0. If the result is 0 (denied), the XS buddy status of this client is set to DENIDED and it is not asked again unless all known clients have this status; in that case the status is reset and they are asked again one by one. If the answer was 1 (accepted), the status is set to HIGH_BUDDY and the remote client sets the status of our Low ID client to LOW_BUDDY.
To keep the connection alive, the Low Buddy sends an empty XS_BUDDYPING packet every 10 minutes. The High Buddy answers this ping with the same empty packet; it does not ping on its own.
LowID                HighID
B_REQ         --->   SetXBS(L_B)
SetXBS(H_B)  <---    R_ANSW [1]
              ...
B_PING        --->
             <---    B_PING
              ...
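The candidate selection on the Low ID side might look roughly like this. The status names follow the text (including its spelling of DENIDED), `UNKNOWN` is an invented name for the initial state, and the dictionary-based bookkeeping and function names are purely illustrative.

```python
UNKNOWN = "UNKNOWN"        # hypothetical name for "not asked yet"
DENIDED = "DENIDED"        # spelled as in the text above
HIGH_BUDDY = "HIGH_BUDDY"


def next_candidate(status: dict):
    """Pick the next client to send XS_BUDDY_REQ to, or None if none is needed.

    `status` maps a client identifier to its XS buddy status, in list order.
    """
    if any(state == HIGH_BUDDY for state in status.values()):
        return None                      # we already have a buddy
    if status and all(state == DENIDED for state in status.values()):
        for client in status:            # everyone said no: start over
            status[client] = UNKNOWN
    for client, state in status.items():
        if state == UNKNOWN:
            return client
    return None


def on_buddy_answer(status: dict, client, result: int) -> None:
    """Apply the [uint8] result of XS_BUDDY_ANSWER (1 = accepted, 0 = denied)."""
    status[client] = HIGH_BUDDY if result == 1 else DENIDED
```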
A normal Low2High callback request looks as follows:
[hash128 16]     // Target client user hash
[uint8 1]        // Mod Opcode (OP_CALLBACKREQUEST_XS)
[uint16 2]       // Requesting clients TCP Port
[obfuscation 17] // optional obfuscation 
                 // [uint8 1] obfuscation settings
                 // [hash128 16] userhash
The packet is modified slightly by the buddy and sent as following to the Low ID client:
[uint32 4]       // IP of the requester
[uint16 2]       // UDP Port of requester
[uint8 1]        // Mod Opcode (OP_CALLBACKREQUEST_XS)
// rest unchanged like above
...
Port Reporting
This feature is intended to allow a LowID behind a Full Cone NAT to become a HighID by advertising the reused TCP port allocated by the NAT device. For this it is necessary to learn the seen TCP port from a remote client. The feature is designed similarly to the official IP Report feature.
The requesting client sends an empty PUBLICPORT_REQ packet. The remote client answers with PUBLICPORT_ANSWER, which contains only a [uint16] seen TCP port.
Appendix
Nano Tag
The Nano Tags used in Neo XS are designed to provide high flexibility at ultra-low overhead. There are 2 tag types, short and long; they are distinguished by the first bit and built the following way:
Short tag (1 to 4 data bytes):
[uint8 1] // [bit1] == 0
          // [bits2-3] // Data Len n = (0,1,2,3)+1 (in bytes)
          // [bits4-8] // Tag ID = {0-31}
[data n]  // data bytes
Long tag (5 to 255 data bytes):
[uint8 1] // [bit1] == 1
          // [bits2-8] // Tag ID = {32 - 127} note {0-31} would work but is not recommended
[uint8 1] // Data Len n = (5,...,255) (in bytes) // 0-4 would also work but is not recommended
[data n]  // data bytes
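A possible encoder and decoder for the two layouts above. The sketch assumes that "bit1" is the most significant bit of the header byte, which the text does not state, and the function names are hypothetical. Tags that do not fit the short form fall back to the long form, which the notes above allow.

```python
def encode_nano_tag(tag_id: int, data: bytes) -> bytes:
    """Encode one Nano Tag, using the short form whenever it fits."""
    n = len(data)
    if 1 <= n <= 4 and 0 <= tag_id <= 31:
        # Short: 0 | (len-1 in 2 bits) | (ID in 5 bits); ASSUMED MSB-first.
        return bytes([((n - 1) << 5) | tag_id]) + data
    if 0 <= tag_id <= 127 and n <= 255:
        # Long: 1 | (ID in 7 bits), followed by a length byte.
        return bytes([0x80 | tag_id, n]) + data
    raise ValueError("tag does not fit the Nano Tag format")


def decode_nano_tags(buf: bytes):
    """Decode a tag list into [(tag_id, data), ...] without knowing the IDs."""
    tags = []
    pos = 0
    while pos < len(buf):
        head = buf[pos]
        pos += 1
        if head & 0x80:                      # long tag
            tag_id = head & 0x7F
            if pos >= len(buf):
                raise ValueError("truncated long tag")
            n = buf[pos]
            pos += 1
        else:                                # short tag
            tag_id = head & 0x1F
            n = ((head >> 5) & 0x03) + 1
        data = buf[pos:pos + n]
        if len(data) != n:
            raise ValueError("truncated tag data")
        pos += n
        tags.append((tag_id, data))
    return tags
```

Because every tag carries its own length, a reader can skip tags with IDs it does not know and continue with the next one.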
Nano Tags currently used by Neo XS:
| NAME | TYPE | ID/SIZE | CONTENT | 
|---|---|---|---|
| NT_ServerIPPort | Long | [byte16] | [uint32][uint16] | 
| NT_BuddyID | Long | [byte16] | [hash128] | 
| NT_BuddyIPPort | Long | [byte16] | [uint32][uint16] | 
| NT_NATT | Short | [byte8] | [uint8] | 
| NT_XsBuddyIPPort | Long | [byte16] | [uint32][uint16] | 
| NT_OBFU | Short | [byte8] | [uint8] | 
Additional Modifications
Fix for eServer Connect
Because the server's UDP key is needed, the connection procedure has to be altered to first obtain the server info over UDP with OP_GLOBSERVSTATREQ. This is also useful for obfuscation, as it yields the obfuscated server port.
Fix for Bandwidth Control
The sending/receiving functions of the Client UDP Socket must not count the size of NatT Data/Ack packets; this bandwidth is already counted by the EMSocket like regular TCP traffic. Counting it on the UDP Socket would result in false values and wasted bandwidth!
NAT Types
from http://www.voip-info.org/wiki-STUN
- Full Cone: A full cone NAT is one where all requests from the same internal IP address and port are mapped to the same external IP address and port. Furthermore, any external host can send a packet to the internal host, by sending a packet to the mapped external address.
- Restricted Cone: A restricted cone NAT is one where all requests from the same internal IP address and port are mapped to the same external IP address and port. Unlike a full cone NAT, an external host (with IP address X) can send a packet to the internal host only if the internal host had previously sent a packet to IP address X.
- Port Restricted Cone: A port restricted cone NAT is like a restricted cone NAT, but the restriction includes port numbers. Specifically, an external host can send a packet, with source IP address X and source port P, to the internal host only if the internal host had previously sent a packet to IP address X and port P.
- Symmetric: A symmetric NAT is one where all requests from the same internal IP address and port, to a specific destination IP address and port, are mapped to the same external IP address and port. If the same host sends a packet with the same source address and port, but to a different destination, a different mapping is used. Furthermore, only the external host that receives a packet can send a UDP packet back to the internal host.
OpCodes
| NAME | VALUE | 
|---|---|
| OP_NAT_PING | '_' | 
| OP_NAT_SYN | '_' | 
| OP_NAT_SYN_ACK | '_' | 
| OP_NAT_DATA | '_' | 
| OP_NAT_DATA_ACK | '_' | 
| OP_NAT_FIN | '_' | 
| OP_NAT_FIN_ACK | '_' | 
| OP_NAT_RST | '_' | 
| CT_EMULE_BUDDYID | '_' | 
| CT_XS_EMULE_BUDDYIP | '_' | 
| CT_XS_EMULE_BUDDYUDP | '_' | 
| OP_NAT_CALLBACKREQUEST_KAD | '_' | 
| OP_XS_BUDDY_REQ | '_' | 
| OP_XS_BUDDYPING | '_' | 
| OP_XS_MULTICALLBACKUDP | '_' | 
| OP_XS_MULTICALLBACKTCP | '_' | 
| OP_CALLBACKREQUEST_XS | '_' | 
| OP_NAT_CALLBACKREQUEST_XS | '_' | 
| OP_PUBLICPORT_REQ | '_' | 
| OP_PUBLICPORT_ANSWER | '_' | 
