Gnutella Development Forum - File Repository / changeset
| author | Arne Babenhauserheide <bab@draketo.de> |
| date | Thu Mar 20 15:21:22 2008 +0100 |
| changeset 11 | 3b3c4d5c4aa2 |
| parent 10 | 3ffa979505af |
| child 12 | 3cc9d58d89c7 |
Added all files from rfc-gnutella.sf.net .
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/.hgtags Thu Mar 20 15:21:22 2008 +0100
@@ -0,0 +1,1 @@
+3ffa979505af4cdf6d7ab4908634e56d06bf664e drak1
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rfc-gnutella/draft.txt Thu Mar 20 15:21:22 2008 +0100
@@ -0,0 +1,2223 @@
+test
+Network Working Group                                       T. Klingberg
+Request for Comments: NNNN                                   R. Manfredi
+Category: Informational                                        June 2002
+
+
+
+                              Gnutella 0.6
+
+
+Status of this Memo
+
+   This is a draft.
+
+
+Copyright Notice
+
+   Copyright (C) 2002, Tor Klingberg & Raphael Manfredi
+   All Rights Reserved.
+
+   Permission is granted to make verbatim copies of this document,
+   provided the Copyright Notice is preserved.
+
+
+Rights of Free Implementation
+
+   The authors of the various proposals that make up this document
+   grant the rights to anyone to freely implement those proposals.
+   Gnutella is an Open Protocol, where the specifications are
+   public and free of any patent.
+
+
+Table of Contents
+
+   1  Introduction
+      1.1  Purpose
+      1.2  Requirements
+      1.3  Terminology
+      1.4  Extending the protocol
+   2  Protocol Definition
+      2.1  Initiating a Connection
+      2.2  Gnutella Messages
+         2.2.1  Message Header
+         2.2.2  Ping (0x00)
+         2.2.3  Pong (0x01)
+         2.2.4  Use of Ping and Pong messages
+            2.2.4.1  A simple pong caching scheme
+            2.2.4.2  Other pong caching schemes
+         2.2.5  Query (0x80)
+         2.2.6  Query Hit
+         2.2.7  Use of Query and Query Hit
+            2.2.7.1  Forwarding and routing of Query and Query Hit messages
+            2.2.7.2  When and how to send new Query messages.
+            2.2.7.3  When and how to respond with Query Hit messages.
+         2.2.8  Push (0x40)
+         2.2.9  Bye (0x02)
+      2.3  GGEP Extension blocks
+         2.3.1  GGEP Format
+         2.3.2  Creating Extension IDs
+   3  Protocol Usage
+      3.1  Flow Control
+      3.2  Network Structure
+         3.2.1  Ultrapeer system
+         3.2.2  Query Routing Protocol (unfinished)
+   4  File Transfer
+      4.1  Normal File Transfer
+      4.2  Firewalled servents
+      4.3  Busy Servents
+      4.4  Sharing
+   5  Security Considerations
+      5.1  Threats against individual Gnutella participants
+      5.2  Threats against the Gnutella network
+      5.3  Threats against third parties
+   6  Credits
+   Appendix 1  HUGE (Hash/URN Gnutella Extensions)
+   Appendix 2  XML
+   Appendix 3  Finding a Gnutella host
+   Appendix 4  When to open or accept
+               new Gnutella connections
+   Appendix 5  Gnutella network traffic compression
+
+
+1 Introduction
+
+1.1 Purpose
+
+Gnutella is a decentralized peer-to-peer system. It allows the
+participants to share resources from their system for others to
+see and get, and to locate resources shared by others on the network.
+
+Resources can be anything: mappings to other resources, cryptographic
+keys, files of any type, meta-information on keyable resources, etc.
+However, the semantics for locating and handling resources other than
+plain files are not specified in this document.
+
+Each participant launches a Gnutella program, which will seek out
+other Gnutella nodes to connect to. This set of connected nodes
+carries the Gnutella traffic, which is essentially made up of queries,
+replies to those queries, and also other control messages to
+facilitate the discovery of other nodes.
+
+Users interact with the nodes by supplying them with the list of
+resources they wish to share on the network. They can enter searches
+for others' resources, will hopefully get results from those
+searches, and can then select resources from among the results: if
+those resources are files, for instance, they can download them. But
+one can imagine other types of resources that, once fetched, will
+bring more than their content value.
+
+Resource data exchanges between nodes are negotiated using the
+standard HTTP protocol. The Gnutella network is only used to locate
+the nodes sharing those resources.
+
+This document is intended for readers with a fair knowledge of
+network programming, but does not require any previous Gnutella
+experience. Still, other implementations of this protocol will give
+useful information about implementation techniques that is not
+included in this document.
+A list of Gnutella programs can be found
+at http://www.gnutelliums.com
+
+
+1.2 Requirements
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in RFC 2119 [34].
+
+
+1.2.1 The Gnutella Development Forum (the GDF)
+
+The Gnutella Development Forum is a good place to find more Gnutella
+documentation and proposals for changes and extensions, and to discuss
+Gnutella development with other developers. The message archive is
+also a good source for information about the protocol and its
+implementation. Some of the links in this document require
+membership in the Gnutella Development Forum. Everyone is, of course,
+allowed to become a member. The GDF is located at
+http://groups.yahoo.com/group/the_gdf
+
+There are many other forums for discussing Gnutella development as
+well.
+
+
+1.3 Terminology
+
+Servent     A program participating in the Gnutella network is
+            called a servent. The words "peer", "node" and "host"
+            have similar meanings, but refer to a network
+            participant rather than a program. When a servent
+            has a clear client or server role the words "client"
+            or "server" may be used. The word "client" is
+            sometimes used as a synonym for servent. The word
+            "servent" is a contraction of "SERVer" and "cliENT".
+            Some other documents use the word "servant" instead
+            of servent.
+
+Message     Messages are the entity in which information is
+            transmitted over the network. Sometimes the word
+            "packet" is used with the same meaning. Some other
+            documents use the word "descriptor".
+
+GUID        Globally Unique IDentifier. This is a 16-byte long
+            value made of random bytes, whose purpose is to
+            identify servents and messages.
+            This identification is not a signature, just a way to
+            identify network entities in a unique manner.
+
+1.4 Extending the protocol
+
+This document is the definition of the Gnutella 0.6 protocol.
+Servents MAY extend the protocol or even change parts of it (for
+example by compressing or encrypting the messages), but servents
+MUST always stay compatible with servents that follow this
+specification.
+
+If a servent, for example, wants to compress the Gnutella messages,
+it MUST first make sure, during the handshake, that the remote host
+of a connection can decompress the stream, and otherwise leave the
+messages uncompressed. Servents MAY choose not to accept a connection
+with a servent that does not support a feature, but MUST always make
+sure that the Gnutella network is not split into separate networks.
+
+Separate networks for special purposes are, of course, allowed, but
+such a network is no longer the Gnutella network.
+
+This protocol also allows for extensions inside many messages. Such
+extensions can pass through servents that do not know about the
+extension to reach servents that do.
+
+
+2 Protocol Definition
+
+The Gnutella protocol defines the way in which servents communicate
+over the network. It consists of a set of messages used for
+communicating data between servents and a set of rules governing
+the inter-servent exchange of messages. Currently, the following
+messages are defined:
+
+Ping        Used to actively discover hosts on the network. A
+            servent receiving a Ping message is expected to
+            respond with one or more Pong messages.
+
+Pong        The response to a Ping. Includes the address of a
+            connected Gnutella servent, the listening port of
+            that servent, and information regarding the amount
+            of data it is making available to the network.
+
+Query       The primary mechanism for searching the distributed
+            network. A servent receiving a Query message will
+            respond with a Query Hit if a match is found against
+            its local data set.
+
+QueryHit    The response to a Query.
+            This message provides the
+            recipient with enough information to acquire the data
+            matching the corresponding Query.
+
+Push        A mechanism that allows a firewalled servent to
+            contribute file-based data to the network.
+
+Bye         An optional message used to inform the remote host
+            that you are closing the connection, and your reason
+            for doing so.
+
+
+2.1 Initiating a Connection
+
+A Gnutella servent connects itself to the network by establishing a
+connection with another servent currently on the network.
+Techniques for finding the first host are described in Appendix 3.
+Once the first connection is established, the addresses of more hosts
+will be supplied over the network. The default Gnutella port is 6346,
+but servents MAY use any unused port. If the desired port is in use
+(probably by another Gnutella servent) the servent SHOULD attempt to
+listen on another port. This listening port is advertised by the
+servent through the Pong messages.
+
+Techniques and rules for selecting which other Gnutella hosts to
+connect to and when to accept connection requests can be found in
+Appendix 4.
+
+Once the address of another servent on the network is obtained, a
+TCP/IP connection to the servent is created, and a handshaking
+sequence is initiated. The client is the host initiating the
+connection and the server is the host receiving it. "<cr>" refers
+to ASCII character 13 (carriage return), and "<lf>" to 10 (line feed).
+
+   1. The client establishes a TCP connection with the server.
+   2. The client sends "GNUTELLA CONNECT/0.6<cr><lf>".
+   3. The client sends all capability headers--except for
+      vendor-specific headers--each terminated by "<cr><lf>", with
+      an extra "<cr><lf>" at the end.
+   4. The server responds with "GNUTELLA/0.6 200 <string><cr><lf>".
+      <string> SHOULD be "OK", but servents SHOULD just look for the
+      "200" code.
+   5. The server sends all its headers, in the same format as in (3).
+   6.
+      The client sends "GNUTELLA/0.6 200 OK<cr><lf>", as in (4), if,
+      after parsing the server's headers, it still wishes to connect.
+      Otherwise, it needs to reply with an error code and close the
+      connection.
+   7. The client sends any vendor-specific headers as needed, in the
+      same format as (3).
+   8. Both client and server send binary messages at will, using the
+      information gained in (3) and (5).
+
+All headers SHOULD be registered with the GDF database at
+http://groups.yahoo.com/group/the_gdf/database?method=reportRows&tbl=9
+(Requires GDF membership)
+
+Headers follow the standards described in RFC 822 and RFC 2616. Each
+header is made of a field name, followed by a colon, and then the
+value. Each line ends with the <cr><lf> sequence, and the end of the
+headers is marked by a single <cr><lf> line. Each line normally
+starts a new header, unless it begins with a space or a horizontal
+tab (ASCII codes 32 and 9 in decimal, respectively), in which case it
+continues the preceding header line. The extra spaces and tabs may
+be collapsed into a single space as far as the header value goes.
+For instance:
+
+    First-Field: this is the value of the first field<cr><lf>
+    Second-Field: this is the value<cr><lf>
+        of the<cr><lf>
+        second field<cr><lf>
+    <cr><lf>
+
+The header above is made of two fields, "First-Field" and
+"Second-Field", whose values are respectively "this is the value of
+the first field" and "this is the value of the second field" (leading
+spaces of the continuation were collapsed). Note that the leading
+space between the ":" ending the field name and the start of the
+value string does not count.
+
+Multiple header lines with the same field name are identical to one
+header line where all the values of the fields would be separated by
+",". This means:
+
+    Field: first<cr><lf>
+    Field: second<cr><lf>
+
+is strictly equivalent to saying:
+
+    Field: first,second<cr><lf>
+
+In other words, order matters in that case.
+
+Here is a sample interaction between a client and a server.
+Data sent from client to server is shown on the left; data sent from
+server to client is shown on the right.
+
+    Client                          Server
+    -----------------------------------------------------------
+    GNUTELLA CONNECT/0.6<cr><lf>
+    User-Agent: BearShare/1.0<cr><lf>
+    Pong-Caching: 0.1<cr><lf>
+    GGEP: 0.5<cr><lf>
+    <cr><lf>
+                                    GNUTELLA/0.6 200 OK<cr><lf>
+                                    User-Agent: BearShare/1.0<cr><lf>
+                                    Pong-Caching: 0.1<cr><lf>
+                                    GGEP: 0.5<cr><lf>
+                                    Private-Data: 5ef89a<cr><lf>
+                                    <cr><lf>
+    GNUTELLA/0.6 200 OK<cr><lf>
+    Private-Data: a04fce<cr><lf>
+    <cr><lf>
+
+    [binary messages]               [binary messages]
+
+A few notes about the responses: first, the client SHOULD disconnect
+if it receives any response other than "200" at step 4, and the
+server SHOULD do likewise at step 6. There is no need to define these
+error codes now. Second, servents SHOULD ignore higher version
+numbers in steps (2), (4), and (6). For example, it is perfectly
+legal for a future client to connect to a server and send
+"GNUTELLA CONNECT/0.7". The server SHOULD respond with
+"GNUTELLA/0.7 200 OK" if it supports the 0.7 protocol, or
+"GNUTELLA/0.6 200 OK" otherwise.
+
+A few notes about the headers: servents SHOULD use standard HTTP
+headers whenever appropriate. For example, servents SHOULD use the
+standard "User-Agent" header rather than make up a "Servent-Vendor"
+header. However, it is perfectly legal to add new headers (e.g.,
+"Query-Routing") when no appropriate HTTP header exists, as long as
+they follow HTTP syntax. Headers unknown to the servent MUST be
+ignored.
+
+Some older servents will initiate the handshake by sending
+"GNUTELLA CONNECT/0.4<lf><lf>". The server SHOULD then reply with
+"GNUTELLA OK<lf><lf>" followed by binary messages, if it can accept
+the connection. Servents MAY retry using the 0.4 connect string if
+the 0.6 connection attempt was rejected. No handshaking headers can
+be used in 0.4 handshaking.
+
+When rejecting a connection, a servent MUST, if possible, provide the
+remote host with a list of other Gnutella hosts, so it can try
+connecting to them.
+This SHOULD be done using the X-Try header.
+
+An X-Try header can look like:
+
+    X-Try:1.2.3.4:1234,3.4.5.6:3456
+
+There MAY be a space after the colon and after each comma. There MAY
+be multiple X-Try headers in one header set. The header MAY end with
+an extra comma. The header MAY be formatted on several lines using
+continuations.
+
+Each item in the X-Try header gives the IP address of a servent
+and its listening port number. This is sometimes referred to as
+being a "connection pong". If the server sending the X-Try
+implements Pong-Caching, then the connection pongs being sent must be
+fresh ones.
+
+The normal status code for rejecting a connection because the servent
+is busy is "503 " followed by "Busy" or another description string.
+
+
+2.2 Gnutella Messages
+
+Once a servent has connected successfully to the network, it
+communicates with other servents by sending and receiving Gnutella
+protocol messages. Each message is preceded by a Message Header with
+the byte structure given below.
+
+Note 1: One IP packet may contain several Gnutella messages, and
+one Gnutella message may be split across multiple IP packets. This
+means one can never assume a Gnutella message ends when the chunk of
+data read from the socket ends.
+
+Note 2: All fields in the following structures are in little-endian
+byte order unless otherwise specified.
+
+Note 3: All IP addresses in the following structures are in IPv4
+format. For example, the IPv4 byte array
+
+    0xD0     0x11     0x32     0x04
+    byte 0   byte 1   byte 2   byte 3
+
+represents the dotted address 208.17.50.4.
+
+
+2.2.1 Message Header
+
+The message header is 23 bytes divided into the following fields.
+
+    Bytes:  Description:
+    0-15    Message ID/GUID (Globally Unique ID)
+    16      Payload Type
+    17      TTL (Time To Live)
+    18      Hops
+    19-22   Payload Length
+
+Message ID  A 16-byte string (GUID) uniquely identifying the
+            message on the network.
+
+            Servents SHOULD store all 1's (0xff) in byte 8 of the
+            GUID. (Bytes are numbered 0-15, inclusive.)
+            This serves to tag the GUID as being from a modern
+            servent.
+
+            Servents SHOULD initially store all 0's in byte 15 of
+            the GUID. This is reserved for future use.
+
+            The other bytes SHOULD have random values.
+
+Payload     Indicates the type of message
+Type        0x00 = Ping
+            0x01 = Pong
+            0x02 = Bye
+            0x40 = Push
+            0x80 = Query
+            0x81 = Query Hit
+
+            Other Gnutella messages can be used, but if so the
+            servent MUST first make sure that the remote host
+            supports this new message type. This can be done
+            using handshaking headers.
+
+TTL         Time To Live. The number of times the message
+            will be forwarded by Gnutella servents before it is
+            removed from the network. Each servent will decrement
+            the TTL before passing it on to another servent. When
+            the TTL reaches 0, the message will no longer be
+            forwarded (and MUST NOT be).
+
+Hops        The number of times the message has been forwarded.
+            As a message is passed from servent to servent, the
+            TTL and Hops fields of the header must satisfy the
+            following condition:
+            TTL(0) = TTL(i) + Hops(i)
+            Where TTL(i) and Hops(i) are the values of the TTL and
+            Hops fields of the message, and TTL(0) is the maximum
+            number of hops a message will travel (usually 7).
+
+Payload     The length of the message immediately following
+Length      this header. The next message header is located
+            exactly this number of bytes from the end of this
+            header, i.e. there are no gaps or pad bytes in the
+            Gnutella data stream. Messages SHOULD NOT be larger
+            than 4 kB.
+
+The Payload Length field is the only reliable way for a servent to
+find the beginning of the next message in the input stream.
+Therefore, servents SHOULD rigorously validate the Payload Length
+field for each message received.
+If a servent gets out of sync with its input stream, it SHOULD close
+the connection associated with the stream since the upstream servent
+is either generating, or forwarding, invalid messages.
+
+Abuse of the TTL field in broadcast messages (Query) will lead to
+an unnecessary amount of network traffic and poor network
+performance. Therefore, servents SHOULD carefully check the TTL
+fields of received query messages and lower them as necessary.
+Assuming the servent's maximum admissible Query message life is 7
+hops, then if TTL + Hops > 7, TTL SHOULD be decreased so that
+TTL + Hops = 7. Broadcast messages with very high TTL values (>15)
+SHOULD be dropped.
+
+Immediately following the message header is a payload consisting
+of one of the following messages.
+
+
+2.2.2 Ping (0x00)
+
+Ping messages MAY contain a GGEP extension block (see Section 2.3),
+but no other payload.
+
+
+2.2.3 Pong (0x01)
+
+Pong messages contain information about a Gnutella host. The
+message has the following fields:
+
+    Bytes:  Description:
+    0-1     Port number. The port number on which the responding
+            host can accept incoming connections.
+    2-5     IP Address. The IP address of the responding host.
+            Note: This field is in big-endian format.
+    6-9     Number of shared files. The number of files that the
+            servent with the given IP address and port is sharing
+            on the network.
+    10-13   Number of kilobytes shared. The number of kilobytes
+            of data that the servent with the given IP address and
+            port is sharing on the network.
+    14-     OPTIONAL GGEP extension block. (see Section 2.3)
+
+Pong messages are only sent in response to an incoming Ping
+message. It is valid for more than one Pong message to be sent in
+response to a single Ping message.
+This enables host caches to send
+cached servent address information in response to a Ping request.
+
+The Message ID of a Pong message MUST be the Message ID of the Ping
+message it is sent in reply to.
+
+The fields specifying the number of shared files and the number of
+kilobytes shared were intended to allow one to measure the amount of
+data available on the network. With a very large Gnutella network,
+and minimized Ping and Pong message traffic, this can no longer be
+done. Still, these fields SHOULD be filled out correctly.
+
+
+2.2.4 Use of Ping and Pong messages
+
+In early versions of Gnutella, Ping messages were broadcast over the
+network. Pong messages were then routed back to the originator of
+the Ping message the same way as Query Hit messages are routed
+(see Section 2.2.7). That system consumed a lot of network bandwidth,
+so modern Gnutella servents cache Pong messages, or use other means
+of minimizing the bandwidth used by Ping and Pong messages.
+
+There are different systems for handling Ping and Pong messages,
+but what they have in common is:
+
+  * When a Ping message is received (TTL>1 and at least one second
+    has passed since another Ping was received on that connection),
+    a servent MUST, if possible, respond with a number of Pong
+    Messages. These pongs MUST have the same message ID as the
+    incoming ping, a
