BitTorrent DNA allows the web site to specify a qos parameter, which is interpreted as: the peer downloads from the web server whenever it is receiving below this rate. It specifies no upper bound on how quickly one can download from peers. Since it is implemented on the client side, it can be used with any HTTP server that supports range requests. (It uses the GetRight style.)
This can of course be gamed by the client; the only protection against that is configuring the server to rate limit. It is admittedly a compromise.
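For concreteness, the client-side fallback can be as simple as the following sketch (Python; the qos threshold, measure_peer_rate(), and pick_needed_range() are placeholders I made up, not part of any spec):

import urllib.request

QOS_RATE = 50 * 1024  # bytes/sec; below this we fall back to the web server (illustrative)

def fetch_range(url, start, length):
    # Ask a plain HTTP server for one chunk via a range request.
    req = urllib.request.Request(url)
    req.add_header("Range", "bytes=%d-%d" % (start, start + length - 1))
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honoured the range request
        if resp.status != 206:
            raise IOError("server ignored the range request")
        return resp.read()

def maybe_use_web_seed(url, measure_peer_rate, pick_needed_range):
    # Only hit the web server when peers deliver below the qos rate;
    # there is no upper bound on what we take from peers.
    if measure_peer_rate() < QOS_RATE:
        start, length = pick_needed_range()
        return fetch_range(url, start, length)
    return None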
Slight rewording of 1: GET requests SHOULD be greater than 16 KB when possible. There is no advantage to introducing overhead by requesting small chunks.
I am less convinced about 2. A single TCP connection is quite fragile with respect to loss. Look at some of Thinh Nguyen's work on exploiting path diversity from six or seven years ago. If there are multiple servers with mostly disjoint paths to the client, the client can increase robustness to transient loss and bad routing hiccups by opening a connection to each server. I am not sure I want to recommend multiple connections on a single server-client path, even though it is common for web browsers; of course, browsers typically operate in the regime of many small transfers.
Stats can be sanitized: infohashes, IPs, and peer IDs remapped to unique IDs.
In addition to the stats mentioned by Bram, I think we will also have to report piece size and number of pieces in the torrent. Even if infohashes were not sanitized, there is no guarantee we could obtain the info dicts for the infohashes in the log.
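A sketch of what I mean by sanitizing, for concreteness (Python; the record field names are just illustrative):

import itertools
from collections import defaultdict

def make_remapper():
    # Each distinct value gets an opaque sequential id the first time it is seen.
    counter = itertools.count()
    table = defaultdict(lambda: next(counter))
    return lambda value: table[value]

remap_infohash = make_remapper()
remap_ip = make_remapper()
remap_peer_id = make_remapper()

def sanitize(record):
    return {
        "infohash": remap_infohash(record["infohash"]),
        "ip": remap_ip(record["ip"]),
        "peer_id": remap_peer_id(record["peer_id"]),
        "event": record["event"],  # non-identifying fields pass through unchanged
    }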
From your explanation, I believe the most important point is
Regulating bandwidth usage in the script is straightforward and requires no access to the http daemon's configuration files.
A good implementation solves the inefficiency problems with range requests. With the Hoffman style (Shad0w's), the client implementation may be easier at the expense of implementation complexity on the server. The many-small-files problem is not the typical case. We can argue these points, but "cheap hosting won't provide you with access to server configuration files while often still allowing PHP [or perl]" is an inarguable advantage of the Hoffman-style approach, just as "doesn't require server-side scripts" is an inarguable advantage of the GetRight-style approach.
I am not against accepting BOTH.
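For the record, "regulating bandwidth usage in the script" amounts to something like the following sketch (Python here for brevity; a PHP or perl handler would follow the same pattern; MAX_RATE and CHUNK are made-up values):

import sys
import time

MAX_RATE = 100 * 1024   # bytes per second allowed for this response (illustrative)
CHUNK = 16 * 1024       # send in 16 KB pieces

def send_throttled(path, out=sys.stdout.buffer):
    # Stream the file in fixed-size chunks, sleeping after each one so the
    # average rate stays at MAX_RATE; no httpd configuration is touched.
    with open(path, "rb") as f:
        while True:
            data = f.read(CHUNK)
            if not data:
                break
            out.write(data)
            out.flush()
            time.sleep(len(data) / MAX_RATE)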
6. in a future version the peer ID should be moved to the connect packet. it is repeated unnecessarily in the announce packet.
To which Olaf replied:
If you move it to the connect request, either the peer ID isn't available when you process the announce request or you have to store the peer ID
Why is it necessary to keep the peer ID around? The nonce (connection ID) can uniquely identify the requester.
Still, as Denis suggested, this issue should be punted to a future version.
1. ... the connect has at least the magic id to detect protocol changes. next version could be 0x0001041727101980 instead of 0x0000041727101980.
+1. This is better than relying on a datagram length change to signal an update, and it gives us versioning with some limited extension capability (reserved bits, etc.).
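Something like this is what I have in mind for the tracker side (a sketch only; 0x0001041727101980 is the hypothetical next-version value from the quoted message, nothing standardized):

import struct

MAGIC_V0 = 0x0000041727101980
MAGIC_V1 = 0x0001041727101980  # hypothetical future revision

def parse_connect(packet):
    if len(packet) < 16:
        raise ValueError("connect packet too short")
    magic, action, transaction_id = struct.unpack_from("!QII", packet)
    if magic == MAGIC_V0:
        version = 0
    elif magic == MAGIC_V1:
        version = 1
    else:
        raise ValueError("not a connect request")
    return version, action, transaction_id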
2. I think the client should NOT try again after three tries ... 3. add an error-id to the error packet
+1. Error codes obviate the three retries.
The BEP advises using exponential back-off.
+1 (already in BEP)
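For completeness, the back-off in the BEP (retransmit after 15 * 2^n seconds with n capped at 8, if I read it right) looks roughly like this; send_request and wait_for_response are placeholders:

import socket

def announce_with_backoff(send_request, wait_for_response, max_n=8):
    # Retransmit with 15 * 2^n second timeouts; give up once n exceeds max_n.
    for n in range(max_n + 1):
        send_request()
        try:
            return wait_for_response(timeout=15 * 2 ** n)
        except socket.timeout:
            continue
    raise socket.timeout("tracker did not respond")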
4...completed downloads also in the announce response...
This is not as clearly motivated as the other changes, and why only the completes? If you intend to allocate resources between swarms based on tracker scrapes, then the number of seeds and downloaders would be quite useful; and if you need these values to decide which swarm to join, then you don't want to conflate the announce and the scrape.
I'm not sure it's necessary to have an error code as well.
Hmmm... There is wide variation in how people write error messages. A standardized error code is more precise and less fragile: error messages may change from release to release and differ between implementations. Standardized error codes enable automated responses, and they let humans recognize the same error across trackers, across clients, and across web searches for related information.
Those are broad generalizations, but usually correct. So the conversation should start from "there should be error codes," and then someone can convince us that there shouldn't be.
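To make the proposal concrete, the error response could carry a numeric code ahead of the human-readable message, roughly like this (the code values here are purely illustrative; nothing is standardized yet):

import struct

ACTION_ERROR = 3
ERR_INVALID_CONNECTION_ID = 1   # illustrative
ERR_TORRENT_NOT_REGISTERED = 2  # illustrative

def build_error(transaction_id, code, message):
    return struct.pack("!III", ACTION_ERROR, transaction_id, code) + message.encode()

def parse_error(packet):
    if len(packet) < 12:
        raise ValueError("error packet too short")
    action, transaction_id, code = struct.unpack_from("!III", packet)
    return transaction_id, code, packet[12:].decode(errors="replace")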
2 minute time-to-live for the connection id
That should read 1 minute. Excuse me.
If you hash the source IP address with a secret and keep that secret persistent, nothing happens.
Ok. Salted hash (I also mentioned this). To which Olaf responded with motivation for changing the salt periodically. I have no doubt that salts will change for one reason or another, and if they do, the client and tracker can get "out of sync."
The BEP currently specifies only the 1-minute TTL, which I would like to remove. Waiting for three failures works as long as connection IDs are typically reused far more than three times, though waiting for three failures is needless if we use error codes.
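For concreteness, the salted-hash idea looks roughly like this: derive the connection ID from the source IP, a persistent secret, and a coarse time bucket, so the tracker keeps no per-client state and a reboot with the same secret loses nothing (the bucket length and secret are made-up values):

import hashlib
import hmac
import time

SECRET = b"tracker-secret"      # persistent across reboots, not per-request
BUCKET_SECONDS = 3600           # how long an id stays valid (illustrative)

def connection_id(ip, when=None):
    bucket = int((when or time.time()) // BUCKET_SECONDS)
    mac = hmac.new(SECRET, b"%s|%d" % (ip.encode(), bucket), hashlib.sha1)
    return int.from_bytes(mac.digest()[:8], "big")

def is_valid(ip, cid):
    # Accept the current and the previous bucket so ids don't expire mid-exchange.
    now = time.time()
    return cid in (connection_id(ip, now), connection_id(ip, now - BUCKET_SECONDS))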
While it is possible to encrypt tracker traffic via https, and thepiratebay guys consider sniffing ISPs a huge problem for their U.S. users, the UDP protocol provides no such protection at all.
We covered this before in the context of lightweight encryption for TCP trackers. If someone identifies a real attack then we should probably start from here:
BitTorrent uses UDP for DHT, LSD, and UDP tracking.
I assume you refer to UDP for file transport. BitTorrent DNA uses UDP, and it is being introduced into uTorrent. It not only improves NAT traversal but also gives us the freedom to change the congestion controller... DNA uses a "less than best-effort" congestion controller so that BitTorrent acts as a background file transfer mechanism.
We are working out how to use UDP as a canary to back off BitTorrent TCP connections when talking to legacy peers.
I'm sorry. My writing about BEPs was far afield from what I should've been saying.
Localization has been a hot topic in the IETF and in these forums. Please skim through this thread:
We would like to include localization mechanisms in uTorrent, but mainly to minimize inter-ISP transit while retaining enough connections outside the ISP to aid piece propagation and, when possible, to work around bad ISP policy. Clearly localization would help performance for very large swarms, i.e., on the scale where multiple peers can be found on the same LAN or campus. I am still wondering whether it makes a difference for the swarms we see today (usually fewer than tens of thousands of peers).
BTW, do you have good research tools that you could contribute to the community? If so, please post about them in
Why not? You do have the peer IDs.
After several reads, I noticed only that there is no peer ID in the connect; I missed the peer ID in the announce. With no peer ID in the connect, more than one client behind a NAT creates potential ambiguity, but that doesn't happen often enough to thwart identifying bad implementations.
I retract my request to add peer IDs to the connect. Also ignore my comment about LTEP; that seems like too much of a change for too little advantage. If necessary, we can add an extension mechanism later and trigger it based on datagram length.
we can not identify mis-implementing clients in udp.
This is important. We should have some client ID in the connect. It could be "SUGGESTED" for the client to send but "REQUIRED" for the tracker to accept. This would provide backward compatibility.
Unfortunately the existing BitTorrent protocol has no clean way to specify a client ID; instead we have the mess that is the peer ID. I am not sure of the best way to do this. Maybe append an LTEP-style bencoded dict to the end of the connect message (see LTEP, http://www.bittorrent.org/beps/bep_0010.html)?
LTEP would solve extensibility and client ID in one step.
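Roughly what I have in mind (a sketch only; the bdecode function is assumed to come from whatever bencode module you already use, and the "v" key just mirrors LTEP's handshake; none of this is in the current BEP):

import struct

CONNECT_FIXED_LEN = 16  # magic (8) + action (4) + transaction id (4)

def parse_connect_with_tail(packet, bdecode):
    magic, action, transaction_id = struct.unpack_from("!QII", packet)
    client_id = None
    if len(packet) > CONNECT_FIXED_LEN:
        # Extra bytes are treated as a bencoded dict carrying a client id.
        tail = bdecode(packet[CONNECT_FIXED_LEN:])
        client_id = tail.get(b"v")  # e.g. b"uTorrent 1.8" (illustrative)
    return transaction_id, client_id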
I think adding protocol features would be cool
If a UDP tracker reboots, how does it handle the loss of connection ID state?
There is an error message in the protocol, but the message includes no response code. If connection ID state is lost, we should have a standardized response code that tells the client it needs to obtain a new connection ID. Ummm... Is my interpretation correct?
I don't like the standardized 2-minute time-to-live for the connection ID. If we allow longer-lived connection IDs, the overhead of simply returning an error when a connection ID is no longer valid, thereby forcing the client to refresh it, would be small. It wouldn't require changing any message in the protocol except the error message, and it would work with existing clients that update more often... although adding an error code to the error message would mess up error messages on old clients. Is that acceptable?
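On the client side that would look roughly like this (a sketch; the tracker object and its is_stale_connection_id_error helper are placeholders, since no such error code exists yet):

def announce(tracker, request):
    cid = tracker.get_connection_id()           # cached from an earlier connect
    response = tracker.send_announce(cid, request)
    if tracker.is_stale_connection_id_error(response):
        cid = tracker.connect()                 # obtain a fresh connection id
        response = tracker.send_announce(cid, request)
    return response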
3 GB per month. You are likely to hit that limit if you just watch a few TV shows online per week.
I would bet that BitTorrent does count toward that limit.
I'm curious. Who imposes such a low limit? Tell me it isn't being inflicted somewhere by monopoly control over broadband Internet access.
we (Az) are using autoudp in addition to udp:// support,
The overhead and latency introduced by autoudp for non-UDP trackers annoy me, but they are not large. I personally would do as jlouis suggests and try HTTP first, but there is an easy way to resolve this impasse: leave the order up to the client implementor.
Auto-udp requires no new messages and thus could be specified as an implementation suggestion to ease adoption.
There should be some explicit way of specifying UDP support; udp:// + multitracker appears satisfactory.
Olaf, could you describe both in the BEP?
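For illustration, the udp:// + multitracker option is just BEP 12's announce-list with the UDP tracker in the first tier and the HTTP tracker as a fallback (hostnames made up):

# Illustrative metainfo fragment, expressed as a Python dict for readability.
metainfo_trackers = {
    "announce": "http://tracker.example.com/announce",
    "announce-list": [
        ["udp://tracker.example.com:6969/announce"],
        ["http://tracker.example.com/announce"],
    ],
}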
From the BEP:
Do not expect packets to be exactly of a certain size. Future extensions could increase the size of packets.
If I understand the text correctly (reading a bit between the lines), extensibility is supported by changing the size of the payload in the UDP datagram used for the handshake, which may then lead to packet changes elsewhere.
This is a bit odd. To disambiguate when multiple extensions are present, the first thing we would add is some other extension mechanism (e.g., bitfield, extension headers, bencoded key-value pairs).
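In code, the BEP's advice just means validating "at least this many bytes" rather than an exact size, e.g. for the connect response (field offsets follow the current packet layout; extra trailing bytes belong to a future extension and are ignored):

import struct

def parse_connect_response(packet):
    if len(packet) < 16:                 # action (4) + transaction id (4) + connection id (8)
        raise ValueError("truncated connect response")
    action, transaction_id = struct.unpack_from("!II", packet)
    connection_id = struct.unpack_from("!Q", packet, 8)[0]
    # Any bytes beyond offset 16 are ignored by this version of the parser.
    return action, transaction_id, connection_id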