BitTorrent.org community
I believe that I may have found a serious issue with µTP.
As it currently stands, µTP does not allow repacketization. Once you've sent a packet, you've committed to either sending it unchanged to the other end, or dropping the connection. This is unlike TCP, which allows you to either split a packet when resending it, or to merge an already-sent packet with new data.
The main issue is with PMTU discovery (RFC 1191). When doing PMTU discovery, a node occasionally sends packets that are too big for the path being used; a router will then reply with a "packet too big" ICMP message (in IPv4, a Destination Unreachable message with code "fragmentation needed and DF set"), requesting that the sender send smaller packets. Reacting to the packet too big message requires repacketization.
Consider for example that your local MTU is 1500 octets (you're on Ethernet), but your PMTU to the Internet is 1480 (there's an ADSL line in the way). You send one 1500 octet packet, followed by a 500 octet one. The ADSL router sends you a "packet too big" message for the 1500 octet packet, but the 500 octet packet gets through, and is SACK-ed by the other endpoint. At this point you're stuck: you need to send a 1500 octet packet to fill the hole, but the router won't allow you to do that.
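A toy model of this scenario in Python may help. It is not µTP code: the `path_accepts` and `deliver` helpers are made up for illustration, and the only figures taken from the example are the 1480 octet path MTU and the two packet sizes.

```python
PATH_MTU = 1480  # octets, from the example above


def path_accepts(packet_len: int, path_mtu: int = PATH_MTU) -> bool:
    """A router drops any DF-marked packet larger than the path MTU."""
    return packet_len <= path_mtu


def deliver(packets: dict[int, int]) -> tuple[set[int], set[int]]:
    """Return (acked, stuck) sequence numbers.

    `packets` maps sequence number -> packet length. Without
    repacketization a dropped packet can only be resent at the same
    length, so it is dropped again on every retry.
    """
    acked = {seq for seq, length in packets.items() if path_accepts(length)}
    stuck = set(packets) - acked
    return acked, stuck


acked, stuck = deliver({1: 1500, 2: 500})
print("acked:", sorted(acked), "stuck:", sorted(stuck))
# acked: [2] stuck: [1]
```

Packet 2 is acknowledged, while packet 1 can never cross the path at its original size, which is the hole described above.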
Assuming that I am right, we'll need to think about a solution. Here are a few ideas.
1. Forcibly disable PMTU discovery on µTP sockets.
This can be done with setsockopt, using the IP_MTU_DISCOVER socket option (available on Linux, for example).
This is not a good idea, even in IPv4. See "Fragmentation Considered Harmful", by Kent and Mogul, 1995.
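For reference, a minimal Linux-only sketch of option (1) in Python. The numeric fallbacks are the Linux values from <linux/in.h>; they are spelled out because Python's socket module may not export these names on every version. The `disable_pmtu_discovery` helper is a made-up name.

```python
import socket
import sys

# Linux values from <linux/in.h>, used when the socket module lacks the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF


def disable_pmtu_discovery(sock: socket.socket) -> None:
    """Stop the kernel from setting the DF bit on this socket's packets."""
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)


if sys.platform.startswith("linux"):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        disable_pmtu_discovery(sock)
        # Read the option back to confirm the mode that is now in effect.
        print(sock.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER))
```

With DF cleared, routers may fragment oversized datagrams instead of rejecting them, which is exactly the behaviour Kent and Mogul argue against.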
2. Limit UDP datagram payload to 1024 octets.
This fits in 1280-octet packets, which is the minimum MTU in IPv6. Note that anything more than that will fail in a very common case -- Teredo uses an MTU of 1280.
This is not a good idea. We definitely want BitTorrent to be able to use the full MTU on high-speed links.
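The arithmetic behind option (2) can be checked with a few lines of Python. The header sizes are the standard ones for IPv4 and IPv6 without options or extension headers; the helper name is made up.

```python
IPV4_HEADER = 20  # octets, no IP options
IPV6_HEADER = 40  # octets, no extension headers
UDP_HEADER = 8    # octets


def max_udp_payload(mtu: int, ip_header: int) -> int:
    """Largest UDP payload that fits in one unfragmented packet of size `mtu`."""
    return mtu - ip_header - UDP_HEADER


print(max_udp_payload(1280, IPV6_HEADER))  # 1232, so 1024 octets fit
print(max_udp_payload(1500, IPV4_HEADER))  # 1472 on plain Ethernet over IPv4
print(round(1024 / max_udp_payload(1500, IPV4_HEADER), 3))  # 0.696
```

A 1024 octet payload therefore uses only about 70% of what an Ethernet path could carry per datagram, which is the cost referred to above.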
3. Extend the µTP protocol to allow for application-layer fragmentation.
Add a new fragment message to µTP that allows sending a single µTP message in multiple UDP datagrams, and thus perform application-layer fragmentation.
This might be a good idea, but it complicates the protocol, and makes an incompatible change.
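A rough sketch of what application-layer fragmentation could look like, in Python. The fragment header used here (16-bit message id, 8-bit fragment index, 8-bit fragment count) is purely hypothetical and is not part of µTP; `fragment` and `reassemble` are made-up names, and `reassemble` assumes all datagrams passed in belong to the same message.

```python
import struct
from typing import Dict, List, Optional

# Hypothetical fragment header: message id, fragment index, fragment count.
FRAG_HEADER = struct.Struct("!HBB")


def fragment(msg_id: int, payload: bytes, max_datagram: int) -> List[bytes]:
    """Split one message into datagrams of at most `max_datagram` octets."""
    chunk = max_datagram - FRAG_HEADER.size
    if chunk <= 0:
        raise ValueError("max_datagram too small for the fragment header")
    pieces = [payload[i:i + chunk] for i in range(0, len(payload), chunk)] or [b""]
    if len(pieces) > 255:
        raise ValueError("too many fragments for an 8-bit count")
    return [FRAG_HEADER.pack(msg_id, i, len(pieces)) + p
            for i, p in enumerate(pieces)]


def reassemble(datagrams: List[bytes]) -> Optional[bytes]:
    """Return the payload once every fragment is present, else None."""
    parts: Dict[int, bytes] = {}
    count = None
    for d in datagrams:
        _msg_id, index, count = FRAG_HEADER.unpack_from(d)
        parts[index] = d[FRAG_HEADER.size:]
    if count is None or len(parts) != count:
        return None
    return b"".join(parts[i] for i in range(count))


message = bytes(1472)
frags = fragment(7, message, max_datagram=1000)
print(len(frags), [len(f) for f in frags])  # 2 [1000, 480]
print(reassemble(list(reversed(frags))) == message)  # True
```

Because the receiver reassembles by index, fragments can arrive in any order, and a message that was too big for the path can be resent in smaller datagrams without changing what is being acknowledged.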
4. Don't perform PMTU discovery on resent packets.
Send each packet initially with PMTU discovery enabled, and resend it with PMTU discovery disabled.
This has the same cost as (1) above, but only on resent packets. It complicates the code. Additionally, toggling PMTU discovery on and off repeatedly might have some unpleasant effects on the performance of the system (it might, for example, flush the system's PMTU cache).
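A minimal Linux-only sketch of option (4) in Python, toggling the discovery mode before each send. The numeric fallbacks are the Linux values from <linux/in.h>, and `send_packet` is a made-up helper, not an existing µTP function.

```python
import socket
import sys

# Linux values from <linux/in.h>, used when the socket module lacks the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)      # always set DF


def send_packet(sock: socket.socket, data: bytes, addr, retransmit: bool) -> int:
    """Send with DF set on the first attempt and cleared on a resend."""
    mode = IP_PMTUDISC_DONT if retransmit else IP_PMTUDISC_DO
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, mode)
    return sock.sendto(data, addr)


if sys.platform.startswith("linux"):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2)
        send_packet(tx, b"first attempt", rx.getsockname(), retransmit=False)
        send_packet(tx, b"resend", rx.getsockname(), retransmit=True)
        print(rx.recv(2048), rx.recv(2048))
```

This costs one extra setsockopt call per send as written; caching the last mode would avoid the redundant calls, but it would not address the concern about the system's PMTU cache.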
5. Use (3) with new implementations, and (4) with old ones.
Negotiate the use of a protocol extension at connection establishment, then use either (3) or (4). Complicates the code enormously, has all the flaws of (3) and (4), except for the lack of compatibility with deployed implementations.
--Juliusz
Thinking about it some more, I'm coming to the conclusion that my favourite is
6. Redesign µTP incompatibly.
µTP is a new protocol. We might as well get it right (with byte-range ACKs).
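To illustrate why byte-range ACKs make repacketization possible, here is a small Python sketch of the sender-side bookkeeping. All names and the payload sizes (1400 and 400 octets, then a reduced maximum of 1380) are hypothetical; this is not a proposed wire format.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open byte range [start, end)


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Coalesce overlapping or adjacent byte ranges."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_ranges(acked: List[Range], total: int) -> List[Range]:
    """Byte ranges in [0, total) that no ACK covers yet."""
    gaps, pos = [], 0
    for start, end in merge_ranges(acked):
        if start > pos:
            gaps.append((pos, start))
        pos = max(pos, end)
    if pos < total:
        gaps.append((pos, total))
    return gaps


def resegment(gaps: List[Range], max_segment: int) -> List[Range]:
    """Cut the missing ranges into segments no longer than `max_segment`."""
    return [(s, min(s + max_segment, end))
            for start, end in gaps
            for s in range(start, end, max_segment)]


# Bytes [0, 1400) were lost in an oversized packet; [1400, 1800) were acked.
gaps = missing_ranges([(1400, 1800)], total=1800)
print(gaps)                   # [(0, 1400)]
print(resegment(gaps, 1380))  # [(0, 1380), (1380, 1400)]
```

The hole is filled with two smaller segments, something a packet-numbered ACK scheme cannot express because the receiver only knows packet boundaries.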
--Juliusz
You only get the ICMP packet too big message if you set the don't-fragment (DF) bit, right? That bit is not set by default.
The main advantage of acking packets instead of bytes is that the EACK message can be very small and still ack many more packets than a TCP SACK can (within reasonable space). It also keeps certain things simple: there is no need to worry about overlapping ranges, for instance. Before allowing packets with arbitrary byte ranges, I would explore all alternatives for solving the PMTU discovery problem.
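The size argument can be made concrete with a bitmask sketch in Python. The layout is an assumption for illustration, loosely modelled on a selective-ack bitmask in which bit i stands for packet ack_nr + 2 + i (packet ack_nr + 1 is presumed missing, otherwise ack_nr would have advanced); the exact bit and byte order of the real µTP extension may differ. Sequence numbers are treated as 16-bit values.

```python
from typing import Iterable, Set


def encode_eack(ack_nr: int, received: Iterable[int], mask_bytes: int = 4) -> bytes:
    """Build a bitmask in which bit i means packet ack_nr + 2 + i arrived."""
    mask = bytearray(mask_bytes)
    for seq in received:
        offset = (seq - ack_nr - 2) & 0xFFFF  # 16-bit sequence numbers wrap
        if offset < mask_bytes * 8:
            mask[offset // 8] |= 1 << (offset % 8)
    return bytes(mask)


def decode_eack(ack_nr: int, mask: bytes) -> Set[int]:
    """Recover the set of selectively acked sequence numbers."""
    return {(ack_nr + 2 + i) & 0xFFFF
            for i in range(len(mask) * 8)
            if mask[i // 8] >> (i % 8) & 1}


mask = encode_eack(100, {102, 103, 105, 133})
print(mask.hex())                      # 0b000080
print(sorted(decode_eack(100, mask)))  # [102, 103, 105, 133]
```

Four octets cover a window of 32 packets regardless of how many holes it contains, whereas each TCP SACK block costs 8 octets and at most four blocks fit in the TCP option space.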
Even though IP fragmentation should be avoided for performance reasons, it seems like it could be a reasonable solution to the scenario you describe, while the PMTU is still being figured out.
Also, keep in mind that uTP starts off using 300 byte packets, and increases the packet size gradually if the serialization delay appears to be below a certain threshold. In uTorrent, a connection does not increase its packet size more often than once every 10 seconds. This allows plenty of time to first probe the new packet size before switching to it.
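The ramp-up described above might be sketched like this in Python. Only the 300 byte starting size and the 10 second minimum interval come from the description; the ceiling, step and delay threshold are invented placeholder values, and `PacketSizer` is not uTorrent's actual code.

```python
class PacketSizer:
    """Grow the packet size slowly while serialization delay stays low."""

    def __init__(self, start: int = 300, ceiling: int = 1400, step: int = 100,
                 min_interval: float = 10.0, delay_threshold: float = 0.06):
        self.size = start
        self.ceiling = ceiling                  # assumed upper bound, bytes
        self.step = step                        # assumed increment, bytes
        self.min_interval = min_interval        # seconds between increases
        self.delay_threshold = delay_threshold  # assumed threshold, seconds
        self.last_increase = 0.0

    def update(self, now: float, serialization_delay: float) -> int:
        """Return the packet size to use at time `now` (seconds)."""
        if (self.size < self.ceiling
                and serialization_delay < self.delay_threshold
                and now - self.last_increase >= self.min_interval):
            self.size = min(self.size + self.step, self.ceiling)
            self.last_increase = now
        return self.size


sizer = PacketSizer()
for now, delay in [(0.0, 0.01), (10.0, 0.01), (15.0, 0.01), (20.0, 0.5), (21.0, 0.01)]:
    print(now, sizer.update(now, delay))
```

In this sketch the size goes 300, 400, 400, 400, 500: an increase is skipped when less than 10 seconds have passed or when the measured delay is above the threshold.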
You only get the ICMP packet too big message if you set the don't-fragment (DF) bit, right? That bit is not set by default.
It is set by default on all modern systems. In the case of Windows, it is set in XP, 2003, 2008 and Vista.
--Juliusz
Solution:
(1) When the client gets a "packet too big" message, reduce the packet size.
(2) If a connection becomes dysfunctional due to a skipped packet, just let it drop and re-establish.
Done.
If a connection becomes dysfunctional due to a skipped packet, just let it drop and re-establish.
This is the sort of workaround that is commonly used with a deficient legacy protocol before it is fixed. I wouldn't expect such a hack to be needed with a new protocol.
--Juliusz
jch wrote:
If a connection becomes dysfunctional due to a skipped packet, just let it drop and re-establish.
This is the sort of workaround that is commonly used with a deficient legacy protocol before it is fixed. I wouldn't expect such a hack to be needed with a new protocol.
It isn't, because IP can do fragmentation.