The Fast Extension is implemented in later Python versions of BitTorrent Mainline and partially in uTorrent. uTorrent implements all messages in the Fast Extension but only half of the semantics of the Allowed Fast part of the Fast Extension. uTorrent requests fast pieces, but it will not offer them.
See http://www.bittorrent.org/beps/bep_0006.html
The purpose of the Allowed Fast part of the Fast Extension is to reduce the time it takes for peers to ramp up download rates in the swarm by having existing peers send a few pieces "free" without requiring the peer to be unchoked.
Among developers at BitTorrent, Inc., it is believed that Allowed Fast might significantly improve the speed at which peers ramp up, but little has been done to characterize the effects of Allowed Fast either in the wild or in large-scale simulation.
We call for academics to study this problem or propose modifications that might improve BitTorrent's ramp-up times. This is of particular importance for smaller files that currently gain little advantage from BitTorrent distribution. Until some experience is gained into the benefits or drawbacks of the Allowed Fast Extension, it will not progress beyond Draft status.
Note: as suggested by theshad0w, this could also be used to implement superseeding more dynamically.
Az dev
Note: as suggested by theshad0w, this could also be used to implement superseeding more dynamically.
You can see his implementation notes for details.
- dak180
BaenCD Torrents
I asked John Hoffman (shad0w) to put the suggest-based superseeding in a BEP.
This is one of the more important and least commented on BEPs, so I'll give my thoughts on it.
This is really a combination of four different extensions, so before going over the extensions themselves I'll go over the philosophy of extension generally, then go over the individual extensions.
Support for fast extensions is specified with a single reserved bit. Other developers have overwhelmingly indicated that they prefer declarations of support to be based on strings rather than reserved bits, so I will reluctantly say that the preferred style for adding extensions in the future should be via those mechanisms, and that maybe an alternative way of declaring support for the fast extensions via such a mechanism should be added as well. However, I would like to emphasize that such string-based methods don't lessen the burden of maintaining semantic compatibility between implementations, which is particularly tricky in BitTorrent because it has more subtle semantics than most networking protocols, nor does it help with the problem of maintaining a reasonably consistent set of supported extensions across different implementations, which can only be achieved by sticking with extensions which are clearly a good idea and which are made as simple to implement as possible. The benefit of string-based declarations in practice is very narrowly limited to avoiding namespace collisions in experimental extensions, something it does very well, but it's important to realize that that's the only thing it really helps with.
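For reference, here is a minimal sketch of what the single reserved bit looks like in code. BEP 6 puts it at the third least significant bit of the last reserved handshake byte (`reserved[7] |= 0x04`); the function names below are hypothetical and not taken from any client.

```python
FAST_EXTENSION_BIT = 0x04  # BEP 6: reserved[7] |= 0x04


def set_fast_extension(reserved):
    """Return a copy of the 8 reserved handshake bytes with the fast bit set."""
    if len(reserved) != 8:
        raise ValueError("reserved field must be exactly 8 bytes")
    out = bytearray(reserved)
    out[7] |= FAST_EXTENSION_BIT
    return bytes(out)


def supports_fast_extension(reserved):
    """True if the peer's reserved bytes advertise the fast extension."""
    if len(reserved) != 8:
        raise ValueError("reserved field must be exactly 8 bytes")
    return bool(reserved[7] & FAST_EXTENSION_BIT)
```

Both sides have to set the bit before either may use the new messages, so a client would presumably check the peer's reserved bytes once, right after the handshake.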
The four extensions are: state machine reworking, have all, suggest piece, and allowed fast. All of these are clearly improvements to the protocol which can be trivially supported (or at least null supported, which I'll explain later), so there is little reason not to do them except for implementation difficulty. To that end, it could be that support for have all and suggest piece is being held back by being lumped in with the implementation difficulty of the state machine reworking, which could be an argument for allowing support for them to be declared separately, but it seems dubious to add a complex set of logic for the declaration of support for what should eventually be a ubiquitous set of functionality, much like how non-support for the 'compact' extension is considered outright malbehavior today. The allowed fast extension requires the state machine reworking to even make sense, so supporting it separately would be nonsensical.
Okay, that's enough philosophy.
The state machine reworking is the most fundamental of the changes, so I'll start with that.
Classic BitTorrent protocol is, to put it simply, buggy. Perhaps 'buggy' is a bit too strong a word, because the issue was understood and accepted since inception, but the fact remains that the fast extension state machine is unambiguously better than the classic one. Given that it's fairly trivial to write a backwards compatibility layer to allow a client which is written for the fast extension state machine to talk to one which isn't (you basically emulate the bug, which I'll explain in a bit), any new codebase written today should clearly be written with the new state machine from the ground up, and existing codebases should be rejiggered to work the same way. Perhaps the protocol specification should be modified to specify using compact and the fast extensions as normal behavior, and support for not using them as a backwards compatibility feature (or in the case of compact, simply something not to be done).
The bug in the classic state machine has to do with what happens when a choke message arrives. As you may recall, at that time any pending requests are assumed to have failed. The problem is that they may not have. If data transfers are happening bidirectionally then it's easy for chokes and requests to be so backlogged behind data that a choke and a later unchoke arrive while a request is still in transit. It's also surprisingly easy to have implementation quirks in choking logic which cause a choke to be sent followed almost instantly by an unchoke. Whatever the cause, the result is that data is sent which isn't expected, and compounding the problem is that the downloading side will generally re-request the exact same data which is already in transit, causing an outright waste of bandwidth. Similar problems plague usage of cancels, because it's hard to guess whether a cancel actually worked. There are tricks for mitigating these problems, but they're hackish, complicated, and unreliable. The new state machine fixes these problems the 'right' way, and results in much cleaner, more reliable code.
So, how is the bug fixed? With the fast extensions, the choke message loses its semantic meaning, and becomes largely advisory. Instead of an implicit rejection of pending requests, there is an explicit rejection of pending requests, implemented with the new reject message. Every request message must now get exactly one response, either a payload carrying message or a reject. A simple way to implement this on the uploading side is after sending a choke to immediately send rejects for all requests you've gotten but haven't responded to so far, although a nice heuristic is to wait a few seconds before actually doing the rejecting, in case you decide to unchoke and don't wish to incur the latency (and potential throughput) issues of requiring re-requests. Likewise any requests while choking can be immediately responded to with rejects, but if you're being clever you can wait a few seconds before sending the rejects. Note that sending the rejects before the choke is clearly broken behavior. Although choke and unchoke are technically advisory, choke meaning 'if you send a request now I'll probably reject it' and unchoke the opposite, they're semantically weighty enough that they need to be taken seriously. For example, having logic to send requests while choked can easily result in a short circuit of re-requesting, and likewise for sending rejects while unchoking. There may be situations in which those behaviors are reasonable (for example requesting pieces from the allowed fast set, described below) but such things should only be done very carefully, with great consideration of their possible consequences.
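The "exactly one response per request" rule on the uploading side can be sketched roughly as follows. This is an illustration only: the class and method names are made up, messages are modelled as tuples instead of wire bytes, and the "wait a few seconds before rejecting" heuristic and the allowed fast exception are both left out to keep it short.

```python
class Uploader:
    """Tracks requests from one peer so that each gets exactly one response."""

    def __init__(self):
        self.choking = True
        self.pending = []  # requests received but not yet answered, in order
        self.outbox = []   # messages queued for the wire, in order

    def on_request(self, index, begin, length):
        if self.choking:
            # Requests that arrive while choking are answered with a reject.
            self.outbox.append(("reject", index, begin, length))
        else:
            self.pending.append((index, begin, length))

    def unchoke(self):
        if self.choking:
            self.choking = False
            self.outbox.append(("unchoke",))

    def choke(self):
        if self.choking:
            return
        self.choking = True
        # The choke goes out first, then one reject per unanswered request.
        self.outbox.append(("choke",))
        for req in self.pending:
            self.outbox.append(("reject",) + req)
        self.pending.clear()

    def serve_next(self):
        """Answer the oldest pending request with a (stand-in) piece message."""
        if self.pending:
            self.outbox.append(("piece",) + self.pending.pop(0))
```

A real implementation would carry the block payload in the piece message and would probably delay the rejects on a timer, as described above.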
Please be aware that the fast extensions can help expose bugs in your codebase. If you implement fast extensions and it causes exposure of some bad bugs in your code, don't blame fast extensions, fix your damn code. For example, bugs which cause requests to simply be ignored forever tend to be sort of covered up with the classic state machine because there's almost always a choke eventually, and that causes the request to be implicitly dropped, thus accidentally syncing up the two sides. I know of at least one case of a BitTorrent implementation which intentionally ignored requests if there were 'too many' of them. That's extremely broken behavior, don't do it. If the other side really does send so many requests that you think they could only possibly be doing that as a result of malbehavior (like, in the thousands or tens of thousands; less than that and it uses up so little memory that worrying about it is stupid) then you should close the entire connection.
The best way to implement backwards compatibility support for the fast extensions is with a bug emulation layer. Basically, write your higher level code so that it speaks the fast extensions natively, then for connections which go to non-fast peers, have lower level code which keeps track of the state of that connection and sends rejects for all outstanding requests after chokes and throws out piece messages which don't have outstanding requests. Technically speaking, this may harm some extant hackish workaround code which can make use of piece messages for unexpected pieces, so it may be not unreasonable to maintain support for those by not throwing unexpected messages in the bit-bucket but rather by sending back the message with an indication that it came from a classic peer so there's no need to close the connection as a result of the unexpected data, but really the goal here should be to make all peers speak the fast extensions and trying to maintain baroque code to deal with that edge case sounds like more trouble than it's worth.
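A rough sketch of such a bug emulation layer, for the downloading side only. The names (`ClassicPeerShim`, `deliver`) are hypothetical, messages are tuples, and cancels and the actual socket I/O are omitted.

```python
class ClassicPeerShim:
    """Makes a classic (non-fast) peer look fast-capable to higher-level code."""

    def __init__(self, deliver):
        self.deliver = deliver    # callback into the fast-native upper layer
        self.outstanding = set()  # (index, begin, length) requests in flight

    def send_request(self, index, begin, length):
        # Real code would also write the request to the socket here.
        self.outstanding.add((index, begin, length))

    def on_choke(self):
        # A classic choke implicitly drops every pending request, so
        # synthesize the rejects a fast peer would have sent explicitly.
        self.deliver(("choke",))
        for req in sorted(self.outstanding):
            self.deliver(("reject",) + req)
        self.outstanding.clear()

    def on_piece(self, index, begin, data):
        key = (index, begin, len(data))
        if key not in self.outstanding:
            return  # unexpected data from a classic peer: throw it out
        self.outstanding.remove(key)
        self.deliver(("piece", index, begin, data))
```

The upper layer then only ever sees the fast extension contract: every request it issued ends in either a piece or a reject.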
Phew. Okay, now on to the easier extensions.
Have all/none is a straightforward and unambiguously good extension. It can improve startup times a lot: it turns out that all those bitfields can really add to the time it takes to start some real data transfers, and under some circumstances the bitfields make up a significant fraction of all data sent in a swarm, especially when there are a lot of seeds. I expect this to be uncontroversial, with the most likely complaint being that it's lumped in with the state machine reworking.
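To make the size difference concrete, here is a small sketch of choosing the first message to send after the handshake. The message IDs are the ones from BEP 6 (Have All `0x0E`, Have None `0x0F`) and BEP 3 (bitfield `5`); the function name is made up for illustration.

```python
import struct

HAVE_ALL, HAVE_NONE, BITFIELD = 0x0E, 0x0F, 0x05


def encode_initial_have(have):
    """Encode the post-handshake 'what I have' message for a fast peer.

    `have` is a sequence of booleans, one per piece.
    """
    if have and all(have):
        return struct.pack(">IB", 1, HAVE_ALL)
    if not any(have):
        return struct.pack(">IB", 1, HAVE_NONE)
    # Otherwise fall back to a classic bitfield, high bit first.
    bits = bytearray((len(have) + 7) // 8)
    for i, h in enumerate(have):
        if h:
            bits[i // 8] |= 0x80 >> (i % 8)
    return struct.pack(">IB", 1 + len(bits), BITFIELD) + bytes(bits)
```

For a 1000-piece torrent a seed sends 5 bytes instead of the 130 bytes a bitfield message would take, and the saving grows linearly with the piece count.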
Allowed fast is a clearly good extension, with the very tangible benefit that it can speed up time to begin transfers for new peers, especially for underseeded torrents. It may sound like there's some potential implementation difficulty, but that's really a non-issue. A 'proper' implementation isn't all that hard, and if you're at all intimidated by it then a perfectly reasonable stepping stone is to 'implement' allowed fast by never sending any allowed fast messages and ignoring any allowed fast messages which you get. That won't cause any problems.
Suggest pieces is actually the most semantically ambiguous of the extensions. The problem is that it's unclear how strong a suggestion suggest is, and it's also unclear whether it should persist indefinitely. Rather than trying to come up with a more complicated set of semantics to support such functionality, such as a retract suggestion message, we instead decided to simply roll out a least common denominator suggest pieces message, on the assumption that reasonable conventional semantics for it would get settled on over time. I believe that uTorrent's behavior is to go with rarest first for piece selection, but if there's a tie to pick one in the suggested set first as a tiebreak. I believe that this is a reasonable behavior, but am curious to hear others' thoughts on the matter.
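The tiebreak rule described above might look something like this. It is only a sketch of the idea, not uTorrent's actual code; the function and parameter names are invented, and the final tiebreak on piece index is an extra assumption added to make the result deterministic.

```python
def pick_piece(availability, wanted, suggested):
    """Rarest first, preferring suggested pieces among equally rare ones.

    availability: dict mapping piece index -> number of peers that have it
    wanted:       set of pieces we still need and this peer can supply
    suggested:    set of pieces this peer has sent suggest messages for
    """
    if not wanted:
        return None
    # Sort key: rarity first, then "not suggested" (False sorts before True),
    # then the piece index as an arbitrary but stable last resort.
    return min(wanted, key=lambda p: (availability[p], p not in suggested, p))
```

With this ordering a suggestion never overrides rarity; it only matters when two candidate pieces are equally rare.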
but it seems dubious to add a complex set of logic for the declaration of support
actually, the "complex set of logic" is already there in clients that speak LTEP/AZMP. For us it's actually more of a hassle to do bitfield magic than just adding another message via the extension protocols.
And i must say you're overestimating the impact of the fast extensions. Yes, they do work. Yes, they are a cleaner design. But the workarounds accumulated over the years basically make "classic" bittorrent perform just as well except in edge cases. This lowers the incentive to open the can of worms that is implementing the fast extensions while maintaining legacy support.
What i do like about the new state machine is that it introduces fail-fast behavior, which forces devs to be more strict about their implementation.
Last edited by The 8472 (2008-08-25 13:57:22)
Az dev
actually, the "complex set of logic" is already there in clients that speak LTEP/AZMP. For us it's actually more of a hassle to do bitfield magic than just adding another message via the extension protocols.
I'm okay with these being introduced via an extended extension mechanism, the complex logic *I* was referring to was mix and matching different new features, a more general concern I just posted about on the LTEP thread.
And i must say you're overestimating the impact of the fast extensions. Yes, they do work. Yes, they are a cleaner design. But the workarounds accumulated over the years basically make "classic" bittorrent perform just as well except in edge cases. This lowers the incentive to open the can of worms that is implementing the fast extensions while maintaining legacy support.
Those edge cases really do happen. Some heavily seeded swarms spend 15% of their bandwidth on have bitfields. And bandwidth wastage from resends really does happen. When we went and applied the fast extensions to uTorrent, we were surprised how much they cleaned up and how much easier everything was to deal with afterwards. Those work-arounds you refer to are quite clunky and difficult to maintain as new features are added.
Allowed fast in particular is a very new feature, not just a cleanup. In principle one could do a hack-around to the classic state machine to make it support allowed fast, but it's just plain easier to implement by doing it right.
What i do like about the new state machine is that it introduces fail-fast behavior, which forces devs to be more strict about their implementation.
Yes, some of the biggest benefits are to developer time.
A minor point of clarification - in BEP 6 the signature of REJECT is given as
*Reject Request*: <len=0x000D><op=0x10><index><begin><offset>
should <offset> not read <length> ?
Yes.
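With that correction the reject message has the same payload layout as a request. A minimal encoding sketch using Python's `struct` module (the function names are illustrative only):

```python
import struct

REJECT_REQUEST = 0x10
_REJECT = struct.Struct(">IBIII")  # len, op, index, begin, length


def encode_reject(index, begin, length):
    """Build <len=0x000D><op=0x10><index><begin><length>."""
    return _REJECT.pack(13, REJECT_REQUEST, index, begin, length)


def decode_reject(msg):
    """Parse a complete reject message, including its 4-byte length prefix."""
    size, op, index, begin, length = _REJECT.unpack(msg)
    if size != 13 or op != REJECT_REQUEST:
        raise ValueError("not a reject request message")
    return index, begin, length
```

The length prefix is 13 because it covers the one-byte opcode plus three 4-byte big-endian integers.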
We're considering adding a full-blown allowed fast implementation as a test. To test it, we're considering having the behavior that each peer flips a coin to determine whether to do allowed fast downloading. Some extra data is added to tracker reporting to tell the tracker whether the given peer is doing allowed fast downloading, and how long it took to get transfers ramped up, most likely by reporting how long it took to reach some threshold of uploading and downloading. Trackers could then collect that data and use it to generate histograms of ramp-up times for peers using and not using allowed fast, for various swarms in various stages of their life cycle.
So the questions are:
Would anybody mind if we conduct this test on publicly deployed clients? (I'm guessing the answer is 'no', since it doesn't actually make clients behave much if at all outside the bounds of what they normally do.)
Would any trackers be willing to give us stats reported to them, so we could get data on this test? (Presumably yes, but I'd like to hear volunteers.)
Stats can be sanitized: infohashes, ips, and peer ids remapped to unique ids.
In addition to the stats mentioned by Bram, I think we will also have to report piece size and number of pieces in the torrent. Even if infohashes were not sanitized, there is no guarantee we could obtain the info dicts for the infohashes in the log.
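As a sketch of the kind of sanitizing meant here: each distinct infohash, IP, and peer id is replaced by a small integer assigned in order of first appearance, while other fields such as piece counts pass through untouched. The record layout and field names are assumptions made for the example, not an actual tracker log format.

```python
class Remapper:
    """Maps each distinct value to a small integer, in first-seen order."""

    def __init__(self):
        self._ids = {}

    def __call__(self, value):
        return self._ids.setdefault(value, len(self._ids))


def sanitize(records, fields=("infohash", "ip", "peer_id")):
    """Return copies of `records` with the sensitive fields remapped."""
    remap = {field: Remapper() for field in fields}
    cleaned = []
    for rec in records:
        clean = dict(rec)
        for field, mapper in remap.items():
            clean[field] = mapper(rec[field])
        cleaned.append(clean)
    return cleaned
```

Because the mapping is consistent within one log, ramp-up times can still be grouped per swarm and per peer without exposing the original identifiers.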
A minor typo in the BEP: In the `Allowed Fast Set Generation' section, the step numbers in the C++ implementation are out of sync with the step numbers in the canonical algorithm.
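For comparison, here is the canonical algorithm transcribed into Python, with the BEP's step numbers as comments. This is a sketch from my reading of the pseudocode, IPv4 only; the function name and the guard against `k` exceeding the piece count are additions.

```python
import hashlib
import socket
import struct


def allowed_fast_set(ip, infohash, num_pieces, k):
    """Generate the allowed fast set for a peer at IPv4 address `ip`."""
    if k > num_pieces:
        raise ValueError("k cannot exceed the number of pieces")
    (addr,) = struct.unpack(">I", socket.inet_aton(ip))
    x = struct.pack(">I", addr & 0xFFFFFF00)  # (1) mask off the low byte
    x += infohash                             # (2) append the infohash
    allowed = []
    while len(allowed) < k:
        x = hashlib.sha1(x).digest()          # (3) rehash
        for i in range(5):                    # (4) five 4-byte words per hash
            if len(allowed) >= k:
                break
            j = i * 4                         # (5)
            (y,) = struct.unpack(">I", x[j:j + 4])  # (6) big-endian integer
            index = y % num_pieces            # (7)
            if index not in allowed:          # (8)
                allowed.append(index)         # (9)
    return allowed
```

Since the low byte of the address is masked off, every peer in the same /24 gets the same set for a given torrent, which is what keeps a peer from collecting extra free pieces by hopping between nearby addresses.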
After crawling through the forum again i re-read the whole thread. After some thinking i've come to the following conclusion:
The fast extensions are a cleaner design and basically should have been part of bittorrent from day 0 on. They weren't, thus adding them into a legacy system is work since you have to maintain the old and new logic, which can be bug-prone. Another legacy system (in my eyes) is the basic bittorrent peer-peer protocol with its reserved bitfield on which the fast extensions build.
Personally i'm avoiding anything related to the bitfield like pestilence; if the fast extensions were implemented on top of LTEP/AZMP i'd be more willing to implement them.
Az dev