forum.bittorrent.org

BitTorrent.org community

Announcement

Forums are closed. Use the new mailing list! https://groups.google.com/a/bittorrent.com/forum/#!forum/bt-developers

#1 2012-08-19 13:41:34

sg
Member

Extension for exchanging webseed urls

In August 2012 the Internet Archive started making millions of legally archived files available to the public via BitTorrent. Previously the files were only available as regular HTTP downloads. This will not be the last archive to make its content accessible through the BitTorrent protocol to save bandwidth and distribution costs.

Those archives rely on at least one intact copy of each file being available somewhere. If the archive is destroyed, all of its files vanish as well. BitTorrent's core strength is distributing copies. If another party wants to mirror an archive, so that a disaster is avoided if either one goes down, its only option is to run a regular client. For a handful of files this may be a good solution, but there is no good way to mirror millions of files.

As of now, webseed URLs have to be hardcoded into the torrent file itself, which means it is not possible for a client to discover an additional mirror that was set up after the torrent file was created. Without creating a new torrent there is no way to add new webseed urls to a torrent file.

This is where the extension for exchanging webseed URLs comes into play. Just as clients exchange additional trackers they know about, they would exchange additional webseed URLs with each other. This way a client could announce new webseed URLs without changing the original torrent.

This ensures that other archives can participate in the original swarm as webseeds without introducing new torrents themselves.
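
A minimal sketch of what such a message payload could look like, assuming it is modeled loosely on the bencoded dictionary with an "added" list that BEP-28 uses for trackers. The `build_wex_message` helper, the use of an `added` key for webseeds and the example URL are assumptions for illustration only; none of this is part of an accepted BEP.

```python
def bencode(value) -> bytes:
    """Minimal bencoder for ints, strings/bytes, lists and dicts."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_wex_message(added_urls) -> bytes:
    """Hypothetical payload: the webseed URLs a client wants to announce."""
    return bencode({"added": list(added_urls)})


# The mirror URL is made up for this example.
payload = build_wex_message(["http://mirror.example.org/archive/"])
print(payload)
```

A real specification would also have to say how the message is registered in the extension handshake and how often it may be sent; the sketch only covers the payload encoding.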

I want to gather thoughts on this proposal and start writing an official BEP if this addition seems useful.

#2 2012-08-19 15:50:59

DreadWingKnight
CBTT/BNBT Developer

Re: Extension for exchanging webseed urls

I've been in previous discussions for this.

Every time it gets shot down due to DDoS concerns.


Guy with a few torrent programs under his belt.

#3 2012-08-19 20:12:54

sg
Member

Re: Extension for exchanging webseed urls

Did I understand you correctly that passing URLs around via clients could become a source of DDoS attacks against the server mentioned in the URL?

If so, shouldn't BEP-28 (Tracker Exchange) have the same weakness? Clients would start announcing to a non-tracker HTTP server, with the same effect. I would say tracker exchange is usable for the same attack, yet it was still adopted.

The only counter against a possible DDoS attack from the BEP-28 point of view is:

Trackers discovered through this protocol SHOULD be treated with a certain amount of suspicion. Since the source of a tracker exchange message cannot be trusted, an implementation SHOULD have a lower number of retries before giving up entirely.
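
One way to read that recommendation, applied to webseed URLs, is a smaller failure budget for URLs learned from peers than for URLs found in the torrent file. The sketch below assumes two sources, "torrent" and "exchange", and made-up limits of 5 and 1 failures; BEP-28 does not prescribe any specific numbers.

```python
from dataclasses import dataclass

# Assumed limits, chosen only for illustration.
MAX_FAILURES = {"torrent": 5, "exchange": 1}


@dataclass
class UrlEntry:
    url: str
    source: str        # "torrent" (from the .torrent file) or "exchange" (from a peer)
    failures: int = 0

    def record_failure(self) -> None:
        """Count one failed request against this URL."""
        self.failures += 1

    @property
    def abandoned(self) -> bool:
        """True once the URL has used up the failure budget for its source."""
        return self.failures >= MAX_FAILURES[self.source]


trusted = UrlEntry("http://archive.example.org/file.bin", "torrent")
gossiped = UrlEntry("http://unknown.example.net/file.bin", "exchange")
for entry in (trusted, gossiped):
    entry.record_failure()

print(trusted.abandoned, gossiped.abandoned)
```

With these assumed limits, a single failed request is enough to drop a URL that came from a peer, while a URL from the torrent file is still retried.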

Are there any other concerns besides a possible DDoS attack?
How is this dealt with in BEP-28?
Do the benefits of mirroring for archives outweigh the threat?

Last edited by sg (2012-08-19 20:35:06)

#4 2012-08-20 04:06:17

DreadWingKnight
CBTT/BNBT Developer

Re: Extension for exchanging webseed urls

BEP28 does have the same weakness, and it's also why adoption of BEP28 is as low as it is. (I'm not aware of any client that implements it.)

Archives like the one you listed are actually better suited to have the webseeds coded into the torrent than exchanged through the network of peers, since the list only needs to be sent out once instead of repeated out multiple times through the course of the download.

Without creating a new torrent there is no way to add new webseed urls to a torrent file.

It's actually trivial to add webseeds to a torrent immediately prior to a user downloading the .torrent file without rebuilding it completely because the webseeds aren't in the info dictionary.
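
To illustrate the point about the info dictionary, here is a rough sketch that appends a URL to the top-level `url-list` key and checks that the SHA-1 of the info dictionary stays the same. The `add_webseed` and `info_hash` helpers, the tiny torrent and both URLs are invented for this example, and the check assumes the info dictionary is canonically bencoded so that re-encoding reproduces the original bytes.

```python
import hashlib


def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at data[i]; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            result[key], i = bdecode(data, i)
        return result, i + 1
    colon = data.index(b":", i)          # otherwise a byte string: <len>:<bytes>
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length


def bencode(value) -> bytes:
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(map(bencode, value)) + b"e"
    if isinstance(value, dict):
        return b"d" + b"".join(bencode(k) + bencode(value[k]) for k in sorted(value)) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(torrent_bytes: bytes) -> str:
    torrent, _ = bdecode(torrent_bytes)
    return hashlib.sha1(bencode(torrent[b"info"])).hexdigest()


def add_webseed(torrent_bytes: bytes, url: bytes) -> bytes:
    """Return the torrent with url added to the top-level url-list key."""
    torrent, _ = bdecode(torrent_bytes)
    urls = torrent.get(b"url-list", [])
    if isinstance(urls, bytes):          # tolerate a single string instead of a list
        urls = [urls]
    if url not in urls:
        urls.append(url)
    torrent[b"url-list"] = urls          # the info dictionary is left untouched
    return bencode(torrent)


original = bencode({b"announce": b"http://tracker.example.org/announce",
                    b"info": {b"length": 4, b"name": b"a.bin",
                              b"piece length": 16384, b"pieces": b"\x00" * 20}})
updated = add_webseed(original, b"http://mirror.example.org/a.bin")
assert info_hash(original) == info_hash(updated)
```

Because the info hash is unchanged, peers holding the updated file still join the same swarm as peers holding the original one.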

I'm not sure if the benefits outweigh the threat, but given the number of low-end hosts out there, they probably don't.


#5 2012-08-20 05:13:53

sg
Member

Re: Extension for exchanging webseed urls

I'm not aware of any client that implements it

libtorrent, qtorrent and uTorrent do, though I don't know whether that can be called widespread. I think BEP-28 serves as a good basis for how this should be implemented, but the two target completely different goals. While BEP-28 wants to get you more peers, the webseed exchange extension wants to keep torrents alive by providing mirrors for existing torrents. Those mirrors do not necessarily need to know about each other. We should not confuse these goals just because the implementation might be similar.

It's actually trivial to add webseeds to a torrent immediately prior to a user downloading the .torrent file without rebuilding it completely because the webseeds aren't in the info dictionary.

That's right. Creating a new torrent would work, but it would mean that every mirror has its own torrent file for the same data. If you download from one archive, you would not learn about the other mirrors. With the extension it would be possible to extend the lifetime of a torrent without recreating the torrent on a site the user might not know about.

I'm not sure if the benefits outweigh the threat, but given the number of low-end hosts out there, they probably don't.

Could you explain a bit more what you mean by low-end hosts and how that affects mirroring?

#6 2012-08-20 06:51:41

DreadWingKnight
CBTT/BNBT Developer

Re: Extension for exchanging webseed urls

libtorrent, qtorrent and uTorrent

I don't think uTorrent actually implements bep28 (Link the specific changelog that does), and it really is bad to implement anyway given how DHT and Peer Exchange cover the role with a lot less DDoS potential.

If you get a popular enough torrent to propagate the webseed address, you can actually permanently disable access to a lot of low-budget hosting packages out there, either by sheer number of requests per second or by the volume of requests.


#7 2012-08-20 08:27:59

sg
Member

Re: Extension for exchanging webseed urls

I don't think uTorrent actually implements bep28

I got the list from a transmission trac ticket.

If you get a popular enough torrent to propagate the webseed address, you can actually permanently disable access to a lot of low-budget hosting packages out there, either by sheer number of requests per second or by the volume of requests.

I do not think that we should let a feature slip because we worry about low-budget hosting packages. Webseeds do not have the same priority while downloading as a regular client (at least in libtorrent). If enough clients exist, the webseeds are not touched.

As you said, it is possible to recreate a torrent file with a webseed address, and low-end hosting packages still exist. I see your point about the DDoS attack, but I think the problem comes from the idea of using web servers to bootstrap a torrent in the first place.

Serving millions of torrents simultaneously from a regular host is not an option with a regular client due to RAM and CPU constraints. Bootstrapping a million torrents from a webseed is something that regular HTTP hosts can do without a problem, as the host does not need to be part of the swarm itself.

Without the WEX (webseed exchange), a mirror would need to run a regular client and participate in the swarm (which is not possible for millions of torrents). With the WEX, a mirror would only need to run an HTTP server and a lightweight client that only seeds the additional webseed URLs.

Simply dynamically propagating these webseed urls does not pose a bigger threat than recreating the torrent with a new webseed address but opens up a few new options where the protocol could help preserve cultural goods.

#8 2012-08-20 10:26:33

DreadWingKnight
CBTT/BNBT Developer

Re: Extension for exchanging webseed urls

I already have in-the-wild code that can dynamically update mirrors in a torrent when the torrent host is made aware of them.
Spreading urls like that via exchange adds up to a lot of wasted bandwidth as well, since they get relayed at frequent intervals instead of only once.

And transmission should only be considered authoritative on transmission features and not for other clients. I just checked with my contact in the uTorrent development team and bep28 isn't actually supported by uT.


#9 2012-08-20 10:27:55

DreadWingKnight
CBTT/BNBT Developer

Re: Extension for exchanging webseed urls

And besides, only the new peers that get the torrent after the new webseeds are added actually need the webseed address. Others don't need them because they can just get the pieces from the existing peers.


#10 2012-08-20 12:07:12

sg
Member

Re: Extension for exchanging webseed urls

I already have in-the-wild code that can dynamically update mirrors in a torrent when the torrent host is made aware of them.

This is practical for a centralized system where only one instance gives out the torrent. If you have multiple sources for the torrent and not all are aware of each other this might become impractical since the webseed url is not the url to the services giving out the torrent.

Spreading urls like that via exchange adds up to a lot of wasted bandwidth as well, since they get relayed at frequent intervals instead of only once.

I guess the bandwidth needed to spread a URL is negligible. As stated in BEP-28, the URL is only sent once; after that, only updates are propagated. The URL is not sent at intervals.
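
The "send once, then only updates" behaviour can be sketched with a little per-peer bookkeeping. The `WebseedAnnouncer` class, the peer identifiers and the URLs below are hypothetical and only illustrate the idea; they are not taken from BEP-28 or from any client.

```python
class WebseedAnnouncer:
    """Tracks which webseed URLs have already been sent to which peer."""

    def __init__(self):
        self.known = set()   # every webseed URL this client knows about
        self.sent = {}       # peer id -> set of URLs already sent to that peer

    def learn(self, url: str) -> None:
        self.known.add(url)

    def pending_for(self, peer_id: str) -> list:
        """Return the URLs not yet sent to this peer and mark them as sent."""
        already = self.sent.setdefault(peer_id, set())
        new = sorted(self.known - already)
        already.update(new)
        return new


announcer = WebseedAnnouncer()
announcer.learn("http://mirror-a.example.org/file.bin")
first = announcer.pending_for("peer-1")    # the full list, sent once
second = announcer.pending_for("peer-1")   # nothing new, so nothing is sent
announcer.learn("http://mirror-b.example.org/file.bin")
third = announcer.pending_for("peer-1")    # only the newly learned URL
```

Under this scheme the cost per peer grows with the number of distinct URLs, not with the duration of the connection.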

And besides, only the new peers that get the torrent after the new webseeds are added actually need the webseed address. Others don't need them because they can just get the pieces from the existing peers.

That's true, but we are talking about a different kind of file. There is a lot of content in archives that should be kept around but does not have a constant swarm. It is not as if each of these files has thousands of clients in its swarm. Here are some stats from the Internet Archive about its swarm size for different torrents.

Let me sum up your concerns with the webseed exchange:

  • Possible DDoS Attack from clients hammering a webseed

  • Exchanging urls results in wasted bandwidth

Thank you for your valuable input, DreadWingKnight. I would also like to hear some other voices from the community regarding a webseed exchange extension.

#11 2012-08-20 21:04:46

sg
Member

Re: Extension for exchanging webseed urls

Another scenario where webseed exchange would be useful:

BEP-19 (WebSeed - HTTP/FTP Seeding (GetRight style))
In the main area of the metadata file and not part of the "info" section, will be a new key, "url-list".

BEP-9 (Extension for Peers to Send Metadata Files)
This extension only transfers the info-dictionary part of the .torrent file.

This would mean that any client that joins through a magnet link would not see the webseeds at all, since the URL list is not transferred. Any source that gives out magnet links must therefore keep a client in the swarm to proxy connections to the webseed. This is a different kind of bandwidth burden than having a client present only to exchange the metadata.

#12 2012-08-28 09:24:33

Fredrik Neij
Member

Re: Extension for exchanging webseed urls

My suggestion would be to add director URLs instead of direct URLs, assuming that clients accept 301 and/or 302 responses from the server to guide them to the correct file. Like this, for example:

http://XX.director.domain.com/path/filename.bin
http://X.director.backupserver.com/path/filename.bin
http://XXXX.director.lastresort.com/path/filename.bin

Where X is the first 1 to 4-ish bytes (or more) of the info_hash in hex, which allows for some DNS trickery to do load balancing, for example:

*.director.domain.com IN CNAME realserver.com.

or

0.director.domain.com IN CNAME 1.realserver.com.
1.director.domain.com IN CNAME 1.realserver.com.
..
7.director.domain.com IN CNAME 2.realserver.com.
8.director.domain.com IN CNAME 2.realserver.com.
..
E.director.domain.com IN CNAME 3.realserver.com.
F.director.domain.com IN CNAME 3.realserver.com.

to spread the load as needed. And keep the backup domains out of DNS until needed to save resources.
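
Building such a director URL on the client side could look roughly like the sketch below. The `director_url` helper, its `prefix_len` parameter (counted in hex characters) and the info hash value are made up for illustration; the host and path are the placeholder names from the example above.

```python
def director_url(info_hash_hex: str, director_domain: str, path: str,
                 prefix_len: int = 2) -> str:
    """Prefix the director domain with the first hex characters of the info hash."""
    prefix = info_hash_hex[:prefix_len]
    int(prefix, 16)  # raises ValueError if the prefix is empty or not hexadecimal
    return f"http://{prefix}.{director_domain}/{path.lstrip('/')}"


# An arbitrary 40-character hex string standing in for a real info hash.
example_hash = "c12fe1c06bba254a9dc9f519b335aa7c1367a88a"
url = director_url(example_hash, "director.domain.com", "path/filename.bin")
print(url)
```

The client would then request this URL and follow the 301 or 302 response to whichever real server the DNS entry and director currently point at.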


But the real solution in my opinion is: PER FILE HASHES!, preferably more than one, like sha1, md5 and whatever else is popular. Yes, md5 is weak and prone to collisions and sha1 is getting there, but they are the most popular hashes on the web to "sign" or "verify" downloads, and therefore the most likely way to locate the same file.

Having the info-dict, a client can use a range request to offset the start to the first unique piece in the info-dict, start HTTP downloading and compare, and of course only continue if the data matches.
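
A rough sketch of that check, without the network part: find the first piece that lies entirely inside one file, turn it into an HTTP Range header, and compare the SHA-1 of the downloaded bytes with the piece hash from the info-dict. All function names and numbers are invented for the example, and it assumes the usual layout of 20-byte SHA-1 digests concatenated in the `pieces` string.

```python
import hashlib


def first_whole_piece(file_offset: int, file_length: int, piece_length: int):
    """Return (piece_index, first_byte, last_byte) for the first piece lying
    entirely inside the file, with byte positions relative to the file start,
    or None if no full-length piece fits inside the file."""
    index = -(-file_offset // piece_length)      # ceiling division
    start = index * piece_length                 # position within the torrent
    end = start + piece_length
    if end > file_offset + file_length:
        return None
    return index, start - file_offset, end - file_offset - 1


def range_header(first_byte: int, last_byte: int) -> dict:
    """HTTP byte ranges are inclusive at both ends."""
    return {"Range": f"bytes={first_byte}-{last_byte}"}


def piece_matches(data: bytes, pieces: bytes, index: int) -> bool:
    """Compare downloaded bytes against the 20-byte SHA-1 for piece `index`."""
    return hashlib.sha1(data).digest() == pieces[index * 20:(index + 1) * 20]


# A file that starts 100 bytes into the torrent, is 1000 bytes long, with 256-byte pieces.
index, first, last = first_whole_piece(100, 1000, 256)
headers = range_header(first, last)
```

If `piece_matches` returns False for the bytes fetched with these headers, the URL does not hold the same file and the client should stop using it.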

This opens up a world of possibilities to find matching contents from other filesharing technologies to automated web searches of the hashes to find sources. Like searching for .md5 or .sha1 files like these https://www.apache.org/dist/apr/ and trying the underlying files.

And most importantly, just like DHT searches for torrent hashes, it could be used to search for file-hashes to find peers to get the same file from even if they are from different torrents. This will be especially useful when the same contents are available in multiple swarms, like when different creators add idiotic signature files in addition to the contents to each torrent.

#13 2012-08-28 10:18:02

sg
Member

Re: Extension for exchanging webseed urls

It seems to me that you are suggesting a totally new way of doing web seeding. This thread should focus on the pros and cons of just "relaying the webseed urls in the same way it is done with tracker links".

Maybe you should start a discussion around your idea by creating an entirely new topic for it.

#14 2012-10-09 16:39:23

arvid
Administrator

Re: Extension for exchanging webseed urls

tracker exchange is not widely implemented in uTorrent mostly for the reasons dwk mentions. It's also of questionable value to add a lot of trackers.

I don't doubt that it's possible to have a conservative specification to avoid DDoS attacks. It might be tricky though.

However, I also don't doubt that it's possible to bittorrent-seed a million torrents (especially not in the future).
