forum.bittorrent.org

BitTorrent.org community

You are not logged in.

Announcement

Forums are closed. Use the new mailing list! https://groups.google.com/a/bittorrent.com/forum/#!forum/bt-developers

#1 2008-03-04 13:00:43

dave
Editor

Discussion of BEP 5: DHT

The DHT protocol has been implemented with some variation between BitComet, Azureus, BitTorrent Mainline, uTorrent, and others. 

See http://www.bittorrent.org/beps/bep_0005.html

In particular the Azureus DHT is different from the DHT introduced by mainline.  Would anyone on the Azureus team care to document these differences or post a BEP for the Azureus DHT?

Offline

#2 2008-03-04 15:39:29

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

Listing differences would not be adequate as it is a completely different implementention form the ground up, but here we go:
-we use a messaging protocol similar to the UDP tracker protocol
-messages are binary format (not bencoded)
-we also have UDP-based, congstion-controled transports in the DHT for bulk-data-transfer from nodes that advertise such data (such as .torrents for magnet links), which also allows sharing the UDP port with other things, by using different initial connection IDs
-there is the vivaldi coordinate system as RTT distance estimating for nodes we haven't seen before
-there us UDP hole punching for nat traversal (especially for said bulk transports)
-the implementation is closer to the kademlia spec as enhanced features like value propagation and replication on node churn is used
-there are several measures in place to cope with the britney effect, i.e. hotspots due to popular keys
-various precautions to prevent routing table poisoning attacks and other malicious behavior

and... those are just the things that come to mind right now. I recently helped optimizing an implementation of the mainline DHT as an azureus plugin, and i must say the complexity and featureset are vastly different.
Documenting it so sufficiently that others could implement properly would be a difficult task as many of the features build on each other.


Edit:

The draft should include a currently non-standard feature used by µT to my knowledge that indicates vendor (and version) of the used client. This could be used to notify developers of faulty implementations to fix their problems. It's annoying to look at some implementations seriously misbehaving and not being able to do anything about it.

Last edited by The 8472 (2008-03-04 15:55:53)


Az dev

Offline

#3 2008-03-05 15:25:25

dave
Editor

Re: Discussion of BEP 5: DHT

It sounds like a lot of effort has been put into the Azureus DHT.  Admittedly much less effort has been put into the BitTorrent mainline DHT.  Having a single DHT would benefit the community as it would not partition the users downloading the same file.  However replicating all of the features in the Azureus DHT may be quite a hurdle for client developers.  Does anyone on the Azureus team have a desire to formalize a subset of the features that would enable a functional single DHT without breaking your current implementation?

Offline

#4 2008-04-21 12:17:36

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

A few suggestions about the DHT spec:

a) gets/announces

The token on getPeers replies should be optional and an algorithm when to provide it should be specified. This algorithm should enforce a minimum announce-interval and closeness to the destination key. This will prevent announce-put-hammering and storing announces in the wrong place.

In addition the either peers or nodes response policy ("If the queried node has no peers for the infohash, a key "nodes" is returned containing the K nodes in the queried nodes routing table closest to the infohash supplied in the query.") should be changed to: always nodes, peers if available. This should improve routing performance, especially if the node cannot supply a sufficient amount of peers to satisfy the requesting peer, although the packet size might become problematic then... any opinions?
It might also be useful to completely withhold any peers information - even if available - if the specific node already sent a getPeers request within a certain min interval.


My rationale for those proposals is that I'm seeing inefficient implementations: after 13 hours of uptime my node sent the following requests:
16.4k find nodes, 1.8k get peers, 0.4k announces

but i received
118k find nodes, 91k get peers, 43k announces

I'm suspecting bitcomet and/or xunlei here.


Another issue i'd like second opinions about is this:

[20:44:27] Node C40ED4C6 92F75BA6 D5C76174 05A4C0B5 6654777D changed address from /89.102.57.229:8488 to /91.145.65.181:22920
[20:44:36] New node /218.170.53.140:24364 claims same Node ID (C41A5BBE 50D93143 EDC08374 230540D6 72F610BF) as /124.13.91.107:8552 ; node dropped as this might be an impersonation attack
[20:45:10] New node /190.47.106.148:16138 claims same Node ID (C42CBC5E 756C404C 800E5657 B5AEFB1C 8AF8F8CD) as /61.15.174.158:25125 ; node dropped as this might be an impersonation attack
[20:45:14] Node C42BC344 D8082284 3F98603B 03CD7A04 A5388C1A changed address from /62.195.43.197:53804 to /79.114.153.213:14050
[20:45:47] Node C802C403 922FD99F 92FE2912 2867B8AA 4E8A0836 changed address from /89.77.104.146:55147 to /89.77.104.146:55170
[20:45:53] New node /212.233.229.17:3203 claims same Node ID (C4EEEC34 8930B75F A44B61B0 FC9CA196 991E58FC) as /81.88.114.195:42804 ; node dropped as this might be an impersonation attack
[20:46:02] New node /125.121.41.5:20639 claims same Node ID (C4EEEC34 8930B75F A44B61B0 FC9CA196 991E58FC) as /81.88.114.195:42804 ; node dropped as this might be an impersonation attack
[20:47:23] New node /87.110.24.186:8815 claims same Node ID (C449ACE1 952A488A 64AA4A3D 59D9D0B4 1F4A7039) as /77.93.13.116:13918 ; node dropped as this might be an impersonation attack
[20:48:41] New node /121.230.227.196:19181 claims same Node ID (C407FA1E 2B46223A 5A72B091 905D868F 5AF429B1) as /123.5.212.236:10796 ; node dropped as this might be an impersonation attack
[20:48:41] New node /89.176.220.30:21473 claims same Node ID (C43A3790 49C294D7 94BC5BAD DBD297B7 B78ACD42) as /88.121.147.159:12865 ; node dropped as this might be an impersonation attack
[20:48:48] New node /87.119.118.34:64469 claims same Node ID (C4173A27 5134D7BB 98916C39 A61513E9 ED7A7067) as /61.141.156.4:20884 ; node dropped as this might be an impersonation attack
[20:48:53] Node C41A5BBE 50D93143 EDC08374 230540D6 72F610BF changed address from /124.13.91.107:8552 to /87.247.249.150:9983
[20:49:20] New node /209.6.18.239:18003 claims same Node ID (C41D4812 04B2CDF9 2BEA5CE5 DDA9DE69 78A6B7A5) as /200.172.105.110:60013 ; node dropped as this might be an impersonation attack
[20:50:52] Node C41A5BBE 50D93143 EDC08374 230540D6 72F610BF changed address from /87.247.249.150:9983 to /218.170.53.140:24364
[20:51:23] Node C41A8EFB C2F7AEB5 8AFACD1E 3E50B5E8 9A53A763 changed address from /89.215.185.163:12836 to /201.235.141.142:19850
[20:52:32] Node C8312C88 BDAFD659 18763B3E 7385EDC9 2A99D2AF changed address from /117.12.213.116:1108 to /117.12.213.116:1031
[20:52:52] New node /156.34.90.179:9544 claims same Node ID (C407FA1E 2B46223A 5A72B091 905D868F 5AF429B1) as /123.5.212.236:10796 ; node dropped as this might be an impersonation attack
[20:53:18] New node /87.126.91.56:24670 claims same Node ID (C4EEEC34 8930B75F A44B61B0 FC9CA196 991E58FC) as /81.88.114.195:42804 ; node dropped as this might be an impersonation attack
[20:53:31] Node C414AE6E F9A99411 501B1257 7BEAFBDB 73B4C549 changed address from /80.7.244.102:1167 to /189.25.67.101:60531
[20:55:42] Node BBFF56AA 114779F6 D175C678 3AA6B2A2 B9A8EEBD changed address from /218.82.81.121:1999 to /58.33.109.207:1219
[20:55:49] Node C449ACE1 952A488A 64AA4A3D 59D9D0B4 1F4A7039 changed address from /77.93.13.116:13918 to /121.19.204.209:55382
[20:55:53] New node /58.8.125.16:19746 claims same Node ID (C43A3790 49C294D7 94BC5BAD DBD297B7 B78ACD42) as /88.121.147.159:12865 ; node dropped as this might be an impersonation attack

This is a trace we generate when a packet with a different source address than the currently stored node arrives. When that happens we send a ping-probe to the original node to see if it's still alive. If it is then it's assumed to be an impersonation attack (to prevent bucket poisoning), otherwise the address is adjusted.

I'm not quite sure what's happening here, but i suspect that some clients are seriously misbehaving, as changing their IP to another continent (i did a few ipwhois lookups) while the old one is still available seems... let's say unlikely.
I'd like others to verify this issue (after all, my detection code could be faulty)... and any ideas why this is happening?


b) defining node timeouts

Unless i missed it the spec does not contain any values when nodes should be considered stale or bad. At least default guidelines should be specified.
For the mainline DHT plugin we're currently using:
last seen > 15 minutes = stale
failed requests > 8 = bad
last seen > 15 minutes && failed requests > 2 = bad

[bad = eligible for immediate replacement, stale = must be pinged before it is replaced]

c) things about kademlia that should be mentioned explicitly

replacement buckets/passive routing table maintenance, especially the parts mentioned in chapter 4.1 in the newer paper version (the BEP links to the older one)

Last edited by The 8472 (2008-04-21 12:43:53)


Az dev

Offline

#5 2008-04-29 12:55:24

dave
Editor

Re: Discussion of BEP 5: DHT

I have asked Drue Loewenstern, Greg Hazel and Arvid Norberg to review your comments about the DHT.

Offline

#6 2008-04-30 14:24:54

camrdale
Member

Re: Discussion of BEP 5: DHT

The 8472 writes:
> In addition the either peers or nodes response policy ("If the queried node
> has no peers for the infohash, a key "nodes" is returned containing the K
> nodes in the queried nodes routing table closest to the infohash supplied in
> the query.") should be changed to: always nodes, peers if available. This
> should improve routing performance, especially if the node cannot supply a
> sufficient amount of peers to satisfy the requesting peer, although the
> packet size might become problematic then... any opinions?

I had a similar idea for another Kademlia-based DHT I'm working on, and packet size did indeed become a problem. I solved it by breaking up the "get_peers" operation into two parts: "find_peers" would always return nodes as well as the number of peers it has for that key, and "get_peers" which would actually return the peers (could also add an optional parameter to "get_peers" specifying the maximum number of peers to return). The finding of peers is then a 2-step operation similar to the find_node/announce_peer operation, but it gives the client much more flexibility in which nodes to request peers from.

Offline

#7 2008-05-03 11:26:29

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

I collected further evidence that we should guard more against crappy implementations, for example libtorrent seems to send announces multiple times (already notified arvid about this):

[20:10:42] RPC received message [123.15.0.165] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:C version:LT targetKey:C41A4833 825A4B8C 26090FBD 053628A0 D22F319C (17)
[20:10:42] RPC send Message: [123.15.0.165] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:C
[20:10:44] RPC received message [123.15.0.165] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:t version:LT targetKey:C41A4833 825A4B8C 26090FBD 053628A0 D22F319C (17)
[20:10:44] RPC send Message: [123.15.0.165] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:t
[20:10:46] RPC received message [124.117.146.157] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:� version:LT targetKey:C5366CF1 7DBC6933 193CD77C 7EFC732B 6532346A (8)
[20:10:46] RPC send Message: [124.117.146.157] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�
[20:10:46] RPC received message [124.117.146.157] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:� version:LT targetKey:C5366CF1 7DBC6933 193CD77C 7EFC732B 6532346A (8)
[20:10:46] RPC send Message: [124.117.146.157] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�
[20:10:47] RPC received message [67.186.193.13] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:�=�C��� targetKey:C46F4AA4 401C6240 93E588BA 015128D3 A94C963F (10)
[20:10:48] RPC received message [203.218.188.174] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:�Fz�6�-# targetKey:C4EADE87 F515E739 9BD8E577 28BBE778 7039C268 (9)
[20:10:48] RPC send Message: [203.218.188.174] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�Fz�6�-#
[20:10:48] RPC received message [211.92.142.104] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:U version:LT targetKey:C418A3CE DD8BEB33 5954293B B4A4B3C1 2BAF2991 (15)
[20:10:48] RPC send Message: [211.92.142.104] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:U
[20:10:48] RPC received message [124.117.146.157] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:= version:LT targetKey:C5366CF1 7DBC6933 193CD77C 7EFC732B 6532346A (8)
[20:10:48] RPC send Message: [124.117.146.157] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:=
[20:10:49] RPC received message [85.58.128.39] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:��r�� targetKey:C42A1205 6D0483D1 B635B95F C4B23B7B 70BFC017 (11)
[20:10:49] RPC send Message: [85.58.128.39] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:��r��
[20:10:49] RPC received message [61.173.33.127] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:� version:LT targetKey:C41A4833 825A4B8C 26090FBD 053628A0 D22F319C (17)
[20:10:49] RPC send Message: [61.173.33.127] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�
[20:10:51] RPC received message [61.173.33.127] Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:� version:LT targetKey:C41A4833 825A4B8C 26090FBD 053628A0 D22F319C (17)
[20:10:51] RPC send Message: [61.173.33.127] Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�

and some peers are just ultra-spammy:

[21:04:50.016] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.063] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.063] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.125] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.125] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.250] {plug} [Mainline DHT] RPC received message [58.37.203.202]  Method:GET_PEERS Type:REQ_MSG MessageID:/�e��o�  targetKey:C437E3D6 1594DF78 BD955C49 860B4463 C4B46213 (11)
[21:04:50.359] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.375] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.375] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.391] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:	  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.406] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:
  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.422] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.484] {plug} [Mainline DHT] RPC received message [84.30.213.161]  Method:GET_PEERS Type:REQ_MSG MessageID:  targetKey:C43CF44B D5042168 12C9186B 5266A585 45BFB619 (11)
[21:04:50.484] {plug} [Mainline DHT] RPC received message [125.163.210.230]  Method:GET_PEERS Type:REQ_MSG MessageID:� version:LT  targetKey:C41A4833 825A4B8C 26090FBD 053628A0 D22F319C (17)
[21:04:50.500] {plug} [Mainline DHT] RPC received message [98.165.149.123]  Method:GET_PEERS Type:REQ_MSG MessageID:J+

I've also seen (at least once) a bunch of get peers coming from different source IPs within milliseconds that look for the same target keys and incrementing message IDs, which obviously means someone is using source address spoofing... for what reason soever.

----

Then i'm also seeing announces that seem too far from my local key. The number in brackets behind the target key indicates the bucket number where 0 is the root. I have buckets 0-19 filled and 20 partially filled. Since an announce should only happen on the closest 8 nodes that means announces in the buckets 18-20 would seem reasonable, especially considering that the keyspace covered roughly doubles with each bucket.

But i regularly see announces for buckets as low as 9 and occasionally even further away ones from nodes that don't send a version. Peers sending a version are LT and UT. LT peers somehow often get to bucket 17 - although many of the samples repeated themselves, see above. But i saw at least one LT request going to bucket 7, meaning it probably could use a better route finding algorithm too.
UT peers hit the target bucket 20 with good accuracy.


[21:13:56.922] {plug} [Mainline DHT] RPC received message [89.171.134.250]  Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:'���G�0q  targetKey:C47B550A A73E8CEA B15438DD B2657910 FDC24F02 (10)
[21:13:56.938] {plug} [Mainline DHT] RPC send Message: [89.171.134.250]  Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:'���G�0q  
[21:13:57.188] {plug} [Mainline DHT] RPC received message [220.140.115.7]  Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:"?8&�ע9  targetKey:C417F9B9 F8C98B5B 2AEF4A30 557BE71B C4704980 (13)
[21:13:57.188] {plug} [Mainline DHT] RPC send Message: [220.140.115.7]  Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:"?8&�ע9  
[21:13:57.359] {plug} [Mainline DHT] RPC received message [193.16.157.102]  Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:�NX�w5�3  targetKey:C48D510D 320E3D90 A8583354 07EE3227 A4F5D32D (9)
[21:13:57.359] {plug} [Mainline DHT] RPC send Message: [193.16.157.102]  Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:�NX�w5�3  
[21:13:58.641] {plug} [Mainline DHT] RPC received message [87.220.13.230]  Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:����?t  targetKey:C4013D98 E75BB165 027BF9BC 057B29A6 0022A78E (12)
[21:13:58.641] {plug} [Mainline DHT] RPC send Message: [87.220.13.230]  Method:ANNOUNCE_PEER Type:RSP_MSG MessageID:����?t  
[21:13:58.797] {plug} [Mainline DHT] RPC received message [59.173.207.8]  Method:ANNOUNCE_PEER Type:REQ_MSG MessageID:h

Last edited by The 8472 (2008-05-03 11:33:22)


Az dev

Offline

#8 2008-06-03 01:28:13

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

Since things seem to move at glacial speeds.. here's the proposal:


-- enforcing target closeness

To make sure that bad implementations do not store values in inappropriate keyspace locations (i.e. not close enough to the target) the store-token in a getPeers response should only provided when we feel responsible for the incoming getPeers request.

This can be calculated the following way:
With k-sized buckets we take our k * 3 closest nodes and calculate the distance between us and the farthest of these nodes. For an incoming getPeers request we calculate the distance between us and the target. If the distance is smaller (i.e. sharing more prefix bits) we send the token, otherwise we don't.

The rationale behind k*3 is that we cover our local bucket and one to the left and one to the right from the tree perspective.

-- preventing announce hammering (get and put)

this one is more complex, so it probably should be an optional feature if one wants to reduce unnecessary traffic even further. With an announce interval of 30 minutes a peer should not announce more often for a specific key every 15 minutes. Checking who announced within the last 15 minutes can be done either by a recently-seen list or 2 dynamically sized bloom filters that are swapped every 15 minutes.

If a node already announced within the last 15 minutes we have two options:
1a) send an error packet
1b) only send a node list (no token, no peers)

1a) is stricter and should make faulty implementations more obvious to their developers but might result in unexpected behavior due to aggressive error handling

For announce frequency tracking we also have 2 options:
2a) track on a per-node basis
2b) track on a per-<node,key> basis

2a) is cleaner, 2b) would also prevent generic hammering (not key-specific) but should only be combined with 1b) to prevent any potential routing problems. I'm not quite sure if 2b) would be necessary/useful.


---------


Edit:

I've been considering this further: Since a node returns either a peer list or a node list it means that storing a single value in nodes which are not really responsible for that key that this node will only return 1 peer and no nodes. Which basically takes one node out of a getPeers lookup process.
If done correctly "around" the perimeter of the nodes that would actually be responsible this could be seen as a denial of service attack, cutting off routes towards destination keys.
While probably not very effective it would at least lower the overall efficiency of the DHT for that key in particular.

Last edited by The 8472 (2008-06-03 09:56:10)


Az dev

Offline

#9 2009-10-09 03:20:19

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

I'll list some ideas on:
1. IPv6 support for DHT routing
2. IPv6 support for DHT announces
3. Generic put and get operations and how announce/getpeers should be integrated with those

In addition to that i already explained how to combat crappy implementations which don't perform put operations on the right target nodes in the previous post but it needs a little fixing of the specification.

Changing the spec to allow peers to deny inappropriate stores

All that's really necessary is changing the spec to allow peers not to send a "token" if they consider the "info_hash"-target ID too far from their own buckets.


IPv6 support for DHT routing

The fundamental problem with IPv6 and IPv4 is that right now there are many v4-only nodes, in the future there'll be many dual-stack nodes and even further in the future there'll be many v6-only nodes.

To solve this and allow us to make a clean cut at some point i propose that a DHT node should maintain 1 routing table per IP version. So a dualstack node will have 2 routing tables. One will only contain v4 and the other v6 addresses.

And here comes the interesting part: To allow smooth transitions between both the node-lists in FIND_NODE and GET_PEERS responses should return a "nodes" and a "nodes2" list (as proposed by libtorrent) from both routing tables, regardless of protocol.

This allows v6 nodes to gather other v6 contacts as a by-product of v4 lookups (and vice versa in the far future).

IPv6 support for DHT announces

Note that in the chapter above only "nodes" lists for FIND_NODE and GET_PEERS responses are covered. Lookups, i.e. outgoing requests should be handled differently to make sure that whichever address type is in the minority does not get drowned out by the majority.

To achieve this
a) the routing table management (PING, FIND_NODE) should be completely separate for both routing tables.
b) ANNOUNCE_PEER and GET_PEERS has to be performed in parallel but separate for each routing table.

Since announces are done once for v4 and once for v6 the source-address can still be used to determine the IP and no special measure has to be taken to announce the v6 address.


Generic put and get operations and how announce/getpeers should be integrated with those

The names "announce_peer" and "get_peers" should be retained for backwards compatibility purposes but their semantics should be changed to generic put/get operations.

To do so all nodes aware of this extension should use "target" in addition to "info_hash" in their GET_PEERS requests.

If an extension-aware node receives such a request and would normally reply with a "values" list (peers) they should reply with 2 lists instead: "orig" (for origin node addresses) and "data" (e.g. ports for announces or whatever you store with a generic put).
If it would normally reply with a "nodes" (and "nodes2") list it should add an empty "orig" and "data" list.


Once a peer has received the GET_PEERS responses they should be able to determine if the other node understands the new new semantics based on the presence of "orig" and "data" keys.

Thus it sould send an ANNOUNCE_PEERS with "target" and "data" instead of "info_hash" and "port" to a compatible node. The target node would then take the originator's IP and the data and store them.
If a non-compatible node asks for the data they'll just pack the originator IP and the data (or if the semantics for "data" change: the port contained within "data") into a "values" list. If a compatible peer asks for the data they'll put them into "data" and "orig" lists instead.




Bittorent announces would be done to target=infohash, anything else could go to sha1("of some key you want to store"), e.g. sha1("metadataXYZ"+info_hash) if you want to store some specific metadata about a torrent.


At some point in the future we could just cut out all the code handling "info_hash", "port" and "values" and only use "target", "orig" and "data" and thus live legacy-free. New semantics for data could be handled by simply doing the lookups to different "target" IDs and/or by using bencoded dicts within the data.


Az dev

Offline

#10 2009-10-12 13:18:03

jch
Member

Re: Discussion of BEP 5: DHT

For those who don't know me -- I'm the author of Transmission's DHT code.

IPv6 support for the DHT

a DHT node should maintain 1 routing table per IP version

This cannot be stressed enough.  Kademlia requires transitive connectivity; this breaks if you put both IPv4 and IPv6 nodes in a single table.  Hence, any double-stack node must maintain two distinct routing tables, one for IPv4 and one for IPv6.

There was some discussion on the subject in http://forum.utorrent.com/viewtopic.php?id=53492 .

To allow smooth transitions between both the node-lists in FIND_NODE and GET_PEERS responses should return a "nodes" and a "nodes2" list

I find that confusing.  I would rather suggest that the lists should be:

* nodes for the current address family (IPv4 in an IPv4 message, IPv6 in an IPv6 message);
* nodes4 or nodes6 for the other address family.

Hence, a message will either have just a nodes list (for legacy peers), both nodes and nodes6 (for an IPv4 message) or both nodes and nodes4 (for an IPv6 message).

Additionally, I would suggest that not all replies carry the nodes4 or nodes6 message; I would suggest that a find_nodes request should carry a boolean flag that explicitly requests nodes from the other family.

Since announces are done once for v4 and once for v6 the source-address can still be used to determine the IP and no special measure has to be taken to announce the v6 address.

Fully agreed.

--Juliusz

Last edited by jch (2009-10-12 13:26:04)

Offline

#11 2009-10-12 13:20:11

jch
Member

Re: Discussion of BEP 5: DHT

Generic put and get operations

As noted in private mail, I'm currently opposed to this proposal (but quite willing to change my mind if new information is provided).

In particular, you should recall that unlike Azureus, Transmission is used on embedded systems with very little memory (notably OpenWRT machines).  Hence, any proposal that causes us to track unbounded amounts of memory is a strict no for me.

Offline

#12 2009-10-12 13:23:05

jch
Member

Re: Discussion of BEP 5: DHT

Peers sending a version are LT and UT

Just to set the record straight -- Transmission also sends a version ("TR" plus the big-endian binary encoding of the SVN revision number). Please drop me a mail if you see any such gross mis-behaviour coming from Transmission peers.

Offline

#13 2009-10-13 10:57:29

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

I am convinced that my proposal provides a viable approach to the problem(s). Things like key names are minor issues and can be changed for sure. Although i find nodes + nodes2 (or nodes6 if you wish) better for reasons of parsing simplicity as the message decoder does not have to be aware of which routing table it belongs to.

Including want4/want6 in requests to cut down on the amount of send node infos makes sense.


The only part i'm not certain about is using originators + data to provide maximum similarity in the semantics compared to the current implementation as it would reduce the amount of data you can fit into a single generic get reply if you don't really care about the originators.


Thus i would request review and comments on this post:

http://forum.bittorrent.org/viewtopic.php?pid=644#p644


Az dev

Offline

#14 2009-10-14 12:39:05

jch
Member

Re: Discussion of BEP 5: DHT

Although i find nodes + nodes2 (or nodes6 if you wish) better for reasons of parsing simplicity as the message decoder does not have to be aware of which routing table it belongs to.

I prefer it, but don't care very much -- let's avoid getting into a bikeshed argument for the moment.

Including want4/want6 in requests to cut down on the amount of send node infos makes sense.

It's absolutely essential.  My plan is to only include want* when bootstrapping.  Having nodes send the other address family by default would double traffic with little or no benefit.

Offline

#15 2009-10-14 12:41:54

jch
Member

Re: Discussion of BEP 5: DHT

The only part i'm not certain about is using originators + data

I suggest we stick to IPv6 for now, and not discuss the generic data extension for now.

By the way -- could somebody get some µTorrent developer into the discussion?

Offline

#16 2009-10-14 15:20:16

jch
Member

Re: Discussion of BEP 5: DHT

Okay, here's a mostly complete proposal:

  http://www.pps.jussieu.fr/~jch/software … -ipv6.html

In short:

* I've adopted The8472's semantics for nodes/nodes6;
* I've merged the "want4" and "want6" fields into a single "want" field;
* I've added a note about how this solves the problem of get_peers not returning nodes information in some cases, without breaking compatibility.

Last edited by jch (2009-10-14 15:26:38)

Offline

#17 2009-10-14 17:55:19

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

Hrrm, i have some thoughts on this.

I agree that generic storage wouldn't be good at this point since the mainline DHT lacks features to deal with it.


Want flags


We should reconsider the want-field though. requests generally are small as they don't carry any payload, so we have a bit more room to spare. Using a string of characters and decoding individual characters as features sounds just horrible (Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn!) from a coordination standpoint.

Since those messages are bencoded anyway i would suggest encoding the list as "flags" instead of "want" as a list of bencoded strings.

"rV6n" and "rV4n"  (for requesting v4/6 nodes) as flags in that list should be sufficiently short. and cause a lot less clashes as having only 256 possible characters available.

example: d.....5:flagsl4:rV6n4:rV4Ne......e


Tokens


I know this is completely orthogonal to IPv6 compatibility, but since we're tweaking the protocol i'd really like the spec to be changed nodes can officially omit the token as a way to indicate that they will not service a store request. Silently dropping store requests is already possible, but we have no way for the requesting node to know that and thus they can't adjust their routing properly.


node IDs, pings
You write that they can have separate IDs for the v6 and v4 nodes. I think that unnecessarily complicates the possible combinations and would require us to add something like ID6 to ping responses or do similar things.

Having identical IDs also increases the synergies that one routing tables will have from the node info leaked through the other one


Az dev

Offline

#18 2009-10-15 14:00:07

arvid
Administrator

Re: Discussion of BEP 5: DHT

I agree that the "token" should be an optional way of indicating that the data searched for is ok to store on that node.

I also think that the specification should be updated to always return nodes on a get_peers request.

@jch: you seemed to have some reservation about returning "nodes" together with "values" to a get_peers request. What is your concern with doing that?

I agree that the node-id should be the same for the two routing tables. I put up the draft for this proposal as BEP 32, let's move discussion about this into its own forum thread under BEP 32.

Offline

#19 2009-10-15 14:14:23

jch
Member

Re: Discussion of BEP 5: DHT

> @jch: you seemed to have some reservation about returning "nodes" together with "values" to a get_peers request. What is your concern with doing that?

How do existing implementations parse a get_peers reply?  I can see three reasonable ways:

1. parse both nodes and values if they are present.  (This is what Transmission does.)
2. check for values, and if it is absent, check for nodes.
3. check for nodes, and if it is absent, check for values.

If you unconditionally send nodes together with values, you'll interoperate with implementations that do (1) or (2), but you'll break implementations that do (3).  Before we decide on sending nodes together with values, I want us to check that there is no widely-deployed implementation that does (3).

In particular, there's no way I'm going to implement that in Transmission unless I have a positive answer from somebody who has access to the µTorrent code.

Offline

#20 2009-10-15 14:16:53

jch
Member

Re: Discussion of BEP 5: DHT

> i'd really like the spec to be changed nodes can officially omit the token as a way to indicate that they will not service a store request.

No objection.  I won't be sending token-less replies in Transmission (I prefer to simply drop the announce_peers request), but I will be parsing token-less replies in the next version.

Offline

#21 2009-10-15 14:41:22

arvid
Administrator

Re: Discussion of BEP 5: DHT

uTorrent has done this since a bit over a year now. Both sent "nodes" in "get_peer" requests as well as handle both "nodes" and "values" in the response. So has libtorrent (i.e. Deluge, Qbittorrent, Miro, Limewire, Halite, etc.).

Offline

#22 2009-10-15 15:02:47

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT


Az dev

Offline

#23 2009-10-15 16:09:22

jch
Member

Re: Discussion of BEP 5: DHT

dht-0.10: unreleased

  * Send nodes even when sending values.  This is a violation of the
    protocol, but I have been assured that it doesn't break any deployed
    implementation.  This is also what both libtorrent and uTorrent do.
  * Give up immediately on a search peer when no token was provided.  This
    is a very reasonable extension to the protocol, and certainly doesn't
    break anything.

This should be in Transmission 1.80, unless I change my mind.

Offline

#24 2009-10-15 16:30:03

arvid
Administrator

Re: Discussion of BEP 5: DHT

Here are my proposed changes:

Index: bep_0005.rst
===================================================================
--- bep_0005.rst	(revision 11179)
+++ bep_0005.rst	(working copy)
@@ -332,18 +332,20 @@
 querying node, and "info_hash" containing the infohash of the torrent.
 If the queried node has peers for the infohash, they are returned in a
 key "values" as a list of strings. Each string containing "compact" format
-peer information for a single peer. If the queried node has no
-peers for the infohash, a key "nodes" is returned containing the K
+peer information for a single peer. A key "nodes" is also returned containing the K
 nodes in the queried nodes routing table closest to the infohash
-supplied in the query. In either case a "token" key is also included in
-the return value. The token value is a required argument for a future
+supplied in the query. If the "nodes" key brings the packet size over the UDP MTU,
+fewer than K nodes may be returned. A "token" key may also be included in
+the return value to signal the requesting node that it can store data in the
+requested peer. The token value is a required argument for a future
 announce_peer query. The token value should be a short binary string.
 
 ::
 
   arguments:  {"id" : "<querying nodes id>", "info_hash" : "<20-byte infohash of target torrent>"}
 
-  response: {"id" : "<queried nodes id>", "token" :"<opaque write token>", "values" : ["<peer 1 info string>", "<peer 2 info string>"]}
+  response: {"id" : "<queried nodes id>", "nodes" : "<compact node info>"
+    , "token" :"<opaque write token>", "values" : ["<peer 1 info string>", "<peer 2 info string>"]}
 
   or: {"id" : "<queried nodes id>", "token" :"<opaque write token>", "nodes" : "<compact node info>"}
 
@@ -357,13 +359,13 @@
 
 ::
 
-  Response with peers = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "token":"aoeusnth", "values": ["axje.u", "idhtnm"]}}
-  bencoded = d1:rd2:id20:abcdefghij01234567895:token8:aoeusnth6:valuesl6:axje.u6:idhtnmee1:t2:aa1:y1:re
+  Response with peers = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "nodes":"def456...", "token":"aoeusnth", "values": ["axje.u", "idhtnm"]}}
+  bencoded = d1:rd2:id20:abcdefghij01234567895:nodes9:def456...5:token8:aoeusnth6:valuesl6:axje.u6:idhtnmee1:t2:aa1:y1:re
 
 
 ::
 
-  Response with closest nodes = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "token":"aoeusnth", "nodes": "def456..."}}
+  Response with closest nodes = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "token":"aoeusnth", "nodes":"def456..."}}
   bencoded = d1:rd2:id20:abcdefghij01234567895:nodes9:def456...5:token8:aoeusnthe1:t2:aa1:y1:re

Offline

#25 2009-10-15 17:20:06

The 8472
Azureus Developer

Re: Discussion of BEP 5: DHT

I suggest "should" instead of "may" in

'A "token" key may also be included'

As providing it generally should be the norm (at least for nodes that are close to the target) unless the node has some significant reason to not honor store requests.

The underlying reasoning why it is optional and yet important is not clear otherwise.


Az dev

Offline

Board footer

Powered by FluxBB