forum.bittorrent.org

BitTorrent.org community

Announcement

Forums are closed. Use the new mailing list! https://groups.google.com/a/bittorrent.com/forum/#!forum/bt-developers

#1 2009-10-15 14:27:51

arvid
Administrator

BEP 32 IPv6 support for the DHT

The proposal is up at:

  http://www.bittorrent.org/beps/bep_0032.html

My comments are:

I think the specification should mandate that dual stack nodes use the same ID. This would increase the overlap, and make the ability of find_nodes to return nodes from both address families more useful. If every node were a dual stack node, it would essentially bring the overhead back down to that of a single DHT.

Since "values" is already a list, I don't think we need a "values6" for it. The flags/want could still be used to filter what is returned, though (so as not to waste bandwidth).

I think that get_peers should always return "nodes", and I don't think this specification should or needs to depend on either implementation.

I agree with 8472 that maybe some more generic name like "flags" would be preferred over "want". I'm not sure I like the idea of bloating it too much with actual strings, just because requests have more space to spare.

Offline

#2 2009-10-15 14:43:54

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

jch wrote:

> > @jch: you seemed to have some reservation about returning "nodes" together with "values" to a get_peers request. What is your concern with doing that?
>
> How do existing implementations parse a get_peers reply?  I can see three reasonable ways:
>
> 1. parse both nodes and values if they are present.  (This is what Transmission does.)
> 2. check for values, and if it is absent, check for nodes.
> 3. check for nodes, and if it is absent, check for values.

The mainline DHT plugin for azureus uses method 2 atm.

We also have to consider packet sizes. The worst case would be nodes, nodes6 and values together. nodes + nodes6 with 16 node IDs, IPs and ports plus bencoding overhead would be about 576 bytes. That leaves us about 800 bytes for the values; assuming they're all IPv6, we can still put about 40 peers into a single get_peers response.

Seems reasonable.
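
For anyone who wants to check the arithmetic, here is a rough sketch of that budget. The 8 nodes per address family, the 1400-byte usable payload and the 64 bytes of bencoding overhead are assumptions chosen to match the figures above; none of them is fixed by the spec.

```python
# Back-of-the-envelope budget for a worst-case get_peers reply.
# Assumed, not specified: 8 nodes per family, 1400-byte payload,
# 64 bytes of bencoding overhead (keys, transaction id, token, ...).
NODE_ID_LEN = 20
V4_ENDPOINT = 4 + 2    # IPv4 address + port
V6_ENDPOINT = 16 + 2   # IPv6 address + port


def reply_budget(payload=1400, nodes_per_family=8, overhead=64):
    """Return (bytes used by nodes/nodes6, bytes left, IPv6 peers that fit)."""
    nodes = nodes_per_family * (NODE_ID_LEN + V4_ENDPOINT)
    nodes6 = nodes_per_family * (NODE_ID_LEN + V6_ENDPOINT)
    used = nodes + nodes6 + overhead
    left = payload - used
    # Each IPv6 peer in a bencoded list costs 18 bytes plus the "18:" prefix.
    per_peer = V6_ENDPOINT + len("18:")
    return used, left, left // per_peer


print(reply_budget())
```

With a smaller payload limit (tunnelled IPv6, for instance) the number of peers that fit drops accordingly; pass a different `payload` to see by how much.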


jch wrote:

> > i'd really like the spec to be changed so that nodes can officially omit the token as a way to indicate that they will not service a store request.
>
> No objection.  I won't be sending token-less replies in Transmission (I prefer to simply drop the announce_peers request), but I will be parsing token-less replies in the next version.

I would suggest otherwise, since a token-less reply would allow the node performing the store to remove you from the best-matches list and look for other nodes to perform the store on instead. Consider every one of the 8 closest nodes to a specific key silently dropping an announce: the data would get lost and the peer would be none the wiser.


Az dev

Offline

#3 2009-10-15 15:16:17

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

arvid wrote:

> I agree with 8472 that maybe some more generic name like "flags" would be preferred over "want". I'm not sure I like the idea of bloating it too much with actual strings, just because requests have more space to spare.

well, i suggested that out of experience with the reserved bitfield vs. named messages in bittorrent. The latter is a lot more flexible. We're using bencoding for a reason. Every time we don't, it bites us in the behind at some point.

Case in point: nodes6 is only necessary because nodes is not a list but a compact byte array.
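
To illustrate the point: a compact "nodes" string is just fixed-size records glued together, so the parser has to know the record size in advance. A minimal sketch, assuming the usual layout of a 20-byte ID followed by address and port (the function name is made up for this example):

```python
import ipaddress
import struct


def parse_compact_nodes(blob: bytes, addr_len: int):
    """Split a compact node string into (node_id, ip, port) tuples.

    addr_len is 4 for "nodes" and 16 for "nodes6". Nothing inside the
    blob says which one applies, hence the need for two separate keys.
    """
    entry = 20 + addr_len + 2
    if len(blob) % entry != 0:
        raise ValueError("length is not a multiple of the entry size")
    out = []
    for off in range(0, len(blob), entry):
        node_id = blob[off:off + 20]
        raw_ip = blob[off + 20:off + 20 + addr_len]
        (port,) = struct.unpack("!H", blob[off + 20 + addr_len:off + entry])
        out.append((node_id, ipaddress.ip_address(raw_ip), port))
    return out
```

Had "nodes" been a bencoded list of strings, each entry would carry its own length and both families could share one key.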


Az dev

Offline

#4 2009-10-15 16:43:33

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

> Once bootstrap is successful, a node should not request data in both
> address families; it should only request "nodes" replies from IPv4 nodes,
> and "nodes6" replies from IPv6 nodes.  Requesting both address families is
> unlikely to provide any useful information, and unnecessarily increases
> network traffic.

This does not seem sufficiently motivated to me. I would imagine that you would be able to decrease the amount of network bandwidth by interleaving and mixing requests to the two DHT networks, by requesting nodes from both address families on each find_node request.

It's also not obvious to me why "values" and "values6" ignore the "want" flags.

Offline

#5 2009-10-15 16:49:45

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

I would suggest the following change: discard the "values6" key, do not restrict the address families returned in "values", allow control over that through the "want" key, and strongly suggest that dual stack nodes share the node ID.

Index: bep_0032.rst
===================================================================
--- bep_0032.rst	(revision 11179)
+++ bep_0032.rst	(working copy)
@@ -105,7 +105,14 @@
 network outage, a node may *ocasionally* send a request for "nodes" to an
 IPv6 node, or a request for "nodes6" to an IPv4 node.
 
+Node id
+'''''''
 
+A dual stack node SHOULD have the same node-id for both DHT networks. This
+increases the overlap of IPv4 and IPv6 nodes which improves performance when
+requesting nodes from multiple address families in the same request.
+
+
 Changes and extensions to existing messages
 ===========================================
 
@@ -121,12 +128,11 @@
 IPv6 routing table.
 
 
-values6
-'''''''
+values
+''''''
 
-The "values6" parameter is allowed in replies to the get_peers message
-being sent over IPv6.  Its value is a list of strings, each of which
-contains compact format IPv6 contact information for a single peer.
+The "values" parameter is allowed to contain list entries for compact
+IPv6 endpoints.
 
 
 want
@@ -162,28 +168,12 @@
 include "nodes" if the request was sent over IPv4, and include "nodes6" if
 the request was sent over IPv6.
 
-For a get_peers request with no "want" parameter, value should only be
-included if there is no "values" or "values6" parameter.  If no "values" or
-"values6" parameter is being sent, the presence of "nodes" or "nodes6" is
-governed by the network-layer protocol of the request, as above.
+For a get_peers request with no "want" parameter, "values" should only include
+entries of the same address family as the address family the request was
+received over, unless the "want" flags indicate which address family the
+requesting node is interested in.
 
-   **Rationale**: it has been suggested that the reply to a get_peers
-   request should always include a "nodes" or "nodes6" parameter,
-   independently of the presence of a "values" or "values6" parameter.
-   While this is clearly a better semantics, it is not backward-compatible,
-   and it is not known whether it breaks any deployed implementations.
 
-When a node receives a get_peers request and it has contact information for
-the matching address family and info-hash, it should include either
-a "values" parameter (if the request was sent over IPv4) or a "values6"
-parameter in the reply.
-
-A reply sent over IPv4 must not contain a "values6" parameter, and a reply
-sent over IPv6 must not contain a "values" parameter.  In other words, the
-"want" parameter only governs the presence of the "nodes" and "nodes6"
-parameters.
-
-
 announce_peers
 ''''''''''''''

Offline

#6 2009-10-15 17:26:09

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

Restricting the address families in values makes sense insofar as the routing tables for the two DHTs will be of different sizes, and thus you'll end up with different nodes towards the end of a FIND_PEERS lookup, even if they're dualstack nodes.

Thus the nodes in the IPv6-FIND_PEERS lookup are rather unlikely to have a reasonable amount of v4 peers stored and vice versa.

Another point is that a dualstack node will do 2 lookups in parallel, one for each routing table, so there's no need for supplying both datasets over both routing tables.



But i'm not sure if mandating it is really necessary.


Az dev

Offline

#7 2009-10-15 17:34:10

jch
Member

Re: BEP 32 IPv6 support for the DHT

1. Same id in both DHTs

I agree with arvid's SHOULD, since while using different ids doesn't break anything, it makes things uselessly messy and potentially less efficient. If that's okay, I'll take most of arvid's prose (I'll tone down the end, since I'm not convinced about the efficiency argument -- see about DHT congruence below).

2. Sending nodes even when sending values

Adopted.

3. Flags vs. want

I'm a little unsure about this issue.  On the one hand, flexibility is good.  On the other hand, flexibility is bad, since it potentially breaks interoperability by creating gazillions of incompatible variants.

Clearly, there's a trade-off there: I'm looking for a syntax that does not prevent flexibility should we actually need it, but does not encourage everyone and his brother to create Yet Another Flag.

A string of single-octet flags seems like the right compromise to me; The8472 would appear to have tastes that are more on the flexibility side.  Is this a bikeshed argument we're having?

4. Receiving IPv6 values over IPv4 and vice-versa

You're only likely to meet an IPv4 node that can provide you with IPv6 values that are of interest to you if the two DHTs are mostly congruent, i.e. if the set of IPv4 nodes roughly coincides with the set of IPv6 nodes.  This is extremely unlikely to happen in practice.

Consider that in the near future, most nodes will have firewalled IPv4 (look up carrier-grade NAT if you're not convinced), and hence will not be participating in the IPv4 DHT; however, many of those nodes will have full IPv6 connectivity (Teredo or 6to4 are fine, you don't need native connectivity in order to accept unsolicited packets).

Hence, my feeling is that allowing both IPv4 and IPv6 values adds complexity uselessly.  I have no objection to adding such an extension, but I'd like to suggest that it be optional -- meaning that (1) it is possible to request both sets of nodes, but only the matching set of values; (2) by default, only the matching values are sent; and (3) a node is free to ignore the requestor and only send the matching values.

5. Requesting both sets of nodes on each find_nodes

This is only useful if you assume that the two DHTs are congruent, which I claim is very unlikely.  I will tone down the language, though.

6. Merging values and values6

Good point.  Adopted.

Offline

#8 2009-10-15 17:53:05

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

If you post a patch against http://www.bittorrent.org/beps/bep_0032.rst I can apply it to the published version.

If you use the same node ID for both networks, and you're one of the closest nodes to a certain info-hash on the IPv4 network, you're quite likely to be one of the closest nodes on the IPv6 network as well (assuming the IPv6 network is smaller). In which case peers would announce to you over both networks.

If a client doesn't support the DHT over IPv6, it can at least ask for IPv6 peers from the DHT easily (if you can control which address family you receive in a get_peers response).

Offline

#10 2009-10-15 18:33:54

jch
Member

Re: BEP 32 IPv6 support for the DHT

> If a client doesn't support the DHT over IPv6, it can at least ask for IPv6 peers from the DHT easily (if you can control which address family you receive in a get_peers response).

Only if the two DHTs are congruent.  If they are mostly disjoint (which, as I've argued before, is most likely to be the case), then not only can an IPv4 node not find the right nodes for the IPv6 information -- it most likely cannot even reach them.

FWIW, the new version makes the behaviour of "values" a should rather than a must (except in one case), and mandates that all nodes must be able to parse hybrid lists.  Which should be good enough to switch to a different behaviour in the future if it is found desirable (which I doubt).

Offline

#11 2009-10-15 18:45:29

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

> The extension described in this document also solves a long-standing issue with the protocol defined in BEP-5. The presence of contact information in the reply is no longer governed by the presence of a "values" key; this avoids the well-known problem of contact information being masked by peer information. While this is an incompatible change, it is believed to be properly handled by all deployed implementations.

I don't think this belongs in this BEP. It should be fixed in BEP-5 (I've posted a proposed change in that thread).

Offline

#12 2009-10-15 18:54:40

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

> Only if the two DHTs are congruent.  If they are mostly disjoint (which, as I've argued before, is most likely to be the case), then not only an IPv4 node cannot find the right nodes for the IPv6 information -- it most likely cannot even reach them.

I still don't see the argument for disallowing asking for IPv6 peers from an IPv4 node.

There's a relatively high probability that at least one of the 8 nodes that peers announce to will be a dual stack node. And I believe this probability will increase as Windows Vista and Windows 7 become more popular. Starting with Vista, Teredo is installed and enabled by default. uTorrent and libtorrent both make use of Teredo when it's enabled (and I believe most well-maintained clients would, simply by supporting IPv6).
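
To put a rough number on "relatively high": if a fraction p of DHT nodes is dual stack, and the 8 nodes closest to a key are treated as independent draws (an idealisation, and p itself is unknown), the chance that at least one of them is dual stack is 1 - (1 - p)^8. A small sketch with hypothetical values of p:

```python
def p_at_least_one_dual_stack(p: float, k: int = 8) -> float:
    """Probability that at least one of k independently chosen nodes is
    dual stack, given that a fraction p of all nodes is dual stack."""
    return 1.0 - (1.0 - p) ** k


# The fractions below are purely illustrative, not measurements.
for p in (0.05, 0.10, 0.25):
    print(f"p = {p:.2f}: {p_at_least_one_dual_stack(p):.3f}")
```

Even a modest dual stack share gives a good chance of hitting at least one such node among the 8, which is the intuition behind the argument above.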

> FWIW, the new version makes the behaviour of "values" a should rather than a must (except in one case), and mandates that all nodes must be able to parse hybrid lists.  Which should be good enough to switch to a different behaviour in the future if it is found desirable (which I doubt).

I think it's an overstatement to claim that something that an old client doesn't understand is a breakage of compatibility. By the same reasoning, introducing the "want" key would break compatibility.

Offline

#13 2009-10-16 01:35:58

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

Considering that for the time being most IPv6 nodes will actually be dual stack nodes, the IPv6 DHT will mostly be a subset of the IPv4 one. Thus congruence is not an issue, just node density, i.e. an IPv4 lookup will mostly terminate with more dualstack nodes closer to the key than the IPv6 lookup, simply because there are more IPv4 nodes.

I.e. the final set of each lookup will be closer to the key for IPv4 than it is for IPv6.


But the path towards your own ID (i.e. bootstrapping) has a decent chance of seeing dualstack nodes in both routing tables, as the bucket spacing is always the same (i.e. root bucket covers 50%, 2nd bucket covers 25% of the keyspace, etc.)
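
A small sketch of why the bucket spacing does not depend on the size of the network: a bucket is determined only by how many leading bits another ID shares with your own. The helper names are made up for this illustration, and a plain Kademlia-style layout is assumed.

```python
def bucket_index(own_id: bytes, other_id: bytes) -> int:
    """Number of leading bits shared by the two IDs (0 = root bucket)."""
    distance = int.from_bytes(own_id, "big") ^ int.from_bytes(other_id, "big")
    return len(own_id) * 8 - distance.bit_length()


def keyspace_fraction(index: int) -> float:
    """Share of the whole keyspace that falls into bucket `index`."""
    return 2.0 ** -(index + 1)


own = bytes(20)
far = bytes([0x80]) + bytes(19)     # differs from `own` in the very first bit
nearer = bytes([0x40]) + bytes(19)  # shares the first bit, differs in the second
print(bucket_index(own, far), keyspace_fraction(bucket_index(own, far)))
print(bucket_index(own, nearer), keyspace_fraction(bucket_index(own, nearer)))
```

Since a dual stack node with a single ID lands in the same bucket index in both routing tables, the shallow buckets of the two tables can overlap heavily even when the networks differ in size.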


Az dev

Offline

#14 2009-10-16 12:28:24

arvid
Administrator

Re: BEP 32 IPv6 support for the DHT

I would suggest changing all mentions of "double stack" to "dual stack".

I also want to bring up the feasibility of transitioning "values" to a single string rather than a list. I realize I was initially opposed to introducing "values6", but I think there is some merit in turning the peer list into a compact representation to save space. Currently, with both "values" and "nodes" in a get_peers response, it's not unlikely to exceed the MTU. Using a list instead of a compact string wastes 2 bytes for every 6 bytes returned (in the IPv4 case) and 3 bytes for every 18 bytes returned for IPv6.

For IPv4 it would mean being able to fit 33% more peers.
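
The numbers above follow from the bencoded string prefix ("6:" or "18:") that every list entry carries. A quick sketch of the calculation (the helper name is just for illustration):

```python
def list_entry_cost(endpoint_len: int) -> int:
    """Bytes used by one endpoint stored as a bencoded string:
    the decimal length, a colon, then the raw bytes."""
    return len(str(endpoint_len)) + 1 + endpoint_len


for family, size in (("IPv4", 6), ("IPv6", 18)):
    cost = list_entry_cost(size)
    gain = cost / size - 1
    print(f"{family}: {cost} bytes per list entry vs {size} compact, "
          f"{gain:.0%} more peers in the same space")
```

So the saving is proportionally larger for IPv4 than for IPv6, where the prefix is small next to the 18-byte endpoint.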

It could be made backwards compatible by specifying support for the compact format via flags.

Offline

#15 2009-10-16 18:45:48

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

Since we're talking about IPv6 here we should also take the less-than-1500 MTUs due to IPv6 transition mechanisms into account.


Az dev

Offline

#16 2009-10-17 06:09:00

jch
Member

Re: BEP 32 IPv6 support for the DHT

I still claim that there is no usage scenario for returning both sets of values:

1. Single stack nodes don't need both sets of values.
2. Double stack nodes perform searches in both DHTs in parallel, hence returning both sets of values will return redundant information.

So the only usage scenario is the one brought up by arvid, that of a double-stack node that hasn't implemented support for the IPv6 DHT, but does support parsing of IPv6 values in replies from the IPv4 DHT.  This is a very particular case, and I'm not particularly keen on complicating the protocol for this case.

(Additionally, The8472, when you argue that the IPv6 DHT is congruent to the IPv4 DHT, just smaller, you're assuming that IPv4 nodes are not behind NAT.  I hence claim that even that last scenario is not likely to work in practice.)

If you really insist on handling this case, please define an extension to optionally allow returning of both sets of values that would suit you.

Other than that, you're right about maximum packet size; I'll add some suitable language.  You're also right about dual stack, I'll change it throughout.

Finally, I'm not really willing to change values to compact form; being compatible with both adds complexity while only saving some 100 bytes at most.  I'm willing to reconsider if you provide a full specification of how to handle backwards compatibility.

Last edited by jch (2009-10-17 08:47:47)

Offline

#17 2009-10-17 08:46:35

jch
Member

Re: BEP 32 IPv6 support for the DHT

Diff:

  http://www.pps.jussieu.fr/~jch/software … -2-3.patch

Full text:

  http://www.pps.jussieu.fr/~jch/software … t-ipv6.rst
  http://www.pps.jussieu.fr/~jch/software … -ipv6.html

The good news is that we appear to be agreeing on single-stack operation, both for IPv4 and for IPv6.  We still need to agree on interaction between the two stacks:

  * name and syntax of the "want"/"flags" parameter;
  * whether to allow mixed values.

Offline

#18 2009-10-17 11:17:28

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

I think parsing both ipv4 and ipv6 values is trivial if we use non-compact lists and as such we don't have to mandate that nodes return only v4 or v6 addresses. It's merely a best-practice thing to do considering that clients have to be able to decode both anyway. I bet there'll be some stray packets intended for v4 in the v6 DHT and vice versa due to sub-optimal implementations anyway, so clients will have to deal with it on decoding.
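
To show how little is needed, here is a minimal sketch of a decoder for a mixed list. It assumes each entry is a 6-byte (IPv4) or 18-byte (IPv6) compact endpoint, and that entries of any other length are simply skipped, which is a policy choice of this example rather than something agreed in this thread.

```python
import ipaddress
import struct


def parse_values(values):
    """Decode a "values" list that may mix IPv4 and IPv6 endpoints.

    The length of each bencoded string tells the two families apart,
    so no extra key or flag is needed.
    """
    peers = []
    for entry in values:
        if len(entry) == 6:
            addr_len = 4
        elif len(entry) == 18:
            addr_len = 16
        else:
            continue  # unknown format: ignore rather than fail
        ip = ipaddress.ip_address(entry[:addr_len])
        (port,) = struct.unpack("!H", entry[addr_len:])
        peers.append((ip, port))
    return peers
```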


As for the want/flags. I prefer a bencoded list of short strings as flags over some "new" encoding scheme that's nowhere else used in bittorrent for aesthetic and flexibility reasons. Especially considering that request packets are naturally rather small and that those flags are generally optional since the default behavior without such flags still should be reasonable (otherwise our spec would be shoddy).
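
For a sense of scale, this sketch bencodes a get_peers query carrying such a list. The flag strings "n4" and "n6", the transaction id and the placeholder IDs are illustrative choices for this example; the thread has not settled on names.

```python
def bencode(x) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(x, int):
        return b"i%de" % x
    if isinstance(x, bytes):
        return b"%d:%s" % (len(x), x)
    if isinstance(x, str):
        return bencode(x.encode())
    if isinstance(x, list):
        return b"l" + b"".join(bencode(i) for i in x) + b"e"
    if isinstance(x, dict):
        # Dictionary keys must be emitted in sorted order.
        items = sorted((k.encode() if isinstance(k, str) else k, v)
                       for k, v in x.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(x).__name__}")


want = ["n4", "n6"]  # hypothetical flag names
query = {
    "t": b"aa",
    "y": "q",
    "q": "get_peers",
    "a": {"id": bytes(20), "info_hash": bytes(20), "want": want},
}
print(bencode(want), len(bencode(want)))
print(len(bencode(query)))
```

The list itself costs only a handful of bytes in a request that is far below any MTU, which is the point about requests having space to spare.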


And another serious issue came to mind: Multihomed IPv6 hosts. This is a serious problem with nodes that have native v6/6to4/tunnels and ipv4/teredo. Most routing configurations will prefer native/6to4/tunnels by default but will use teredo instead if contacting another teredo node. This means that different DHT nodes will see different IPs associated with the same ID, which can generally lead to conflicts in the routing table management.

So i would suggest that DHTv6 ports should be explicitly bound to a single public IPv6 address, a non-teredo one if possible.

I.e. binding to [::] should be discouraged.


Az dev

Offline

#19 2009-10-17 13:11:17

jch
Member

Re: BEP 32 IPv6 support for the DHT

> I think parsing both ipv4 and ipv6 values is trivial if we use non-compact lists and as such we don't have to mandate that nodes return only v4 or v6 addresses.

Check the section about "values" -- that's already the case.

> As for the want/flags. I prefer a bencoded list of short strings as flags over some "new" encoding scheme that's nowhere else used in bittorrent for aesthetic and flexibility reasons.

Okay, I'll switch to such a list, I don't care that much.  "want" or "flags"?  (I prefer "want", but don't care much.)

> I.e. binding to :: should be discouraged.

Could you suggest some suitable prose?

Additionally, we should mention something about the PORT message.

Offline

#20 2009-10-17 19:42:29

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

jch wrote:

> > I think parsing both ipv4 and ipv6 values is trivial if we use non-compact lists and as such we don't have to mandate that nodes return only v4 or v6 addresses.
>
> Check the section about "values" -- that's already the case.

Excellent.

jch wrote:

> > As for the want/flags. I prefer a bencoded list of short strings as flags over some "new" encoding scheme that's nowhere else used in bittorrent for aesthetic and flexibility reasons.
>
> Okay, I'll switch to such a list, I don't care that much.  "want" or "flags"?  (I prefer "want", but don't care much.)

And i'm leaning towards "flags", but it's just an issue of semantics, not really important.

jch wrote:

> Could you suggest some suitable prose?

let's see...


IPv6 address binding

The IPv6 DHT should use a socket bound to one of the host's global unicast IPv6 addresses instead of the "unspecified address" (::/128).
When selecting which address to bind to, the client should avoid Teredo addresses (2001:0000::/32) if other global unicast addresses are available.

Rationale: The DHT relies on the publicly visible IP address of each node being the same for all contacts, in order to maintain its oldest-alive lists in the routing tables and to match replies to requests. Without binding, this consistency can be violated, as the operating system may choose different source addresses for outgoing packets based on the destination address, routes and prefix policies.
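
A minimal sketch of that selection rule, assuming the client already has a list of the host's IPv6 addresses from somewhere (how to enumerate them is platform-specific and left out). The function name and the "first match wins" policy are choices of this example only.

```python
import ipaddress

GLOBAL_UNICAST = ipaddress.IPv6Network("2000::/3")
TEREDO = ipaddress.IPv6Network("2001::/32")


def pick_bind_address(candidates):
    """Return the preferred IPv6Address to bind the DHT socket to, or None.

    Prefers non-Teredo global unicast addresses and falls back to a
    Teredo address only when nothing else is available.
    """
    addrs = [ipaddress.IPv6Address(a) for a in candidates]
    usable = [a for a in addrs if a in GLOBAL_UNICAST]
    preferred = [a for a in usable if a not in TEREDO]
    pool = preferred or usable
    return pool[0] if pool else None


# Example with documentation and link-local addresses; a real client would
# then bind a UDP socket to (str(chosen), port) instead of ("::", port).
chosen = pick_bind_address(["fe80::1", "2001:0:4136:e378:8000:63bf:3fff:fdd2", "2001:db8::1"])
print(chosen)
```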





jch wrote:

> Additionally, we should mention something about the PORT message.

Mhhh, peers supporting LTEP also exchange their ipv4 and/or ipv6 addresses. So the port can be assumed to be valid for both if they do. Of course that has to be verified in case the BT implementation is ipv6-aware and the DHT isn't.


Az dev

Offline

#21 2009-10-19 10:02:27

jch
Member

Re: BEP 32 IPv6 support for the DHT

> When selecting to which address to bind the client should avoid teredo addresses (2001:0000::/32) if other global unicast addresses are available.

I'll omit this bit, since I happen to disagree with it.  Let's leave this at the implementation's discretion.

> Mhhh, peers supporting LTEP also exchange their ipv4 and/or ipv6 addresses. So the port can be assumed to be valid for both if they do. Of course that has to be verified in case the BT implementation is ipv6-aware and the DHT isn't.

And that's the very reason why I'm leaning towards mandating that the PORT message is only valid for the address family it was sent over, but that double-stack implementations are strongly encouraged to run both DHTs on the same port number.

Offline

#22 2009-10-19 12:18:38

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

jch wrote:

> > When selecting to which address to bind the client should avoid teredo addresses (2001:0000::/32) if other global unicast addresses are available.
>
> I'll omit this bit, since I happen to disagree with it.  Let's leave this at the implementation's discretion.

Decisions should be justified and not unilateral, so can you provide any reason for that? Teredo has a lower MTU (more encapsulation layers) and is potentially less reachable due to NAT traversal than other IPv6 mechanisms.

Microsoft even ranks teredo dead last in its prefix policies.


jch wrote:

> > Mhhh, peers supporting LTEP also exchange their ipv4 and/or ipv6 addresses. So the port can be assumed to be valid for both if they do. Of course that has to be verified in case the BT implementation is ipv6-aware and the DHT isn't.
>
> And that's the very reason why I'm leaning towards mandating that the PORT message is only valid for the address family it was sent over, but that double-stack implementations are strongly encouraged to run both DHTs on the same port number.

Just have the DHT implementation send 2 pings, one to the IPv4 address and one to the IPv6 address. The replies (if any) will automatically add them to the routing tables.

I don't see a reason why they should run on different ports, considering that they operate in tandem. For the sake of simplicity and codifying reasonable assumptions i would say that we could put that into the spec.


Az dev

Offline

#24 2009-10-29 10:50:39

jch
Member

Re: BEP 32 IPv6 support for the DHT

> can you provide any reason for that? Teredo has a lower MTU (more encapsulation layers) and is potentially less reachable due to NAT traversal than other IPv6 mechanisms.

Teredo is potentially more efficient than native, especially when speaking to other Teredo nodes.  (Read the spec, it's a very clever protocol.)

However, Teredo tends to be brittle, which is why I've finally decided to put your suggestion in the draft.  Let me know if you're happy with the rationale.

As to the PORT message -- I'm still not convinced.  Let me know what you think of the rationale I've put in the draft.

--Juliusz

Offline

#25 2009-10-29 17:55:48

The 8472
Azureus Developer

Re: BEP 32 IPv6 support for the DHT

port message: this might be an issue with source address binding. but i can't think of a good fix that doesn't use ltep.

"find_nodes and get_peers" chapter: needs a fix regarding the want flags. they're not bytes anymore.

> Teredo is potentially more efficient than native, especially when speaking to other Teredo nodes.

the same could be said about 6to4, with less overhead. Although it's less widespread due to all those nat routers.

Anyway, i'm happy with that part as it is.


Az dev

Offline
