Thursday, October 30, 2008

Looking up Data in P2P Systems

The authors of this (the same authors who came up with Chord, and perhaps also (Tapestry) present a bit of a "state of the field" in distributed hash tables. Although a short paper in CACM, they actually go into some detail about four different DHT algorithms: Chord, Tapestry, Pastry, and Kademlia. It's an interesting read to hear about the differences, such as the distance functions, where they identify the most important properties as symmetry and unidirectionality. They also look briefly at other lookup ideas, like tree routing and CAN (not really unrelated.)

They note that dealing with concurrent and asynchronous failures and joins is a challenge that is hard to address in all of these systems, although Chord presents some formal proofs of resilience under failures. I think that the most interesting piece of future work that they note is the "proximity routing", which looks like an attempt to choose "fingers" which are nearby so as to minimize lookup latency. This seems like only one way that you might incorporate topological information into the hash table's decision, but the issue of trying to generate an overlay which respects an underlying topology seems like a good research direction.

No comments: