5 Feb 2022

Consistent Hashing: Chord

Reference paper:

Summary

One of the fundamental problems in peer-to-peer (P2P) applications is to efficiently locate a particular node in the network. Stoica et al. present the design for a scalable and efficient P2P distributed lookup protocol named Chord. The authors first give an overview of consistent hashing which is widely used in such scenarios. They highlight the scalability issues of consistent hashing that are caused by each node having to maintain the state of all of the other nodes in the network. In order to overcome this limitation, they propose the use of distributed hash-based indexing and routing algorithms that allow Chord to handle a large number of nodes and data lookups efficiently. In the proposed protocol, each node only needs to maintain a reference to its immediate successor and only a subset of all other nodes in the network, leading to better scalability. Chord is also able to handle node join and leave operations without requiring significant changes to the network’s structure, which contributes to its robustness in dynamic networks. The authors provide a thorough analysis of the performance of Chord and show that it performs well in terms of time complexity that scales logarithmically with the number of Chord nodes. Lastly, the authors also prove the fault-tolerance capabilities of Chord through experiments.

Positive Points

  • The authors prefaced each experiment in section 6 with a motivation followed by experiment details and the results. This provided a deeper understanding of what questions the experiments were meant to answer, for instance, the experiments showing the distribution of keys per node in the server were prefaced by the motivation and hence the results lead to a deeper understanding of the system.
  • The authors explained the protocol design and implementation very clearly by providing sufficient illustrations and pseudo-code. Specifically, the illustration in figure 5 showing how the protocol handles node joins followed by the pseudo-code in figure 6 made the author’s ideas easy to follow.
  • Apart from introducing a novel protocol, the authors are spent effort in discussing/demonstrating the fault-tolerance aspects of their design which made it feasible for their protocol to be adapted into practical applications.

Drawbacks

  • The algorithm used in the paper for generating hashes (SHA-1) has been proven to be insecure, thereby making this version of Chord more vulnerable to security risks. Although, at the time the paper was published SHA-1 was a commonly used hashing algorithm.
  • The paper fails to mention how many messages are required to properly initialize the state of the “ring” of m nodes when it first starts up. Further information about how to properly set-up the protocol would have made it easier to adopt Chord into practical applications.
  • Security, despite being a major consideration in P2P networks where virtually any node can connect and start transmitting messages, was only discussed briefly in the paper. The authors could have conducted experiments to test the solutions to security issues proposed in the last section.

Research Questions

  1. Chord does not use network topology information to setup the connections between peers. Can we incorporate location information for each node to provide better latency? – eg. Tapestry protocol, Pastry protocol etc.
  2. Lookups in chord are asymmetric because of the way connections between nodes is set up. For example, lookup from node n to node p could take a different number of hops than from node p to n. Could making symmetric connections (nodes p and n both point to each other) improve predictability?
  3. The concept of unidirectional routing in Chord can be misused during a routing attack (where a malicious node purposely detours messages to increase path length). Can we modify the algorithm to include bi-directional edges and bi-directional finger table entries to prevent this?

Tags: