cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "AntiEntropy" by JonathanEllis
Date Thu, 11 Feb 2010 14:53:50 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "AntiEntropy" page has been changed by JonathanEllis.
The comment on this change is: wtf? moin ate Stu's update. restored via gmail backup.
http://wiki.apache.org/cassandra/AntiEntropy?action=diff&rev1=3&rev2=4

--------------------------------------------------

  Cassandra's implementation is modeled on Dynamo's, with [[ArchitectureAntiEntropy|modifications
to support the richer data model]].  Quoting from [[http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html|Amazon's
Dynamo]] section 4.7,
  
-  . To detect the inconsistencies between replicas faster and to minimize the amount of transferred
data, Dynamo uses Merkle trees. A Merkle tree is a hash tree where leaves are hashes of the
values of  individual keys. Parent nodes higher in the tree are hashes of their respective
 children. The principal advantage of Merkle tree is that each branch of the tree  can be
checked independently without requiring nodes to download the entire  tree or the entire data
set. Moreover, Merkle trees help in reducing the amount of  data that needs to be transferred
while checking for inconsistencies among  replicas. For instance, if the hash values of the
root of two trees are equal,  then the values of the leaf nodes in the tree are equal and
the nodes require no synchronization. If not, it implies that the values of some replicas
are different. In such cases, the nodes may exchange the hash values of  children and the
process continues until it reaches the leaves of the trees, at which  point the hosts can
identify the keys that are “out of sync”. Merkle trees  minimize the amount of data that
needs to be transferred for synchronization and  reduce the number of disk reads performed
during the anti-entropy process.
+   . To detect the inconsistencies between replicas faster and to minimize the amount of
transferred data, Dynamo uses Merkle trees. A Merkle tree is a hash tree where leaves are
hashes of the values of  individual keys. Parent nodes higher in the tree are hashes of their
respective  children. The principal advantage of Merkle tree is that each branch of the tree
 can be checked independently without requiring nodes to download the entire [...] data set.
  
-  . Dynamo uses Merkle trees for anti-entropy as follows: Each node maintains a separate
Merkle tree for each key range (the set of keys  covered by a virtual node) it hosts. This
allows nodes to compare whether the keys  within a key range are up-to-date. In this scheme,
two nodes exchange the root  of the Merkle tree corresponding to the key ranges that they
host in common.  Subsequently, using the tree traversal scheme described above the nodes determine
if  they have any differences and perform the appropriate synchronization action.
+ The key difference in Cassandra's implementation of anti-entropy is that the Merkle trees
are built per column family, and they are not maintained for longer than it takes to send
them to neighboring nodes. Instead, the trees are generated as snapshots of the dataset during
major compactions: this means that excess data might be sent across the network, but it saves
local disk IO, and is preferable for very large datasets.
  

Mime
View raw message