cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-192) Load balancing
Date Thu, 28 May 2009 01:45:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713827#action_12713827 ]

Jonathan Ellis commented on CASSANDRA-192:
------------------------------------------

Hopefully nobody writes about that because it is the easy part. :)

Here is a scheme I think would work:

If node B is assuming part of the range node A is currently responsible for, then we do it as
follows:

Node A notes the range it is transferring and begins anticompaction of its SSTables.  It continues
to accept reads and writes for that range, but also forwards writes to B so that anticompaction
doesn't need to worry about these extra writes.  When anticompaction is done, A sends the
appropriate sections over to B.  When the transfer is complete, B begins to gossip its adjusted
token, and A creates a task that will remove the sections B now has at some point in the future
(after we can assume the entire cluster has gotten the new token info -- minutes or hours, not days).
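
Roughly, A's side of the hand-off could look like the sketch below.  All of the names here
(RangeTransfer, StorageProxy.forwardWrites, CompactionManager.anticompact, Streaming.transferTo,
ScheduledTasks.schedule, SSTableSection) are made up for illustration, not existing code:

    import java.io.IOException;
    import java.net.InetAddress;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    // Illustration only: Range, SSTableSection, StorageProxy.forwardWrites,
    // CompactionManager.anticompact, Streaming.transferTo and ScheduledTasks.schedule
    // are hypothetical stand-ins, not existing Cassandra classes or methods.
    public class RangeTransfer
    {
        /** Runs on node A to hand the given portion of its range over to node B. */
        public void handOff(final Range range, final InetAddress nodeB) throws IOException
        {
            // Keep serving reads and writes for the range, but mirror new writes to B
            // so anticompaction doesn't have to worry about data written meanwhile.
            StorageProxy.forwardWrites(range, nodeB);

            // Split the data belonging to the range out of A's SSTables.
            final List<SSTableSection> sections = CompactionManager.anticompact(range);

            // Stream the split-out sections over to B; once B has them it starts
            // gossiping its adjusted token.
            Streaming.transferTo(nodeB, sections);

            // Only remove A's copy after the whole cluster can be assumed to have
            // seen the new token -- minutes or hours later, not days.
            ScheduledTasks.schedule(new Runnable()
            {
                public void run()
                {
                    for (SSTableSection section : sections)
                        section.delete();
                }
            }, TimeUnit.HOURS.toMillis(1));
        }
    }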

To avoid garbage on either side in the event of a crash before the process is complete, we
can add a check during compaction that throws away data that is not part of the current node's
responsibility (or replica ranges).
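
As a rough sketch of that check (again, DecoratedKey here is just a key plus its token, and
StorageService.getLocalRanges and Range.contains are placeholders for whatever we end up with):

    import java.util.Collection;

    // Illustration only: DecoratedKey, Range and StorageService.getLocalRanges are
    // placeholders for "which ranges is this node responsible for (or a replica for)?".
    public class CompactionFilter
    {
        /** Returns true if compaction should keep the row for this key, false to drop it. */
        public boolean shouldKeep(DecoratedKey key)
        {
            Collection<Range> owned = StorageService.getLocalRanges(); // primary + replica ranges
            for (Range range : owned)
            {
                if (range.contains(key.token))
                    return true;
            }
            // Key falls outside every range this node is responsible for: it's leftover
            // from an interrupted transfer, so it's safe to throw away during compaction.
            return false;
        }
    }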

This glosses over a bunch of details (do we anticompact memtables too, or just flush first
and let the sstable code go over it?  what if B is already part of the ring that gets replicas
for the range in question?) but I think the basic idea is sound.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

