cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-192) Load balancing
Date Thu, 21 May 2009 19:46:45 GMT
Load balancing
--------------

                 Key: CASSANDRA-192
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
             Project: Cassandra
          Issue Type: New Feature
            Reporter: Jonathan Ellis
             Fix For: 0.4


We need to be able to spread load evenly across a cluster, to compensate both for keys that
are not uniformly distributed and for heterogeneous nodes.  The former is particularly likely
to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized
by a hash function.
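
To make the skew concrete, here is a minimal Python sketch (not Cassandra code). It assumes a hypothetical 4-node cluster in which each node owns an equal slice of the token space. The order-preserving placement is simplified to look only at the first byte of the key, and MD5 stands in for the hash function. All function names are illustrative.

```python
import hashlib
from collections import Counter

NUM_NODES = 4  # hypothetical cluster size, each node owning an equal token range


def order_preserving_node(key: str) -> int:
    """Place a key by its raw first byte (simplified order-preserving placement)."""
    first_byte = key.encode("utf-8")[0] if key else 0
    return first_byte * NUM_NODES // 256


def hashed_node(key: str) -> int:
    """Place a key by its MD5 digest, which randomizes its position in the token space."""
    token = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return token * NUM_NODES // (1 << 128)


def load_per_node(keys, placement):
    """Count how many keys each node receives under the given placement function."""
    counts = Counter(placement(k) for k in keys)
    return [counts.get(node, 0) for node in range(NUM_NODES)]


# Keys sharing a common prefix, as application keys often do.
keys = [f"user{i:05d}" for i in range(10_000)]
ordered_load = load_per_node(keys, order_preserving_node)
hashed_load = load_per_node(keys, hashed_node)
print("order-preserving:", ordered_load)
print("hashed:          ", hashed_load)
```

With these keys every order-preserving placement lands on the same node, because all keys start with the same byte, while the hashed placement spreads them roughly evenly.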

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are:
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing
Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured
P2P Systems by Ananth Rao et al.)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing
for Distributed Hash Tables by John Byers et al.) is not applicable to Cassandra's design.
 ("First, we suggest the direct application of the 'power of two choices'
paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives.
We then consider how associating a small constant number of hash values with a key can naturally
be extended to support other load balancing strategies.")
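
For reference, the "power of two choices" idea quoted above can be sketched in a few lines of Python. This only illustrates the paradigm from the abstract, which this issue considers not applicable to Cassandra's design. The node count, item count, seed, and function name are assumptions made for the example.

```python
import random


def place_items(num_items, num_nodes, choices, rng):
    """Store each item on the least loaded of `choices` randomly picked nodes."""
    loads = [0] * num_nodes
    for _ in range(num_items):
        candidates = [rng.randrange(num_nodes) for _ in range(choices)]
        target = min(candidates, key=lambda node: loads[node])
        loads[target] += 1
    return loads


# Hypothetical sizes: 10,000 items over 100 nodes.
one_choice = place_items(10_000, 100, 1, random.Random(0))
two_choices = place_items(10_000, 100, 2, random.Random(0))
print("one choice : min", min(one_choice), "max", max(one_choice))
print("two choices: min", min(two_choices), "max", max(two_choices))
```

With a single random choice the per-node load varies noticeably around the mean of 100 items. Picking the less loaded of two random candidates typically narrows that spread considerably.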

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

