incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jonathan Ellis <>
Subject Re: Load balancing - Order preserving partition - New Keys prob.
Date Thu, 16 Sep 2010 20:03:57 GMT
If new keys don't follow roughly the same distribution as the ones you
already have, then yes, you're going to be load balancing a lot.

On Thu, Sep 16, 2010 at 2:24 AM, Hien. To Trong <> wrote:
> Hi all,
> Thanks you for your comments.
> I already read some papers about load balancing in P2P which some of them
> allow range query.
> During this time, I find another problem: "NEW KEYS"
> Karger - Simple efficient load balancing algorithms for p2p systems.
> (support OPP)
> John Byers - Simple load balancing for distributed hash table.
> Third: Ananth Rao - Load balancing in structured P2P systems.
> I think the first paper is the best to get the background.
> Prefix Hash Tree: An Indexing Data Structure over Distributed Hash Tables
> and Range queries over DHTs
> These paper use PHT to allow range query.
> James Aspnes - Skip Graphs (most recent as I known)
> and Distributed Balanced Tables, Not making a hash of it All.
> All of these papers deal with load balancing and range query probs by
> creating schemes or strategies
> based on only "load - number of key on each nodes", not care about new keys
> and highly-accessed keys.
> HOWEVER, in fact, "the most recent keys are likely accessed more
> frequently".
> I suppose, we have 400.000 "NEW KEYS" in 2 recent days (are likely accessed
> more frequently). --> A scheme: these new keys are uniformly partitioned and
> divided into some successive nodes. For example, there are 10 node
> (N1...N10), keys in day 1 will be put into node_1_2, keys in day 2 will be
> put into node_3_4 and so on... But, the prob is that number of keys and
> nodes increase/decrease day by day (not fixed) --> the number of node used
> to store keys in each day may increase/decrease (1 or 3 node for example).
> Does any one have any ideas or know papers to deal with this prob (new keys
> and highly-accessed keys)?
> Thanks a lot.

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support

View raw message