incubator-cassandra-user mailing list archives

From Daniel Doubleday <daniel.double...@gmx.net>
Subject Re: Second Cassandra users survey
Date Tue, 08 Nov 2011 09:29:01 GMT
Ah cool - thanks for the pointer!

On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote:

> This is basically what entity groups are about -
> https://issues.apache.org/jira/browse/CASSANDRA-1684
> 
> On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin <woolfel@gmail.com> wrote:
>> This feature interests me, so I thought I'd add some comments.
>> 
>> Having used partition features in existing databases like DB2, Oracle
>> and manual partitioning, one of the biggest challenges is keeping the
>> partitions balanced. What I've seen with manual partitioning is that
>> often the partitions get unbalanced. Usually the developers take a
>> best guess and hope it ends up balanced.
>> 
>> Some of the approaches I've used in the past were zip code, area code,
>> state and some kind of hash.
>> 
>> So my question related to deterministic sharding is this, "what rebalance
>> feature(s) would be useful or needed once the partitions get
>> unbalanced?"
>> 
>> Without a decent plan for rebalancing, it often ends up being a very
>> painful problem to solve in production. Back when I worked mobile
>> apps, we saw issues with how OpenWave WAP servers partitioned the
>> accounts. The early versions randomly assigned a phone to a server
>> when it was provisioned the first time. Once the phone was associated
>> with that server, it was stuck on that server. If the load on that
>> server was heavier than the others, the only choice was to "scale up"
>> the hardware.
>> 
>> My understanding is that Cassandra's current sharding is consistent and
>> random. Does the new feature sit somewhere in between? Are you
>> thinking of a pluggable API so that you can provide your own hash
>> algorithm for cassandra to use?
>> 
>> 
>> 
>> On Mon, Nov 7, 2011 at 7:54 AM, Daniel Doubleday
>> <daniel.doubleday@gmx.net> wrote:
>>> Allow for deterministic / manual sharding of rows.
>>> 
>>> Right now it seems that there is no way to force rows with different row keys
>>> to be stored on the same nodes in the ring.
>>> This is our number one reason why we get data inconsistencies when nodes fail.
>>> 
>>> Sometimes a logical transaction requires writing rows with different row keys.
>>> If we could use something like this:
>>> 
>>> prefix.uniquekey and let the partitioner use only the prefix, the probability
>>> that only part of the transaction would be written could be reduced considerably.
>>> 
>>> 
>>> 
>>> On Nov 1, 2011, at 11:59 PM, Jonathan Ellis wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Two years ago I asked for Cassandra use cases and feature requests.
>>>> [1]  The results [2] have been extremely useful in setting and
>>>> prioritizing goals for Cassandra development.  But with the release of
>>>> 1.0 we've accomplished basically everything from our original wish
>>>> list. [3]
>>>> 
>>>> I'd love to hear from modern Cassandra users again, especially if
>>>> you're usually a quiet lurker.  What does Cassandra do well?  What are
>>>> your pain points?  What's your feature wish list?
>>>> 
>>>> As before, if you're in stealth mode or don't want to say anything in
>>>> public, feel free to reply to me privately and I will keep it off the
>>>> record.
>>>> 
>>>> [1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
>>>> [2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
>>>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>>>> 
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of DataStax, the source for professional Cassandra support
>>>> http://www.datastax.com
>>> 
>>> 
>> 
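The prefix.uniquekey idea quoted above can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the names token_for and node_for, the "." separator, and the four-node RING are hypothetical, and this is not Cassandra's partitioner API. MD5 is used only because RandomPartitioner is also MD5-based; the token arithmetic is simplified.

```python
import hashlib
from bisect import bisect_left

# Hypothetical ring: token -> node name, four evenly spaced tokens
# over the 128-bit MD5 output space.
RING = {i * (2**128 // 4): f"node{i + 1}" for i in range(4)}


def token_for(row_key: str, prefix_only: bool = True) -> int:
    """Hash either the part before the first '.' or the whole row key."""
    part = row_key.split(".", 1)[0] if prefix_only else row_key
    digest = hashlib.md5(part.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def node_for(row_key: str, ring: dict, prefix_only: bool = True) -> str:
    """Return the node owning the first token >= the key's token, wrapping around."""
    tokens = sorted(ring)
    idx = bisect_left(tokens, token_for(row_key, prefix_only)) % len(tokens)
    return ring[tokens[idx]]


# Rows of one logical transaction share the prefix "order42",
# so they hash to the same token and land on the same node.
keys = ["order42.header", "order42.line1", "order42.line2"]
owners = {node_for(k, RING) for k in keys}
print(owners)
```

With prefix_only=False the same three keys are hashed in full and may be spread across several nodes, which is the situation described in the original request. Note that concentrating keys by prefix also makes the balancing concern raised earlier in the thread more acute, since a popular prefix maps entirely to one token.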

