incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: clarification on 100k tombstone limit in indexes
Date Wed, 13 Aug 2014 15:48:26 GMT
On Wed, Aug 13, 2014 at 4:35 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> "add an additional integer column to the partition key (making it a
> composite partition key if it isn't already).  When inserting, randomly
> pick a value between, say, 0 and 10 to use for this column"  --> Due to the
> low cardinality of bucket (only 10), there is no guarantee that the
> partitions would be distributed evenly. But it's better than nothing.
>

It's important to think about it probabilistically, i.e. "what is the
probability that all ten partitions belong to the same node?"  If you have
a ten node cluster (assume RF=1 for simplicity), there's a 1/10^9 (one in a
billion) chance that a single node is the owner for all partitions.  So
it's quite a bit better than nothing.  If you want to improve your odds,
bump the number up.  But, keep in mind that it's a balance, because reads
become more expensive.
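
The arithmetic above can be checked with a short sketch. It assumes each
bucket's partition lands on a node independently and uniformly at random,
with RF=1; the function name and parameters are illustrative and not part
of any Cassandra API.

```python
from fractions import Fraction


def prob_all_on_one_node(num_nodes: int, num_buckets: int) -> Fraction:
    """Probability that every bucket partition is owned by the same node.

    Assumes independent, uniform placement of each partition and RF=1.
    Any of the num_nodes nodes could be the unlucky owner, and each
    partition lands on that node with probability 1/num_nodes.
    """
    return Fraction(num_nodes, num_nodes ** num_buckets)


if __name__ == "__main__":
    # Ten nodes, ten buckets: 10 * (1/10)**10 = 1/10**9
    print(prob_all_on_one_node(10, 10))
    # More buckets improve the odds, at the cost of more partitions per read.
    print(prob_all_on_one_node(10, 20))
```

Under the same assumptions, raising the bucket count shrinks this
probability quickly, but each logical read then has to cover that many
partitions.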


>
> "Alternatively, instead of using a random number, you could hash the
> other key components and use the lowest bits for the value.  This has the
> advantage of being deterministic" --> Does it work with VNodes, where
> tokens are split in 256 ranges and shuffled in all nodes ?
>

Yes, it works perfectly fine with vnodes.
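
A minimal sketch of the deterministic variant, computed client-side before
the insert. The bucket value becomes part of the partition key, which the
partitioner hashes as a whole to pick a token, so token-range assignment
does not affect it. The names `NUM_BUCKETS` and `bucket_for` are
hypothetical, and MD5 is just one convenient stable hash; Python's
built-in `hash()` is avoided because it is salted per process for strings.

```python
import hashlib

# Hypothetical bucket count; a power of two so the low bits can be masked off.
NUM_BUCKETS = 16


def bucket_for(*key_components) -> int:
    """Derive a stable bucket number from the other partition key components."""
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    joined = "\x00".join(str(c) for c in key_components).encode("utf-8")
    digest = hashlib.md5(joined).digest()
    # Keep only the lowest bits of the hash as the bucket value.
    return int.from_bytes(digest, "big") & (NUM_BUCKETS - 1)


if __name__ == "__main__":
    # The same key components always map to the same bucket.
    print(bucket_for("user42", "2014-08-13"))
```

Because the bucket can be recomputed from the other key components, a
reader that knows them can go straight to the right partition instead of
querying every bucket.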


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
