incubator-cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: clarification on 100k tombstone limit in indexes
Date Wed, 13 Aug 2014 09:35:09 GMT
"add an additional integer column to the partition key (making it a
composite partition key if it isn't already).  When inserting, randomly
pick a value between, say, 0 and 10 to use for this column"  --> Because the
bucket has low cardinality (only 10 values), there is no guarantee that the
partitions will be distributed evenly across the nodes. But it's better than
nothing.
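
To make the write/read pattern concrete, here is a minimal sketch in Python. It
only simulates the table with an in-memory dict; the names (table, write, read,
NUM_BUCKETS) are hypothetical, and I am assuming 10 buckets numbered 0 to 9 so
that a read touches exactly 10 partitions.

```python
import random
from collections import defaultdict

NUM_BUCKETS = 10  # assumption: buckets 0..9, so a read merges 10 partitions

# Stand-in for a table whose partition key is (minute_bucket, bucket).
# In a real cluster each distinct key pair would be its own partition.
table = defaultdict(list)


def write(minute_bucket, item_id, rng=random):
    """Append an item under a randomly chosen sub-bucket of the minute."""
    bucket = rng.randrange(NUM_BUCKETS)
    table[(minute_bucket, bucket)].append(item_id)


def read(minute_bucket):
    """Read every sub-partition for the minute and merge the results."""
    merged = []
    for bucket in range(NUM_BUCKETS):
        merged.extend(table.get((minute_bucket, bucket), []))
    return sorted(merged)
```

Writes for a single minute are spread over up to NUM_BUCKETS partitions, and
the reader pays for that with NUM_BUCKETS lookups plus a merge.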

"Alternatively, instead of using a random number, you could hash the other
key components and use the lowest bits for the value.  This has the
advantage of being deterministic" --> Does it work with vnodes, where
tokens are split into 256 ranges and shuffled across all nodes?
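
For reference, this is how I understand the deterministic variant, as a rough
Python sketch. The function name bucket_for, the choice of CRC32 as the hash,
and the 3-bit width (8 buckets) are all my assumptions, not something from
Tyler's message. The bucket computed here would only be one more component of
the partition key; it does not itself say which node owns the partition.

```python
import zlib

BUCKET_BITS = 3  # assumption: keep the lowest 3 bits, giving 2**3 = 8 buckets


def bucket_for(*key_components):
    """Derive a bucket from the other key components, deterministically.

    Python's built-in hash() is salted per process for strings, so a
    stable hash (CRC32 here, chosen only for illustration) is used.
    """
    data = "\x1f".join(str(c) for c in key_components).encode("utf-8")
    h = zlib.crc32(data)
    # Keep only the lowest BUCKET_BITS bits of the hash.
    return h & ((1 << BUCKET_BITS) - 1)
```

The same key components always map to the same bucket, so a reader that
knows the other components can go straight to one partition instead of
merging all of them.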


On Tue, Aug 12, 2014 at 7:39 PM, Tyler Hobbs <tyler@datastax.com> wrote:

>
> On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <ianrose@fullstory.com> wrote:
>
>>
>> "You're better off creating a manual reverse-index to track modification
>> date, something like this"  --> I had considered an approach like this but
>> my concern is that for any given minute *all* of the updates will be
>> handled by a single node, right?  For example, if the minute_bucket is 2739
>> then for that one minute, every single item update will flow to the node at
>> HASH(2739).  Assuming I am thinking about that right, that seemed like a
>> potential scaling bottleneck, which scared me off that approach.
>>
>
> If you're concerned about bottlenecking on one node (or set of replicas)
> during the minute, add an additional integer column to the partition key
> (making it a composite partition key if it isn't already).  When inserting,
> randomly pick a value between, say, 0 and 10 to use for this column.  When
> reading, read all 10 partitions and merge them.  (Alternatively, instead of
> using a random number, you could hash the other key components and use the
> lowest bits for the value.  This has the advantage of being deterministic.)
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
