incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tyler Hobbs <ty...@datastax.com>
Subject Re: clarification on 100k tombstone limit in indexes
Date Tue, 12 Aug 2014 17:39:31 GMT
On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <ianrose@fullstory.com> wrote:

>
> "You better off create a manuel reverse-index to track modification date,
> something like this"  --> I had considered an approach like this but my
> concern is that for any given minute *all* of the updates will be handled
> by a single node, right?  For example, if the minute_bucket is 2739 then
> for that one minute, every single item update will flow to the node at
> HASH(2739).  Assuming I am thinking about that right, that seemed like a
> potential scaling bottleneck, which scared me off that approach.
>

If you're concerned about bottlenecking on one node (or set of replicas)
during the minute, add an additional integer column to the partition key
(making it a composite partition key if it isn't already).  When inserting,
randomly pick a value between, say, 0 and 10 to use for this column.  When
reading, read all 10 partitions and merge them.  (Alternatively, instead of
using a random number, you could hash the other key components and use the
lowest bits for the value.  This has the advantage of being deterministic.)


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Mime
View raw message