incubator-cassandra-user mailing list archives

From Tyler Hobbs <>
Subject Re: clarification on 100k tombstone limit in indexes
Date Tue, 12 Aug 2014 17:39:31 GMT
On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <> wrote:

> "You're better off creating a manual reverse-index to track modification date,
> something like this"  --> I had considered an approach like this but my
> concern is that for any given minute *all* of the updates will be handled
> by a single node, right?  For example, if the minute_bucket is 2739 then
> for that one minute, every single item update will flow to the node at
> HASH(2739).  Assuming I am thinking about that right, that seemed like a
> potential scaling bottleneck, which scared me off that approach.

If you're concerned about bottlenecking on one node (or set of replicas)
during the minute, add an additional integer column to the partition key
(making it a composite partition key if it isn't already).  When inserting,
randomly pick a value between, say, 0 and 9 to use for this column.  When
reading, read all 10 partitions and merge them.  (Alternatively, instead of
using a random number, you could hash the other key components and use the
lowest bits for the value.  This has the advantage of being deterministic.)
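
A rough sketch of the idea in Python, with no driver calls, just the key
handling.  The names (NUM_SHARDS, minute_bucket, item_id) and the row shape
(timestamp, item_id) are made up for illustration, and CRC32 modulo the shard
count stands in for "hash and take the lowest bits" so that the shard count
does not have to be a power of two.

```python
import heapq
import random
import zlib

NUM_SHARDS = 10  # assumed shard count; shard values are 0..9


def random_shard():
    """Pick a shard at random at insert time (spreads writes evenly)."""
    return random.randrange(NUM_SHARDS)


def deterministic_shard(item_id):
    """Derive the shard from the other key components instead.

    CRC32 is used because it is stable across processes, unlike Python's
    built-in hash() for strings.
    """
    return zlib.crc32(item_id.encode("utf-8")) % NUM_SHARDS


def partition_keys(minute_bucket):
    """All composite partition keys that must be read for one minute."""
    return [(minute_bucket, shard) for shard in range(NUM_SHARDS)]


def merge_shard_results(per_shard_rows):
    """Merge per-partition results, each already sorted, into one sorted list."""
    return list(heapq.merge(*per_shard_rows))
```

On the write path you would use (minute_bucket, random_shard()) or
(minute_bucket, deterministic_shard(item_id)) as the partition key.  On the
read path you would issue one query per key from partition_keys(minute_bucket)
and pass the result lists to merge_shard_results().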

Tyler Hobbs
DataStax <>
