incubator-cassandra-user mailing list archives

From Mick Semb Wever
Subject Re: OOM opening bloom filter
Date Sun, 11 Mar 2012 23:44:00 GMT
On Sun, 2012-03-11 at 15:36 -0700, Peter Schuller wrote:
> Are you doing RF=1? 

That is correct. So are your calculations, then :-)

> > very small, <1k. Data from this cf is only read via hadoop jobs in batch
> > reads of 16k rows at a time.
> [snip]
> > It's my understanding then for this use case that bloom filters are of
> > little importance and that i can
> Depends. I'm not familiar enough with how the hadoop integration works
> so someone else will have to comment, but if your hadoop jobs are just
> performing normal reads of keys via thrift and the keys they are
> grabbing are not in token order, those reads would be effectively
> random and bloom filters should still be highly relevant to the amount
> of I/O operations you need to perform. 

They are thrift get_range_slice reads of 16k rows per request.
Hadoop reads are based on tokens, but in my use case the keys are also
ordered and this cluster is using BOP.
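
To illustrate the distinction being discussed, here is a minimal sketch in Python. It is a generic toy, not Cassandra's implementation: the BloomFilter class, the range_scan helper, the key format and the sizing numbers are all made up for the example. It shows that a Bloom filter can only answer "might this exact key be present?", which helps random point lookups, while a scan over ordered keys can be served from the sort order alone and never asks the filter anything. Whether Cassandra's range-slice path behaves the same way is an assumption here, not something this thread confirms.

```python
import bisect
import hashlib


class BloomFilter:
    """Toy Bloom filter for illustration only (hypothetical, not Cassandra's)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive the bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def range_scan(sorted_keys, start, count):
    """Return up to `count` keys >= start using only the sorted order."""
    i = bisect.bisect_left(sorted_keys, start)
    return sorted_keys[i:i + count]


# Hypothetical ordered row keys, as an order-preserving partitioner would keep them.
keys = sorted(b"row%06d" % n for n in range(1000))

bf = BloomFilter(num_bits=16 * len(keys), num_hashes=5)
for k in keys:
    bf.add(k)

# Point lookup: the filter is consulted for one exact key.
point_hit = bf.might_contain(b"row000042")

# Batch read in key order: the filter is not involved at all.
batch = range_scan(keys, b"row000100", 3)
```

The memory cost of the filter grows with the number of keys (here 16 bits per key, an arbitrary choice), which is the trade-off at stake when the filter is rarely or never consulted.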


"Living on Earth is expensive, but it does include a free trip around
the sun every year." Unknown 
