cassandra-user mailing list archives

From: Jonathan Ellis
Subject: Re: MapReduce, Timeouts and Range Batch Size
Date: Fri, 23 Apr 2010 03:09:00 GMT

That would be an easy win, sure.

On Thu, Apr 22, 2010 at 9:27 PM, Joost Ouwerkerk wrote:
> I was getting client timeouts in ColumnFamilyRecordReader.maybeInit() when
> MapReducing.  So I've reduced the Range Batch Size to 256 (from 4096) and
> this seems to have fixed my problem, although it has slowed things down a
> bit -- presumably because there are 16x more calls to get_range_slices.
> While I was in that code I noticed that a new client was being created for
> each batch get.  By decreasing the batch size, I've increased this
> overhead.  I'm thinking of re-writing ColumnFamilyRecordReader to do some
> connection pooling.  Anyone have any thoughts on that?
> joost.
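
To illustrate the overhead discussed above, here is a minimal sketch of reading every batch over one client rather than opening a new client per batch. It is simpler than a real connection pool and is not the actual ColumnFamilyRecordReader code: RangeClient, CountingClient, getRangeSlices(int, int) and readAll are hypothetical names invented for this example, and the "client" is an in-memory fake that only counts how often it is constructed.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusedClientSketch {
    /** Hypothetical stand-in for a Thrift client; NOT the real Cassandra API. */
    interface RangeClient extends AutoCloseable {
        List<String> getRangeSlices(int start, int count);
        @Override void close();
    }

    /** In-memory fake that counts how many "connections" were opened. */
    static final class CountingClient implements RangeClient {
        static int opened = 0;
        private final int totalRows;

        CountingClient(int totalRows) {
            this.totalRows = totalRows;
            opened++;
        }

        @Override
        public List<String> getRangeSlices(int start, int count) {
            List<String> rows = new ArrayList<>();
            int end = Math.min(start + count, totalRows);
            for (int i = start; i < end; i++) {
                rows.add("row" + i);
            }
            return rows;
        }

        @Override
        public void close() {
            // A real client would close its socket here.
        }
    }

    /** Reads all rows in batches, reusing a single client for every batch. */
    static int readAll(int totalRows, int batchSize) {
        int read = 0;
        try (RangeClient client = new CountingClient(totalRows)) {
            while (true) {
                List<String> batch = client.getRangeSlices(read, batchSize);
                read += batch.size();
                if (batch.size() < batchSize) {
                    break; // a short batch means the range is exhausted
                }
            }
        }
        return read;
    }

    public static void main(String[] args) {
        int rows = readAll(4096, 256);
        System.out.println(rows + " rows, " + CountingClient.opened + " connection(s)");
    }
}
```

With this shape, shrinking the batch size increases the number of range calls but not the number of connections opened, which is the cost the quoted message is concerned about.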
