cassandra-user mailing list archives

From aaron morton <>
Subject Re: Cassandra hadoop Thrift Time out
Date Sat, 25 Sep 2010 01:09:20 GMT
There is some information on the wiki about
a resource leak in versions before 0.6.2 that can result in a TimeoutException. But you're on
0.6.5, so you should be OK.

I had a quick look at the Hadoop code and could not see where to change the timeout (that
would be the obvious thing to try). If you have a look in the code, though, it has a setting
for the batch size:

    /**
     * The number of rows to request with each get range slices request.
     * Too big and you can either get timeouts when it takes Cassandra too
     * long to fetch all the data. Too small and the performance
     * will be eaten up by the overhead of each request.
     *
     * @param conf      Job configuration you are about to run
     * @param batchsize Number of rows to request each time
     */
    public static void setRangeBatchSize(Configuration conf, int batchsize)
    {
        conf.setInt(RANGE_BATCH_SIZE_CONFIG, batchsize);
    }

The config item name is "cassandra.range.batch.size".

Try reducing the batch size first and see if the timeouts go away. Though it does not sound
like you have a lot of data.  
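
As a rough illustration of the tradeoff described in that comment, here is a minimal sketch
(not taken from the Cassandra code) that estimates how many get range slices round trips a
given batch size implies. The class name, method name, and the row count are hypothetical,
and it assumes every request returns a full batch except possibly the last one.

```java
// Hypothetical helper for illustration only; not part of Cassandra or Hadoop.
class BatchSizeEstimate {

    /**
     * Estimated number of round trips needed to read totalRows rows when
     * each request asks for at most batchSize rows.
     * Assumption: every request returns a full batch except possibly the last.
     */
    static long requestsNeeded(long totalRows, int batchSize) {
        if (totalRows < 0 || batchSize <= 0) {
            throw new IllegalArgumentException(
                "totalRows must be >= 0 and batchSize must be > 0");
        }
        // Ceiling division without floating point.
        return (totalRows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long totalRows = 100_000L; // made-up figure, purely for illustration
        for (int batchSize : new int[] {4096, 1024, 256, 64}) {
            System.out.printf("batch size %d -> about %d requests%n",
                batchSize, requestsNeeded(totalRows, batchSize));
        }
    }
}
```

Smaller batches mean less work per request (so less chance of a timeout) but more requests
overall. Whatever value you settle on would then be passed to setRangeBatchSize, or set under
the "cassandra.range.batch.size" key, before the job is submitted.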

A 0.7 beta2 may be out this week, but it's still a beta.

Hope that helps. 

On 25 Sep 2010, at 07:17, Saket Joshi wrote:

> Hi Experts,
> I need help on an exception integrating cassandra-hadoop. I am getting the following
> exception when running a Hadoop MapReduce job.
> I am using Cassandra 0.6.5 on a 3-node cluster. I don’t get any exception when the data
> I am processing is very small (< 5 rows and 100 columns), but I get the error with modest
> data (> 5 rows and 500 columns). I went through some of the forums where people have experienced
> the same issue.
> Is this a bug in the Cassandra-hadoop classes, and is it fixed in 0.7 for sure? How stable
> is the 0.7 beta? In the system.log I see a lot of "index has reached its threshold; switching
> in a fresh Memtable" messages.
> Has anyone faced a similar issue and solved it? Is migrating to 0.7 the only solution?
> Thanks,
> Saket
> Stack Trace of the Exception:
> java.lang.RuntimeException: TimedOutException()
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(
>         at
>         at
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(
>         at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(
>         at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(
>         at
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(
>         at
>         at org.apache.hadoop.mapred.Child.main(
> Caused by: TimedOutException()
>         at org.apache.cassandra.thrift.Cassandra$
>         at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(
>         at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(
