cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Timeout Errors while running Hadoop over Cassandra
Date Wed, 12 Jan 2011 22:08:36 GMT
What's happening in the Cassandra server logs when you get these errors?

Reading through the Hadoop code in 0.6.6, it looks like it creates a Thrift client with an infinite
timeout, so the client side should not be the one timing out. It may instead be an internode
timeout, which is set in storage-conf.xml.
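
For reference, a sketch of where that setting lives, assuming the stock Cassandra 0.6
storage-conf.xml layout, where the internode timeout is the RpcTimeoutInMillis element. The
30000 below is only an illustrative value, not a recommendation, and the rest of the file is
left out. The change would need to be made on every node, and a restart is typically needed
for it to take effect.

```xml
<Storage>
  <!-- other settings unchanged -->

  <!-- Time to wait for a reply from other nodes before failing the command.
       Illustrative value only; tune it against what the server logs show. -->
  <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>

  <!-- other settings unchanged -->
</Storage>
```

Raising the timeout only hides slow range scans; if the server logs show long GC pauses or
heavy compaction during the job, those are worth looking at first.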

Aaron


On 13 Jan, 2011, at 07:40 AM, Jairam Chandar <jairam.chandar@imagini.net> wrote:

Hi folks,

We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2)
jobs over this cluster in order to generate reports.
I modified the word_count example in the contrib folder of the Cassandra distribution. While
the program runs fine for small datasets (on the order of 100-200 MB) on small clusters
(2 machines), it starts to give errors when run on a bigger cluster (5 machines)
with a much larger dataset (400 GB). Here is the error that we get:

java.lang.RuntimeException: TimedOutException()
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
	... 11 more



I came across this page on the Cassandra wiki - http://wiki.apache.org/cassandra/HadoopSupport -
and tried modifying the ulimit and changing batch sizes. These did not help: though the number
of successful map tasks increased, the job eventually fails since the total number of map tasks
is huge.

Any idea what could be causing this? The program we are running is a very slight modification
of the word_count example with respect to reading from Cassandra. The only changes are the specific
keyspace, column family and columns. The rest of the code for reading is the same as the word_count
example in the source code for Cassandra 0.6.6.

Thanks and regards,
Jairam Chandar