cassandra-user mailing list archives

From Brian Jeltema <>
Subject cassandra/hadoop BulkOutputFormat failures
Date Fri, 14 Sep 2012 18:34:20 GMT
I'm trying to do a bulk load from a Cassandra/Hadoop job using the BulkOutputFormat class.
It appears that the reducers are generating the SSTables but are failing to load them into
the cluster:

12/09/14 14:08:13 INFO mapred.JobClient: Task Id : attempt_201208201337_0184_r_000004_0, Status : FAILED
Too many hosts failed: [/, /, /, /, /, /]
        at org.apache.cassandra.hadoop.BulkRecordWriter.close(
        at org.apache.cassandra.hadoop.BulkRecordWriter.close(
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(
        at org.apache.hadoop.mapred.Child$ 
        at Method)
        at org.apache.hadoop.mapred.Child.main(  

A brief look at the BulkOutputFormat class shows that it depends on SSTableLoader. My Hadoop
cluster and my Cassandra cluster are co-located on the same set of machines. I haven't found any
stated restriction, but does this technique only work if the Hadoop cluster is distinct from the
Cassandra cluster? Any suggestions on how to get past this problem?

Thanks in advance.
