incubator-cassandra-user mailing list archives

From Brian Jeltema <brian.jelt...@digitalenvoy.net>
Subject Re: cassandra/hadoop BulkOutputFormat failures
Date Mon, 17 Sep 2012 10:53:58 GMT
As suggested, it was a version-skew problem. 

Thanks.

Brian
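
For anyone who finds this thread later: a quick way to rule out this kind of skew is to compare the Cassandra version bundled with the Hadoop job against the version the cluster nodes run. The sketch below is only an illustration. The class name, the helper names, and the sample version strings are hypothetical, and treating a matching major.minor as "compatible" is an assumption of the sketch, not a rule stated anywhere in this thread.

```java
public class VersionSkewCheck {

    /** Returns the "major.minor" prefix of a version string such as "1.1.5". */
    public static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least major.minor: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** True when both version strings share the same major.minor prefix. */
    public static boolean sameMajorMinor(String a, String b) {
        return majorMinor(a).equals(majorMinor(b));
    }

    public static void main(String[] args) {
        // Hypothetical values: replace with the version of the Cassandra jar
        // shipped with the job and the version reported by the cluster nodes.
        String jobJarVersion = args.length > 0 ? args[0] : "1.1.2";
        String clusterVersion = args.length > 1 ? args[1] : "1.0.10";

        if (sameMajorMinor(jobJarVersion, clusterVersion)) {
            System.out.println("Versions look aligned: " + jobJarVersion + " / " + clusterVersion);
        } else {
            System.out.println("Possible version skew: job jar " + jobJarVersion
                    + " vs cluster " + clusterVersion);
        }
    }
}
```

Running it with the two version strings as arguments prints whether they share a major.minor prefix; a mismatch is only a hint to look closer at the jars on the job's classpath.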

On Sep 14, 2012, at 11:34 PM, Jeremy Hanna wrote:

> A couple of guesses:
> - are you mixing versions of Cassandra?  Streaming differences between versions
>   might throw this error.  That is, are you bulk loading with one version of
>   Cassandra into a cluster that's a different version?
> - (shot in the dark) is your cluster overwhelmed for some reason?
> 
> If the temp dir hasn't been cleaned up yet, you are able to retry, fwiw.
> 
> Jeremy
> 
> On Sep 14, 2012, at 1:34 PM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:
> 
>> I'm trying to do a bulk load from a Cassandra/Hadoop job using the BulkOutputFormat class.
>> It appears that the reducers are generating the SSTables but are failing to load them
>> into the cluster:
>> 
>> 12/09/14 14:08:13 INFO mapred.JobClient: Task Id : attempt_201208201337_0184_r_000004_0, Status : FAILED
>> java.io.IOException: Too many hosts failed: [/10.4.0.6, /10.4.0.5, /10.4.0.2, /10.4.0.1, /10.4.0.3, /10.4.0.4]
>>       at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:242)
>>       at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:207)
>>       at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:579)
>>       at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
>>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)   
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:249)  
>> 
>> A brief look at the BulkOutputFormat class shows that it depends on SSTableLoader.
>> My Hadoop cluster and my Cassandra cluster are co-located on the same set of machines.
>> I haven't found any stated restrictions, but does this technique only work if the
>> Hadoop cluster is distinct from the Cassandra cluster? Any suggestions on how to get
>> past this problem?
>> 
>> Thanks in advance.
>> 
>> Brian
> 
> 

