incubator-crunch-dev mailing list archives

From Rahul <rsha...@xebia.com>
Subject SeqFileReaderFactory give exception
Date Mon, 09 Jul 2012 08:55:25 GMT
Guys,

I have a SequenceFile with LongWritable keys and Text values. Reading it
through SequenceFileSource works with MRPipeline, but with MemPipeline it
fails with the following exception:

3503 [main] INFO  com.cloudera.crunch.io.seq.SeqFileReaderFactory  - Error reading from path: file:/home/rahul/software/crunch/sampleFile
java.io.IOException: wrong key class: org.apache.hadoop.io.ObjectWritable is not class org.apache.hadoop.io.LongWritable
     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1895)
     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
     at com.cloudera.crunch.io.seq.SeqFileReaderFactory$1.hasNext(SeqFileReaderFactory.java:68)
     at com.cloudera.crunch.io.CompositePathIterable$2.hasNext(CompositePathIterable.java:81)

This happens because the file contains LongWritable keys, but
SeqFileReaderFactory uses a NullWritable to read them. The error occurs
only in MemPipeline. MRPipeline works because there the key class is
supplied through Hadoop's MapContext and is therefore the correct one. I
modified SeqFileReaderFactory to pass the key class as well, but is this
the correct way to do it?
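
A minimal sketch of the idea behind that change, written in plain Java rather than against Hadoop: the reader compares the supplied key object with the key class recorded in the file header, so the key instance is created from that recorded class instead of being hard-coded. `ToyReader`, `LongKey`, `NullKey` and `newKeyFor` are hypothetical names used only for illustration; they are not Crunch or Hadoop classes. With Hadoop itself, the corresponding calls would presumably be `SequenceFile.Reader.getKeyClass()` and `ReflectionUtils.newInstance(...)`.

```java
import java.io.IOException;

final class KeyClassSketch {

    // Hypothetical stand-ins for two different key types.
    public static final class LongKey { public LongKey() {} }
    public static final class NullKey { public NullKey() {} }

    // Hypothetical stand-in for a sequence file reader: it remembers the key
    // class recorded in the file header and rejects any other key type.
    public static final class ToyReader {
        private final Class<?> keyClass;

        public ToyReader(Class<?> keyClass) {
            this.keyClass = keyClass;
        }

        public Class<?> getKeyClass() {
            return keyClass;
        }

        public boolean next(Object key) throws IOException {
            if (key.getClass() != keyClass) {
                throw new IOException("wrong key class: "
                        + key.getClass().getName() + " is not " + keyClass);
            }
            return true;
        }
    }

    // Build the key holder from the class the reader reports, so the caller
    // never has to guess the key type.
    public static Object newKeyFor(ToyReader reader)
            throws ReflectiveOperationException {
        return reader.getKeyClass().getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        ToyReader reader = new ToyReader(LongKey.class);
        try {
            reader.next(new NullKey()); // hard-coded key of the wrong type
        } catch (IOException e) {
            System.out.println("hard-coded key rejected: " + e.getMessage());
        }
        Object key = newKeyFor(reader); // key derived from the header class
        System.out.println("header-derived key accepted: " + reader.next(key));
    }
}
```

Deriving the key from the reader keeps the in-memory path independent of whatever key type the file happens to contain, at the cost of requiring a no-argument constructor on the key class.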

regards
Rahul
