incubator-crunch-dev mailing list archives

From Josh Wills <josh.wi...@gmail.com>
Subject Re: SeqFileReaderFactory give exception
Date Mon, 09 Jul 2012 14:25:15 GMT
SequenceFileTableSource will let you read the file as a PTable,
which is probably the quickest way to get what you want.

On Mon, Jul 9, 2012 at 1:55 AM, Rahul <rsharma@xebia.com> wrote:
> Guys,
>
> I have a SequenceFile with LongWritable keys and Text values. I am using
> SequenceFileSource with MRPipeline. But when I use MemPipeline, it gives
> the following exception.
>
> 3503 [main] INFO  com.cloudera.crunch.io.seq.SeqFileReaderFactory  - Error reading from path: file:/home/rahul/software/crunch/sampleFile
> java.io.IOException: wrong key class: org.apache.hadoop.io.ObjectWritable is not class org.apache.hadoop.io.LongWritable
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1895)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
>     at com.cloudera.crunch.io.seq.SeqFileReaderFactory$1.hasNext(SeqFileReaderFactory.java:68)
>     at com.cloudera.crunch.io.CompositePathIterable$2.hasNext(CompositePathIterable.java:81)
>
> This happens because the file contains LongWritable keys but a NullWritable
> is used to read them. The error occurs only in MemPipeline; it works in
> MRPipeline because the key class is passed there through Hadoop's MapContext
> and is therefore the correct one. I modified SeqFileReaderFactory to pass the
> key class as well, but is this the correct way of doing it?
>
> regards
> Rahul
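
The exception quoted above has the shape of a class-equality check: the reader compares the class of the key object it is handed with the key class recorded in the file header. The following is a minimal, Hadoop-free sketch of that check and of one way to avoid the mismatch, namely creating the key object from the class the reader itself reports. Every name in it (KeyClassCheckSketch, SketchReader, LongKey, PlaceholderKey, newKeyFor) is hypothetical and exists only for illustration; it is not the SeqFileReaderFactory code. With real Hadoop, the analogous calls would presumably be SequenceFile.Reader.getKeyClass() together with ReflectionUtils.newInstance.

```java
import java.io.IOException;

// Simplified model of a sequence-file key-class check.
// All class and method names here are hypothetical stand-ins, not Hadoop or Crunch APIs.
class KeyClassCheckSketch {

    // Stand-ins for two different key types.
    static class LongKey { }
    static class PlaceholderKey { }

    // Models a reader that learned the key class from the file header.
    static class SketchReader {
        private final Class<?> keyClassFromHeader;

        SketchReader(Class<?> keyClassFromHeader) {
            this.keyClassFromHeader = keyClassFromHeader;
        }

        Class<?> getKeyClass() {
            return keyClassFromHeader;
        }

        // Rejects a key object whose class differs from the header's key class.
        boolean next(Object key) throws IOException {
            if (key.getClass() != keyClassFromHeader) {
                throw new IOException("wrong key class: " + key.getClass().getName()
                        + " is not " + keyClassFromHeader);
            }
            return true; // a real reader would deserialize the next record into key
        }
    }

    // Creates the key object from the class the reader reports, so the check always passes.
    static Object newKeyFor(SketchReader reader) throws ReflectiveOperationException {
        return reader.getKeyClass().getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        SketchReader reader = new SketchReader(LongKey.class);
        try {
            reader.next(new PlaceholderKey()); // mismatched key type
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        // Key built from the header's class is accepted.
        System.out.println(reader.next(newKeyFor(reader)));
    }
}
```

Under this model the caller never has to know the key type in advance, which is one alternative to passing the key class into the factory explicitly.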
