hadoop-mapreduce-user mailing list archives

Subject LzopCodec and SequenceFile?
Date Fri, 15 Jun 2012 06:04:26 GMT

I have a sequence of MR jobs that use SequenceFile as both their input and output format.
If I run them without any compression enabled, they work fine. If I use the LzoCodec, they also
work fine (but then the output is not lzop-compatible, which is inconvenient).

If I try using the LzopCodec, then the first MR job (which reads from a TextFile and outputs
to a SequenceFile) runs OK, but when the second job tries to read what the first job wrote,
I get the following exception:

java.io.EOFException: Premature EOF from inputStream
        at com.hadoop.compression.lzo.LzopInputStream.readFully(LzopInputStream.java:75)
        at com.hadoop.compression.lzo.LzopInputStream.readHeader(LzopInputStream.java:114)
        at com.hadoop.compression.lzo.LzopInputStream.<init>(LzopInputStream.java:54)
        at com.hadoop.compression.lzo.LzopCodec.createInputStream(LzopCodec.java:83)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1493)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1480)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:451)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.ha

Does anyone know why this could be happening? I'm using the latest Cloudera CDH3 distribution,
and I'm configuring the compression through the mapred.output.compression.codec property in
the mapred-site.xml file.
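
For reference, the configuration described above would look roughly like the fragment below. This is only a sketch, not a copy of the actual file: mapred.output.compression.codec is the one property named in this message, and the codec class name is taken from the stack trace. The mapred.output.compress entry is an assumption about what else is set to turn output compression on.

```xml
<!-- mapred-site.xml (fragment, illustrative only) -->

<!-- Assumption: output compression is enabled; this property is not shown in the message. -->
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>

<!-- The property named in the message; class name as it appears in the stack trace. -->
<property>
  <name>mapred.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzopCodec</value>
</property>
```

Swapping the value to com.hadoop.compression.lzo.LzoCodec would presumably give the LzoCodec setup mentioned earlier, the one that works but is not lzop-compatible (assuming LzoCodec lives in the same package).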

