hadoop-common-user mailing list archives

From ed <hadoopn...@gmail.com>
Subject LZO Compression Libraries don't appear to work properly with MultipleOutputs
Date Thu, 21 Oct 2010 21:52:26 GMT
Hello everyone,

I am having problems using MultipleOutputs with LZO compression (could be a
bug or something wrong in my own code).

In my driver I set

     MultipleOutputs.addNamedOutput(job, "test", TextOutputFormat.class,
             NullWritable.class, Text.class);

In my reducer I have:

     MultipleOutputs<NullWritable, Text> mOutput =
             new MultipleOutputs<NullWritable, Text>(context);

     public String generateFileName(Key key){
        return "custom_file_name";
     }

Then in the reduce() method I have:

     mOutput.write(mNullWritable, mValue, generateFileName(key));

This produces LZO files that do not decompress properly (lzop -d fails with
the error "lzop: unexpected end of file: outputFile.lzo").

If I switch back to the regular context.write(mNullWritable, mValue),
everything works fine.
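
An "unexpected end of file" error on decompression is the typical sign of a compressed stream whose writer was never closed, so the trailing data and end-of-stream marker never reach the file. The sketch below illustrates that effect only. It is an assumption about the cause, not a confirmed diagnosis, and it uses the JDK's gzip streams as a stand-in for the LZO codec (the class and method names are made up for the illustration). In the reducer, the equivalent step would be calling mOutput.close() from cleanup(Context), since the new-API MultipleOutputs has a close() method that closes its record writers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; gzip stands in for LZO (assumption for illustration).
class TruncatedStreamDemo {

    // Compresses text with gzip. When close is false the stream is abandoned
    // without being finished, mimicking a record writer that is never closed.
    static byte[] compress(String text, boolean close) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink);
        gzip.write(text.getBytes(StandardCharsets.UTF_8));
        if (close) {
            gzip.close(); // writes remaining compressed data and the gzip trailer
        }
        return sink.toByteArray();
    }

    // Returns true if the bytes can be read to the end as a gzip stream.
    static boolean canDecompress(byte[] data) {
        byte[] buf = new byte[4096];
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            while (in.read(buf) != -1) {
                // discard the decompressed bytes; only completeness matters here
            }
            return true;
        } catch (IOException e) {
            return false; // e.g. EOFException on a truncated stream
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] closed = compress("some reducer output\n", true);
        byte[] unclosed = compress("some reducer output\n", false);
        System.out.println("closed stream decompresses:   " + canDecompress(closed));
        System.out.println("unclosed stream decompresses: " + canDecompress(unclosed));
    }
}
```

If the same thing is happening in the job, the plain context.write() path would work because the framework closes that writer itself, while writers opened through MultipleOutputs stay open until close() is called explicitly.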

Am I forgetting a step needed when using MultipleOutputs, or is this a
bug/non-feature of using LZO compression in Hadoop?

Thank you!


~Ed
