Hi Tim,

You can try the following:

1. Create a HiveLzoTextOutputFormat that extends HiveIgnoreKeyTextOutputFormat<K, V>; the only thing it does is turn on compressed output on the JobConf:

  // Sketch of the subclass (imports omitted for brevity)
  public class HiveLzoTextOutputFormat<K extends WritableComparable, V extends Writable>
      extends HiveIgnoreKeyTextOutputFormat<K, V> {

    @Override
    public RecordWriter getHiveRecordWriter(JobConf jc, Path outPath,
        Class<? extends Writable> valueClass, boolean isCompressed,
        Properties tableProperties, Progressable progress) throws IOException {
      // Force compressed output regardless of the isCompressed flag passed in.
      FileOutputFormat.setCompressOutput(jc, true);
      return super.getHiveRecordWriter(jc, outPath, valueClass, true, tableProperties, progress);
    }

    @Override
    public org.apache.hadoop.mapred.RecordWriter<K, V> getRecordWriter(
        FileSystem ignored, JobConf jc, String name, Progressable progress) throws IOException {
      FileOutputFormat.setCompressOutput(jc, true);
      return super.getRecordWriter(ignored, jc, name, progress);
    }
  }

2. When you create the table (see the DDL sketch below):

    set the InputFormat class to DeprecatedLzoTextInputFormat
    set the OutputFormat class to the new HiveLzoTextOutputFormat
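
For example, a rough DDL sketch (the table name, columns, and the com.example.hive package are placeholders for whatever you actually use; DeprecatedLzoTextInputFormat ships with the hadoop-lzo jar):

    CREATE TABLE my_lzo_table (id INT, msg STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS
      INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
      OUTPUTFORMAT 'com.example.hive.HiveLzoTextOutputFormat';

Both the custom output format and the LZO classes need to be on Hive's classpath (ADD JAR or hive.aux.jars.path).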

Hope this helps.
Feng


On Wed, Jan 30, 2013 at 1:36 PM, Timothy Potter <thelabdude@gmail.com> wrote:
Been struggling with this one for a bit ... LZO compression is enabled
by default for my Hadoop cluster. If I forget to turn off compression
from my Pig scripts that create data in Hive using HCatalog, then the
partitions get created but I can't read the data back. I don't have
the error handy but it looks like the read-side doesn't treat the data
as compressed.

So I've resorted to adding the following to my scripts:

SET mapreduce.output.compress false;
SET mapred.output.compress false;
SET output.compression.enabled false;

One of those seems to do the trick ;-)

I'd really like to store my Hive data compressed but haven't figured
out how to enable this with HCatalog. Seems like it's either not
supported yet or I'm missing something simple in my HQL DDL table
declaration.

Cheers,
Tim