hbase-dev mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2225) Enable compression in HBase Export
Date Wed, 17 Feb 2010 08:10:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834680#action_12834680 ]

Lars George commented on HBASE-2225:

Agreed. I think what Ted meant (and Andrew also touched on) is that if a table has compression
enabled, then it would make sense to use it for backups too, so that small tables, for example,
are stored as is. Ted, since the backup reads the KeyValue records, compression is no longer part
of the equation, i.e. the MapReduce job doing the backup does not know whether the table
was compressed or not. I'll implement the command line switch and attach a patch today.
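
For illustration only, here is a minimal sketch of how such a command line switch might map onto job settings. It is not the attached patch: the switch name `--compress`, the class name and the method name are all hypothetical. The two property names are the ones the Hadoop 0.20-era FileOutputFormat setters are understood to write, which is an assumption worth checking against the Hadoop version in use. The sketch uses only the Java standard library so it runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: turns an optional command line switch into the
// output-compression settings an export job could apply.
public class ExportCompressionSketch {

    // Hypothetical switch name, chosen for this example only.
    static final String COMPRESS_SWITCH = "--compress";

    // Codec named in the issue description.
    static final String GZIP_CODEC = "org.apache.hadoop.io.compress.GzipCodec";

    // Returns the settings implied by the arguments. Compression stays off
    // unless the switch is present, so existing invocations keep their
    // current behaviour.
    static Map<String, String> compressionSettings(String[] args) {
        boolean compress = false;
        for (String arg : args) {
            if (COMPRESS_SWITCH.equals(arg)) {
                compress = true;
            }
        }
        Map<String, String> settings = new LinkedHashMap<>();
        // Assumed property names (Hadoop 0.20-era).
        settings.put("mapred.output.compress", Boolean.toString(compress));
        if (compress) {
            settings.put("mapred.output.compression.codec", GZIP_CODEC);
        }
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(compressionSettings(args));
    }
}
```

In a real job, the same decision would guard the two FileOutputFormat calls quoted in the issue description, so that compression is applied only when the user asks for it.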

> Enable compression in HBase Export
> ----------------------------------
>                 Key: HBASE-2225
>                 URL: https://issues.apache.org/jira/browse/HBASE-2225
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.20.1
>         Environment: OS agnostic
>            Reporter: Ted Yu
>            Priority: Minor
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
> org.apache.hadoop.hbase.mapreduce.Export should set compression codec
> In createSubmittableJob(), the following should be added:
>     FileOutputFormat.setCompressOutput(job, true);
>     FileOutputFormat.setOutputCompressorClass(job, org.apache.hadoop.io.compress.GzipCodec.class);
> From my experiment, 10% to 50% reduction in Export output has been observed.
> SequenceFileInputFormat used by the Import tool is able to detect GzipCodec - there is
> no change for Import class.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
