hbase-issues mailing list archives

From "Ashish Shinde (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-3474) HFileOutputFormat to use column family's compression algorithm
Date Fri, 18 Mar 2011 05:30:30 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Shinde updated HBASE-3474:

    Attachment: patch3474.txt

Added unit tests for:
1. serializing the compression settings to the job config and deserializing them back
2. verifying that the HFile writers use the config correctly.
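For readers following along, the round trip covered by the first test can be sketched roughly as below. This is a minimal illustration and not the code in patch3474.txt: the class name FamilyCompressionCodec, the "family=algo&family=algo" layout, and the plain-string algorithm names are all assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: packs a family -> compression-name map into one
// string that could be stored under a single job configuration key.
final class FamilyCompressionCodec {
    private FamilyCompressionCodec() {}

    // Encodes the map as "family=algo&family=algo", URL-encoding both parts
    // so that '&' or '=' inside a family name cannot break the format.
    static String serialize(Map<String, String> familyToAlgo) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : familyToAlgo.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // Reverses serialize(); a null or empty value yields an empty map.
    static Map<String, String> deserialize(String value) {
        Map<String, String> out = new LinkedHashMap<>();
        if (value == null || value.isEmpty()) {
            return out;
        }
        for (String pair : value.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            out.put(decode(kv[0]), decode(kv[1]));
        }
        return out;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

A test in this style would build a small map, call serialize(), pass the result through deserialize(), and check that the map comes back unchanged.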

Minor issues

1. The test parses the output of HFile.Reader.toString() to retrieve the compression algorithm used
on an HFile. One alternative is to add getCompressionAlgo() to HFile.Reader.
2. testColumnFamilyCompression creates a mapred job just to get hold of a usable writer.
I am not sure this is the best approach.

> HFileOutputFormat to use column family's compression algorithm
> --------------------------------------------------------------
>                 Key: HBASE-3474
>                 URL: https://issues.apache.org/jira/browse/HBASE-3474
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.92.0
>         Environment: All
>            Reporter: Ashish Shinde
>             Fix For: 0.92.0
>         Attachments: patch3474.txt, patch3474.txt, patch3474.txt
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> HFileOutputFormat currently creates HFile writers using the compression algorithm set
> in the configuration property "hfile.compression", which defaults to no compression. The code does
> not take into account the compression algorithm configured for the table's column family.
> As a result, bulk-uploaded tables are not compressed until a major compaction is run on them.
> This could be fixed by using the column family descriptors when creating the HFile writers.
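
The proposed fix amounts to a per-family lookup at writer creation time, falling back to the job-wide default when a family has no setting of its own. A minimal sketch of that selection logic follows. FamilyCompressionChooser and its method are hypothetical names, and the algorithm values are plain placeholder strings rather than HBase's own compression types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical selector: answers "which compression should the writer
// for this column family use?" from a per-family map plus a default.
final class FamilyCompressionChooser {
    private final Map<String, String> perFamily;
    private final String defaultAlgo;

    FamilyCompressionChooser(Map<String, String> perFamily, String defaultAlgo) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.perFamily = new HashMap<>(perFamily);
        this.defaultAlgo = defaultAlgo;
    }

    // Returns the family's own setting if present, else the job-wide default.
    String algorithmFor(String family) {
        String algo = perFamily.get(family);
        return algo != null ? algo : defaultAlgo;
    }
}
```

In this sketch, a writer for a family that has an entry in the map gets that family's algorithm, while any other family falls back to the default, which mirrors the current single-setting behaviour.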
