hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Disk space usage of HFilev1 vs HFilev2
Date Tue, 28 Aug 2012 14:37:16 GMT
On Mon, Aug 27, 2012 at 8:30 PM, anil gupta <anilgupta84@gmail.com> wrote:
> Hi All,
>
> Here are the steps I followed to load the table in HFilev1 format:
> 1. Set the property hfile.format.version to 1.
> 2. Updated the conf across the cluster.
> 3. Restarted the cluster.
> 4. Ran the bulk loader.
>
> Table has 34 million records and one column family.
> Results:
> HDFS space for one replica of table in HFilev2: 39.8 GB
> HDFS space for one replica of table in HFilev1: 38.4 GB
>
> Ironically, as per the above results, HFilev1 is taking 3.5% less space
> than the HFilev2 format. I also skimmed through the code and saw references
> to "hfile.format.version" in the HFile.java class.
>

It would be interesting to know what makes up the 3.5% difference.
More metadata at the end of the file in v2?

St.Ack
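
As a rough check on the numbers quoted above, the sketch below recomputes the relative difference and spreads the extra space over the 34 million records. It assumes "GB" means 10^9 bytes and that the overhead is distributed evenly per record, neither of which the thread confirms; the function names are made up for illustration.

```python
# Figures quoted in the thread.
V2_GB = 39.8            # one replica, HFilev2
V1_GB = 38.4            # one replica, HFilev1
RECORDS = 34_000_000    # "34 million records"

BYTES_PER_GB = 10**9    # assumption: decimal GB, not GiB


def relative_saving_pct(larger_gb: float, smaller_gb: float) -> float:
    """Percentage by which the smaller size undercuts the larger one."""
    return (larger_gb - smaller_gb) / larger_gb * 100


def extra_bytes_per_record(larger_gb: float, smaller_gb: float, records: int) -> float:
    """Extra bytes per record if the difference were spread evenly (an assumption)."""
    return (larger_gb - smaller_gb) * BYTES_PER_GB / records


if __name__ == "__main__":
    pct = relative_saving_pct(V2_GB, V1_GB)
    per_record = extra_bytes_per_record(V2_GB, V1_GB, RECORDS)
    print(f"HFilev1 is {pct:.1f}% smaller than HFilev2")
    print(f"about {per_record:.0f} extra bytes per record in HFilev2")
```

The per-record figure is only a way to size the gap. It does not say where the extra bytes live, for example in the file trailer or index blocks rather than alongside each record.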
