hbase-user mailing list archives

From Jim Kellerman <...@powerset.com>
Subject RE: compression in HBase
Date Thu, 10 Jul 2008 15:09:56 GMT
Record compression means that each record, i.e. exactly one row/family:member/ts, is compressed individually.

Block compression means that blocks in HDFS are compressed. A block may
contain multiple records if they are shorter than one HDFS block, or it may
contain only part of a record if the record is longer than an HDFS block.
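
To make the difference concrete, here is a minimal sketch in Python. It uses
zlib on in-memory data and is not the HBase or Hadoop API. The sample cells,
the function names, and the 1024-byte block size are assumptions made up for
illustration, and the sketch assumes record compression applies to the value
while the key stays uncompressed.

```python
import zlib

# Hypothetical cells: one (row/family:member/ts key, value) pair per record.
records = [
    (f"row{i:04d}/info:status/1215702596".encode(), b"status=active;region=us-west")
    for i in range(100)
]


def record_compress(cells):
    """Compress each value on its own; every record is an independent unit."""
    return [(key, zlib.compress(value)) for key, value in cells]


def block_compress(cells, block_bytes=1024):
    """Cut the serialized stream into fixed-size blocks and compress each block.

    A block may hold several short records, or only part of a long one.
    """
    stream = b"".join(key + value for key, value in cells)
    return [
        zlib.compress(stream[start:start + block_bytes])
        for start in range(0, len(stream), block_bytes)
    ]


raw_total = sum(len(key) + len(value) for key, value in records)
record_total = sum(len(key) + len(value) for key, value in record_compress(records))
block_total = sum(len(block) for block in block_compress(records))

print("raw bytes:              ", raw_total)
print("record-compressed bytes:", record_total)
print("block-compressed bytes: ", block_total)
```

With short, similar values like these, the block variant should come out
smaller because the compressor sees the redundancy across neighbouring
records, while the record variant pays the per-record overhead every time.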

---
Jim Kellerman, Senior Engineer; Powerset


> -----Original Message-----
> From: Rong-en Fan [mailto:grafan@gmail.com]
> Sent: Thursday, July 10, 2008 7:52 AM
> To: hbase-user@hadoop.apache.org
> Subject: compression in HBase
>
> I'm reading
>
> http://jimbojw.com/wiki/index.php?title=Understanding_HBase_column-family_performance_options
>
> but I am confused about BLOCK and RECORD compression. In my
> understanding, these two options govern the underlying
> MapFile's data file, which is a SequenceFile. In HBase, each
> key in the SequenceFile is actually row/column/ts. So,
> specifying RECORD means each value in *one* row/column/ts is
> compressed. With BLOCK, it may cover the same row (since one
> row may have more than one row/column/ts key in the
> underlying MapFile). If this is correct, then I don't get the
> point mentioned in the wiki above.
>
> Any ideas?
>
> Thanks,
> Rong-En Fan
