hbase-user mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: Column family names and data size on disk
Date Wed, 28 Nov 2012 14:32:03 GMT
Hi Matan

Yes, the row key and the column family name are repeated for every cell in
the HFile. That is because every cell is uniquely identified by its row key,
column family and qualifier.
To reduce this repetition, some data block encoding algorithms are available
in 0.94 and above. Try them!
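
To put rough numbers on the repetition, here is a minimal sketch. It assumes the usual unencoded KeyValue layout (key length, value length, row length, row, family length, family, qualifier, timestamp, key type, value). The class and method names are hypothetical, and the prefix step is only a toy illustration of the idea behind the encoders, not HBase's actual encoding code.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: estimates the size of one unencoded cell and shows
// how much of a key is shared with the previous key in the same block.
final class CellSizeSketch {
    // Fixed fields per cell (assumed layout): key length (4) + value length (4)
    // + row length (2) + family length (1) + timestamp (8) + key type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Row, family and qualifier are written out in full for every cell.
    static int cellSize(String row, String family, String qualifier, int valueLength) {
        return FIXED_OVERHEAD + utf8Length(row) + utf8Length(family)
                + utf8Length(qualifier) + valueLength;
    }

    // Toy prefix step: count the leading bytes two consecutive keys share.
    static int commonPrefixLength(byte[] a, byte[] b) {
        int limit = Math.min(a.length, b.length);
        int i = 0;
        while (i < limit && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Same cell, short vs. long family name.
        System.out.println(cellSize("user123", "d", "name", 10));            // 42
        System.out.println(cellSize("user123", "user_details", "name", 10)); // 53

        // Two consecutive keys in the same row and family (simplified to
        // row + family + qualifier, without the length fields).
        byte[] prev = "user123user_detailsname".getBytes(StandardCharsets.UTF_8);
        byte[] next = "user123user_detailsemail".getBytes(StandardCharsets.UTF_8);
        System.out.println(commonPrefixLength(prev, next));                  // 19
    }
}
```

In this example the longer family name costs 11 extra bytes on every cell, while 19 of the 24 key bytes of the second cell repeat the previous key. That repeated part is what the encoders are meant to avoid storing again.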
Hope this helps.
Regards
Ram

On Wed, Nov 28, 2012 at 7:54 PM, matan <matan@cloudaloe.org> wrote:

> Hi,
>
> I am sort of wondering why does the Column Family name repeat inside the
> HFile for every Key/Value pair. This repetition presumably implies that
> column family names should be kept short and cryptic. Is it because an HFile
> may contain Key/Value pairs for more than a single column family?
>
> Just trying to improve my understanding of why and how HBase works... your
> knowledgeable insight is most welcome.
>
> Thanks,
> Matan
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Column-family-names-and-data-size-on-disk-tp4034507.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
