hbase-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"
Date Mon, 13 Jun 2011 21:44:32 GMT
> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs

What's Google doing in BigTable that enables so many CFs?

Is the cost in HBase the seek to each individual key in the CFs, or is
it the cost of loading each block into RAM? If it's the latter, could it
be alleviated by bypassing the block cache and accessing the blocks
as if they're local?
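
To make the seek side of that question concrete, here is a toy sketch. It is
not HBase code, and every name in it is hypothetical. It only assumes that
each column family keeps its own set of store files and that reading a full
row may need one seek per store file in each family, ignoring bloom filters
and caching.

```python
# Toy model, not HBase code: all names here are hypothetical.
# Assumption: each column family has its own store files, and reading one
# row across all families may need a seek into every one of those files.

def seeks_for_row_read(store_files_per_family):
    """Worst-case seek count for reading one row from every family."""
    return sum(store_files_per_family.values())


# Two families versus sixteen, each assumed to hold three store files.
few_families = {"content": 3, "meta": 3}
many_families = {f"cf{i}": 3 for i in range(16)}

print(seeks_for_row_read(few_families))
print(seeks_for_row_read(many_families))
```

Under these assumptions the seek count grows linearly with the number of
families, which is one way the per-CF cost could show up.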

On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <leifwickland@gmail.com> wrote:
> Thanks for replying, J-D.
>
> My interpretation is that they try to keep that number low, from page 2:
>>
>> "It is our intent that the number of distinct column families in a
>> table be small (in the hundreds at most)"
>>
>
> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs.
>
>
>> Could you just store that in the same family?
>>
>
> Yup.  I could.  There would be a little weirdness to it, but I think it's
> doable.  It seems like that's the consensus suggestion.
>
>
>> Row locks are rarely a good idea: they don't scale, and they currently
>> aren't persisted anywhere except in the RS memory (so if it dies...).
>> Using a single family might be better for you.
>
>
> Thanks for the pointer.
>
> Leif
>
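
The single-family suggestion quoted above can be sketched roughly as follows.
This is plain Python over dictionaries, not the HBase client API. The family
name "d", the ":" separator, and both function names are assumptions made
for illustration only.

```python
# Sketch only: plain dictionaries stand in for a row, not the HBase API.
# The family name "d" and the ":" separator are hypothetical choices.

def fold_into_single_family(row, family="d", sep=":"):
    """Turn {family: {qualifier: value}} into one family whose qualifiers
    carry the old family name as a prefix."""
    folded = {}
    for old_family, columns in row.items():
        for qualifier, value in columns.items():
            folded[f"{old_family}{sep}{qualifier}"] = value
    return {family: folded}


def columns_with_prefix(folded_row, prefix, family="d", sep=":"):
    """Recover the columns that belonged to one of the old families."""
    start = prefix + sep
    return {
        qualifier[len(start):]: value
        for qualifier, value in folded_row[family].items()
        if qualifier.startswith(start)
    }


row = {"anchor": {"a": 1}, "meta": {"b": 2}}
folded = fold_into_single_family(row)
print(folded)
print(columns_with_prefix(folded, "anchor"))
```

The prefix keeps the logical grouping that separate families would have
provided, at the cost of longer qualifiers. That is probably the "little
weirdness" mentioned above.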
