hbase-dev mailing list archives

From Namkyu Chang <namchan...@gmail.com>
Subject HBase Column Family Limit Reasoning
Date Thu, 20 Jun 2013 01:01:47 GMT
Hi everyone,

I'm a newcomer to HBase, and as I was reading the documentation I wanted to
learn more about the reasoning behind the limit on the number of column
families that HBase supports.

I understand that HBase currently does not handle more than 2-3 column
families well, due to flushing and compaction issues and the excessive I/O
load they can cause for smaller column families. Since flushing and
compaction are done on a per-region basis, and each region holds a store for
every column family, one filled column family can trigger a flush in which
the other, non-filled column families also have to participate when really
they could wait.
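
To make sure I am picturing this correctly, here is a toy Python sketch of
the behaviour as I understand it. It is not HBase code: the class name, the
threshold and the sizes are all made up, and I am assuming a flush is
triggered once the combined in-memory size of the region reaches a threshold.

```python
class ToyRegion:
    """Toy model of a region with one in-memory store per column family."""

    def __init__(self, families, flush_threshold):
        self.flush_threshold = flush_threshold    # hypothetical limit, in bytes
        self.memstores = {cf: 0 for cf in families}
        self.flushed_files = []                   # one (family, size) per file written

    def put(self, family, size):
        self.memstores[family] += size
        # Simplifying assumption: flush when the region's total in-memory size
        # reaches the threshold, regardless of which family filled it.
        if sum(self.memstores.values()) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Region-level flush: every non-empty family writes a file, however small.
        for cf, size in self.memstores.items():
            if size > 0:
                self.flushed_files.append((cf, size))
            self.memstores[cf] = 0


region = ToyRegion(["big", "small_a", "small_b"], flush_threshold=100)
region.put("small_a", 1)
region.put("small_b", 2)
for _ in range(10):
    region.put("big", 10)   # the busy family drives the flush

print(region.flushed_files)
```

In this sketch the two small families each end up writing a tiny file only
because the busy family pushed the region over the threshold, which is the
extra I/O and compaction work I am asking about.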

However, is this the only reason? I see that this is "To be addressed by
changing flushing and compaction to work on a per column family basis".
Would this mean we can have as many CFs as we'd like after this fix? In
Google's Bigtable paper, the number of CFs is also meant to stay small (in
the hundreds at most). As such, are there any other factors behind this
limitation?

Also, are there any other ways of getting around this problem? If there is
still a limit after the flushing/compaction issue is fixed, I feel there
must be some other way of supporting more CFs. But would that require
changing the entire architecture of HBase?

I've been trying to find out more about this problem online and in print,
but there seems to be very limited discussion on this topic.

Thank you in advance.

