hbase-user mailing list archives

From "Srikanth P. Shreenivas" <Srikanth_Shreeni...@mindtree.com>
Subject Tall-Narrow vs. Flat-Wide Tables
Date Thu, 01 Sep 2011 18:52:43 GMT

Chapter 9 of the book "HBase: The Definitive Guide" discusses tall-narrow vs. flat-wide tables.

It seems to propose that tall-narrow tables (more rows, fewer columns) are the better design. One
of the issues it raises with flat-wide tables (fewer rows, more columns) is:

"In addition, HBase can only split at row boundaries, which also enforces the recommendation
to go with tall-narrow tables. Imagine you have all emails of a user in a single row. This
will work for the majority of users, but there will be outliers that will have magnitudes
of emails more in their inbox. So much so that a single row could outgrow the maximum file/region
size and work against the region split facility."
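
To make the two layouts concrete, here is a minimal sketch of how I understand them. It uses
plain Python tuples of (row key, column, value) in place of real HBase cells; the "msg" column
family, the "user42" id, and the "userId-messageId" row key format are my own assumptions for
illustration, not something taken from the book.

```python
# Illustration only: tuples of (row_key, column, value) stand in for HBase cells.
# The "msg" family and the "<user>-<message>" key format are assumed, not prescribed.

def flat_wide_cells(user_id, emails):
    """Flat-wide: one row per user, each email stored as another column."""
    return [(user_id, f"msg:{msg_id}", body) for msg_id, body in emails]


def tall_narrow_cells(user_id, emails):
    """Tall-narrow: one row per email, with the message id moved into the row key."""
    return [(f"{user_id}-{msg_id}", "msg:body", body) for msg_id, body in emails]


emails = [("0001", "hello"), ("0002", "world")]
wide = flat_wide_cells("user42", emails)
tall = tall_narrow_cells("user42", emails)

# Regions split only between rows, so the number of distinct row keys
# bounds how finely this user's data can be divided.
wide_rows = {row for row, _, _ in wide}
tall_rows = {row for row, _, _ in tall}
print(len(wide_rows), len(tall_rows))
```

In the flat-wide layout all of a user's emails share one row key, so they can never be separated
by a split; in the tall-narrow layout every email has its own row key.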

So my question is: is it a bad idea to have a table like the one in the example above, where emails
are stored by adding columns? I have a similar table in my application, with
a region size of 1 GB and a cell value size of 10 KB. Will I run into the region-split issue
mentioned above once a single row reaches about 100,000 columns (1 GB / 10 KB = 100,000)?
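
For reference, this is the back-of-the-envelope arithmetic behind that figure, as a small
sketch. The function name and the per-cell overhead parameter are my own; the overhead is only a
placeholder for the row key, column name, and timestamp that HBase stores alongside every value,
and I have not measured what it actually is for my table.

```python
def cells_until_row_exceeds_region(region_bytes, cell_value_bytes, per_cell_overhead_bytes=0):
    """Rough count of cells one row can hold before it is as large as a whole region.

    per_cell_overhead_bytes is a hypothetical allowance for the key data stored
    with each value; 0 reproduces the simple estimate from the question.
    """
    return region_bytes // (cell_value_bytes + per_cell_overhead_bytes)


# Decimal units: 1 GB = 10**9 bytes, 10 KB = 10**4 bytes.
decimal_estimate = cells_until_row_exceeds_region(10**9, 10 * 10**3)

# Binary units: 1 GiB = 2**30 bytes, 10 KiB = 10 * 2**10 bytes.
binary_estimate = cells_until_row_exceeds_region(2**30, 10 * 2**10)

print(decimal_estimate, binary_estimate)
```

Either way the estimate lands at roughly 100,000 cells, and any per-cell overhead only lowers it.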



