hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: How many column families in one table ?
Date Fri, 28 Jun 2013 12:42:43 GMT
Beyond the physical limitations (cost constraints), there's also a logical limit that comes from design.

I just gave a talk at CHUG on schema design, and the key point was to understand how and why one
should use column families.

From a logical design perspective, you want to limit the data within a column family (CF) to data
that you read all at once. That way, when you do your scan / get, you minimize the number of
column families that you have to hit.

So you need to think about how you approach organizing your data. 

The best example of this is an order entry system where the column families are
broken out into Order Entry, Pick Slips, Shipping and Invoices.

While they all use the same key (customer number | order number), the data for each stage, from
order entry through fulfillment, is accessed separately.

So even in this example, you have 4 column families in use for this one table. 
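
To make the access pattern concrete, here is a rough sketch in Python. It is a toy in-memory model, not the HBase client API: ToyTable, make_row_key, the stores_read counter and the family names are all made up for illustration. The one thing it borrows from HBase is that each column family is kept in its own store, so a read that names a single family only touches that store.

```python
from collections import defaultdict

# Hypothetical family names based on the order entry example above.
FAMILIES = ("order_entry", "pick_slips", "shipping", "invoices")


class ToyTable:
    """In-memory stand-in for a table whose data is stored per column family."""

    def __init__(self, families):
        # One separate store per family, loosely mirroring how HBase keeps
        # each column family in its own set of store files.
        self.stores = {cf: defaultdict(dict) for cf in families}
        self.stores_read = 0  # how many family stores reads have touched so far

    def put(self, key, family, qualifier, value):
        self.stores[family][key][qualifier] = value

    def get(self, key, families=None):
        # With no family list, every store has to be consulted.
        wanted = families if families is not None else list(self.stores)
        result = {}
        for cf in wanted:
            self.stores_read += 1
            cells = self.stores[cf].get(key)
            if cells:
                result[cf] = dict(cells)
        return result


def make_row_key(customer_number, order_number):
    # Composite key shared by all families: customer number | order number
    return f"{customer_number}|{order_number}"


table = ToyTable(FAMILIES)
key = make_row_key("C1001", "O42")
table.put(key, "order_entry", "item", "widget")
table.put(key, "shipping", "carrier", "UPS")
table.put(key, "invoices", "total", "19.99")

# The shipping desk only needs the shipping family: one store is read.
shipping_only = table.get(key, families=["shipping"])
reads_for_shipping = table.stores_read

# Asking for the whole row has to visit all four stores.
everything = table.get(key)
reads_for_full_row = table.stores_read - reads_for_shipping
```

In this model the shipping lookup reads one store and the full-row lookup reads four, which is the cost you are trying to avoid by grouping data that is read together into the same family.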



On Jun 28, 2013, at 7:27 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
