hbase-user mailing list archives

From Stuti Awasthi <stutiawas...@hcl.com>
Subject RE: Using multiple column families
Date Mon, 12 Sep 2011 07:05:49 GMT
Hi,

I am also looking for an answer to a similar question. In our scenario we will have petabytes
of data to handle. I am currently working with a schema that has 3-4 column families.
What major issues could we face if we have multiple column families?

I have read that each column family is stored in its own HFiles on the region server, so if
we search by row id and column family, the read only has to touch the HFiles for that specific
column family, which should help.
If we use a flat table structure instead, we will end up with more tables and with data replicated
across them, because of the dependencies between the data.
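
To make the point about per-family storage concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the HBase client API: the class name ToyTable, its methods, and the family names are all made up for illustration. It only shows the idea that a read restricted to one family consults only that family's store.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model (NOT the HBase API): one store per column family."""

    def __init__(self, families):
        # Each family gets its own store, loosely mimicking per-family store files.
        self.stores = {family: defaultdict(dict) for family in families}
        # Names of the stores consulted by the most recent get().
        self.stores_touched = []

    def put(self, row, family, qualifier, value):
        self.stores[family][row][qualifier] = value

    def get(self, row, families=None):
        # A read limited to named families only looks at those stores;
        # a read with no family filter has to look at all of them.
        wanted = list(self.stores) if families is None else list(families)
        self.stores_touched = wanted
        return {f: dict(self.stores[f].get(row, {})) for f in wanted}


table = ToyTable(["profile", "orders", "audit"])  # hypothetical family names
table.put("row1", "profile", "name", "alice")
table.put("row1", "orders", "o-1", "book")

only_orders = table.get("row1", families=["orders"])
print(only_orders, table.stores_touched)
```

In this model the cost of a read grows with the number of families it has to consult, which is why a query that names its family is the favourable case.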

Please suggest


-----Original Message-----
From: Imran M Yousuf [mailto:imyousuf@gmail.com]
Sent: Saturday, September 10, 2011 6:55 AM
To: user@hbase.apache.org
Subject: Re: Using multiple column families

Hi J-D,

Thanks for your feedback.

(replies inline)
On Sat, Sep 10, 2011 at 5:39 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> 20k rows? If this is your only use case, you don't need HBase :)
>

It's one of several use cases.

> If it's 20k rows times a gazillion columns per row, then I would
> recommend flattening out the rows instead.
>

Well, our guess at the moment is that there would not be more than 500 cells per family to start
with.
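
For a sense of scale, the figures in this thread (at most 20k rows, 7 column families, about 500 cells per family) can be multiplied out. This is a back-of-the-envelope sketch only; it assumes the 500-cell figure applies per row and per family, which the thread does not state explicitly, and the function name is made up for illustration.

```python
def max_cells(rows, families, cells_per_family):
    """Upper bound on the cell count if every row fills every family to the cap."""
    return rows * families * cells_per_family


# Figures taken from the thread; the per-row reading of "500 cells" is an assumption.
upper_bound = max_cells(rows=20_000, families=7, cells_per_family=500)
print(f"{upper_bound:,} cells at most")  # prints "70,000,000 cells at most"
```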

> If it's just one small table among others, then you probably won't be
> bothered by the multiple families.
>

We actually have many other tables which are flattened out to a single column family and this
is one table for which we are using more than 1 column family.

Thanks once again.

Imran

> J-D
>
> On Thu, Sep 8, 2011 at 10:07 PM, Imran M Yousuf <imyousuf@gmail.com> wrote:
>> Hi,
>>
>> Firstly, I have read in the mailing list before that having more than
>> 1 column family is not recommended. I am more interested to know
>> whether it is a problem in my use case as well or not.
>>
>> I have a strong entity and it has 6 weak entities, all with a 1-to-many
>> relationship to the strong entity. Furthermore, they are all
>> loaded in a mutually exclusive manner, i.e. if A is the strong entity and
>> its weak entities are P, Q, R, S, T, U, then no 2 weak
>> entities are accessed at once. Moreover, their lifecycles are
>> independent of each other. In my current implementation I have one
>> column family for the strong entity and one for each weak entity.
>> So for a given row I only load one column family at a time. The
>> obvious advantages are that
>> - deleting the strong entity automatically deletes the weak entities, as
>> they are all in a single row, and
>> - deleting all weak entities of one kind for a specific strong entity
>> is as simple as deleting all cells in a column family for a row.
>> Our assumption (much higher than what we actually expect) is
>> that we will not have more than 20k rows in that table. Under these
>> circumstances, how bad is it to have 7 column families?
>>
>> We would be glad if you would kindly share thoughts and feedback on this issue.
>>
>> Thank you,
>>
>> --
>> Imran M Yousuf
>> Entrepreneur & CEO
>> Smart IT Engineering Ltd.
>> Dhaka, Bangladesh
>> Twitter: @imyousuf - http://twitter.com/imyousuf
>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> Mobile: +880-1711402557
>>
>
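
The layout described in the quoted message (one row per strong entity, one column family per weak entity kind) can be sketched with a toy model. This is plain Python, not the HBase client API; ToyRowStore, its methods, and the family names "strong", "P" and "Q" are hypothetical and only illustrate the two delete cases mentioned above.

```python
class ToyRowStore:
    """Toy model (NOT the HBase API): row -> family -> {qualifier: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row, family, qualifier, value):
        self.rows.setdefault(row, {}).setdefault(family, {})[qualifier] = value

    def delete_family(self, row, family):
        # Remove every cell of one family for one row:
        # all weak entities of one kind for a given strong entity.
        self.rows.get(row, {}).pop(family, None)

    def delete_row(self, row):
        # Remove the whole row: the strong entity and all its weak entities.
        self.rows.pop(row, None)


store = ToyRowStore()
store.put("A", "strong", "name", "entity A")
store.put("A", "P", "p1", "first P")
store.put("A", "P", "p2", "second P")
store.put("A", "Q", "q1", "first Q")

store.delete_family("A", "P")   # drops p1 and p2, leaves "strong" and "Q"
print(sorted(store.rows["A"]))  # prints ['Q', 'strong']

store.delete_row("A")           # drops everything stored for A
print(store.rows)               # prints {}
```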



