hbase-user mailing list archives

From Ramasubramanian <ramasubramanian.naraya...@gmail.com>
Subject Re: Regarding Rowkey and Column Family
Date Mon, 24 Dec 2012 14:37:42 GMT
Hi,

Thanks for your detailed explanation. 

A single customer can have multiple addresses. For example, the same customer can hold a
home address, an office address, etc., hence I grouped them into a different column family.

1. Is my approach correct?

2. What can we have as a rowkey for both these column families?

3. I think the customer number is sequential, hence I am planning to include YYYYMMDD along
with the customer number in the rowkey. Is that fine?

Regards,
Rams

On 24-Dec-2012, at 7:54 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi Rams,
> 
> How are you going to access your data?
> 
> HBase will create one cell (which means rowkey+timestamp+...+data) for
> each field you store.
> 
> Are you really going to sometimes access Address Line1 without
> accessing Address Line2?
> 
> Are you really going to access the City without accessing the State?
> 
> If not, why not just put a JSON object with all this data in a single cell?
> 
> So at the end your table will look like:
> 
> *Table Name : Customer*
> 
> *Field Name         Column Family*
> Customer Information CF1
> Address CF1
> 
> 
> In Customer Information you bundle:
> Customer Number      CF1
> DOB                  CF1
> FName                CF1
> MName                CF1
> LName                CF1
> 
> And in Address you bundle:
> Address Type         CF2
> Address Line1        CF2
> Address Line2        CF2
> Address Line3        CF2
> Address Line4        CF2
> State                CF2
> City                 CF2
> Country              CF2
> 
> But if you always access the address when you access the customer
> information, then the best way might be to just put all those fields in
> a single JSON object, and have just one CF and one column in your table...
> 
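A minimal sketch of the single-cell idea above, in plain Python with no HBase client involved. The field names, the sample values and the `cf1:data` column name are assumptions made up for illustration; the thread does not specify them.

```python
import json

# Hypothetical customer record; field names and values are illustrative only.
customer = {
    "customer_number": "1001",
    "dob": "1980-01-15",
    "fname": "Asha",
    "mname": "",
    "lname": "Rao",
    # Several addresses for one customer, keyed by address type.
    "addresses": {
        "HOME": {"line1": "12 Lake Road", "city": "Chennai", "state": "TN", "country": "IN"},
        "OFFICE": {"line1": "4 Tech Park", "city": "Chennai", "state": "TN", "country": "IN"},
    },
}


def to_cell_value(record: dict) -> bytes:
    """Serialize the whole record into the bytes stored in one cell."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def from_cell_value(value: bytes) -> dict:
    """Decode the bytes read back from that cell."""
    return json.loads(value.decode("utf-8"))


# One row, one column family, one column: "cf1:data" -> JSON bytes.
row = {"cf1:data": to_cell_value(customer)}
restored = from_cell_value(row["cf1:data"])
```

The trade-off is that reading or updating a single field means reading or rewriting the whole object.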
> Regarding the key, if your customer number is sequential and you insert
> based on this field, you will hotspot one server at a time... If the
> number is "random", then it's ok.
> 
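To illustrate the hotspotting remark, here is a small sketch in plain Python. It assumes a zero-padded customer number followed by YYYYMMDD as the key, and shows one common workaround, a salt prefix. The bucket count, the key layout and the helper names are hypothetical and not something proposed in the thread.

```python
NUM_BUCKETS = 4  # hypothetical number of salt buckets


def plain_key(customer_number: int, business_date: str) -> str:
    # Zero-padded so that string order matches numeric order.
    return f"{customer_number:010d}{business_date}"


def salted_key(customer_number: int, business_date: str) -> str:
    # The bucket is derived from the customer number, so a reader can
    # recompute it. A hash of the number would work as well.
    bucket = customer_number % NUM_BUCKETS
    return f"{bucket:02d}-{plain_key(customer_number, business_date)}"


customers = range(1000, 1020)
plain = [plain_key(c, "20121224") for c in customers]
salted = [salted_key(c, "20121224") for c in customers]

# Appending the date after the customer number leaves the leading bytes
# sequential, so consecutive inserts stay adjacent in the key space.
keys_are_sequential = plain == sorted(plain)

# With the salt, the same inserts are spread over several key prefixes.
buckets_used = sorted({k.split("-")[0] for k in salted})
```

The cost of salting is on the read side: scanning a range of customer numbers then takes one scan per bucket.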
> HTH.
> 
> JM
> 
> 2012/12/24, Mohammad Tariq <dontariq@gmail.com>:
>> It is, but why do you want to do that? You will run into issues once your
>> data starts growing. Each cell, along with the actual value, stores a few
>> additional things: the *row*, the *column* and the *version*. As a result you will
>> lose space if you do that.
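A rough sketch of the per-cell overhead mentioned above. It assumes the commonly documented KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, then the value). Treat the byte counts as an approximation and not as a statement about any particular HBase version; the sample row key and field values are made up.

```python
import json


def keyvalue_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate serialized size of one cell under the assumed layout."""
    key_len = 2 + len(row) + 1 + len(family) + len(qualifier) + 8 + 1
    return 4 + 4 + key_len + len(value)


row, family = b"1001", b"CF2"
fields = {"Line1": "12 Lake Road", "City": "Chennai", "State": "TN", "Country": "IN"}

# One cell per field: the row key, family and timestamp are repeated each time.
separate = sum(
    keyvalue_size(row, family, name.encode("utf-8"), value.encode("utf-8"))
    for name, value in fields.items()
)

# One cell holding all fields as a JSON document.
bundled = keyvalue_size(row, family, b"data", json.dumps(fields).encode("utf-8"))

print(separate, bundled)
```

Longer row keys make the gap larger, since the row key is repeated in every cell.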
>> 
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>> 
>> 
>> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
>> ramasubramanian.narayanan@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> Is it ok to have the same column in different column families?
>>> 
>>> regards,
>>> Rams
>>> 
>>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <dontariq@gmail.com>
>>> wrote:
>>> 
>>>> You are creating 2 different rows here. A cf means how columns are clubbed
>>>> together as a single entity, which is represented by that cf. But here
>>>> you are creating 2 different rows having one cf each, CF1 and CF2
>>>> respectively. If you want to have 1 row with 2 cf, you have to use the same
>>>> rowkey for both the cf.
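The point about rows and column families can be sketched with a nested dictionary standing in for the table (rowkey, then column family, then column). This is only a toy model in plain Python, not the HBase client API, and the sample keys and values are invented.

```python
from collections import defaultdict


def new_table():
    # rowkey -> column family -> qualifier -> value
    return defaultdict(lambda: defaultdict(dict))


def put(table, rowkey, family, qualifier, value):
    table[rowkey][family][qualifier] = value


# Design from the original question: a different rowkey per column family.
split = new_table()
put(split, "1001-20121224", "CF1", "FName", "Asha")
put(split, "1001-HOME", "CF2", "City", "Chennai")

# Same rowkey used for both column families.
merged = new_table()
put(merged, "1001", "CF1", "FName", "Asha")
put(merged, "1001", "CF2", "City", "Chennai")

# Number of rows in each design.
print(len(split), len(merged))
```

In the first design each row has a single populated column family; in the second, one row carries both.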
>>>> 
>>>> 
>>>> 
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>> 
>>>> 
>>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>>>> ramasubramanian.narayanan@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> *Table Name : Customer*
>>>>> 
>>>>> *Field Name         Column Family*
>>>>> Customer Number      CF1
>>>>> DOB                  CF1
>>>>> FName                CF1
>>>>> MName                CF1
>>>>> LName                CF1
>>>>> Address Type         CF2
>>>>> Address Line1        CF2
>>>>> Address Line2        CF2
>>>>> Address Line3        CF2
>>>>> Address Line4        CF2
>>>>> State                CF2
>>>>> City                 CF2
>>>>> Country              CF2
>>>>> 
>>>>> Is it good to have rowkey as follows for the same table?
>>>>> 
>>>>> Rowkey Design:
>>>>> --------------
>>>>> For CF1 : Customer Number + YYYYMMDD (business date)
>>>>> For CF2 : Customer Number + Address Type
>>>>> 
>>>>> Note :
>>>>> Address Type can be any of HOME/OFFICE/OTHERS
>>>>> 
>>>>> regards,
>>>>> Rams
>> 
