hbase-user mailing list archives

From Angus He <angu...@gmail.com>
Subject Re: Column-oriented data modal
Date Fri, 31 Jul 2009 07:42:24 GMT
Hi Ryan,

You cannot equate the "column" in that Wikipedia article with the
"column" in HBase.

The word "column" in "column-oriented" has to refer to something
predefined; otherwise the term is meaningless.

So the "column" in the Wikipedia article corresponds to a "column family"
in HBase.  Read that way, the article answers 宏明's question.


On Fri, Jul 31, 2009 at 3:18 PM, Ryan Rawson<ryanobjc@gmail.com> wrote:
> Hey,
>
> The Bigtable paper talks more about column families, but in HBase each
> column family is stored in its own set of files.  That means each column
> family has its own disk locality.  The canonical use is to put
> web crawl data in one family and metadata (such as derived metadata)
> in another.  That way scanning just the metadata is not as expensive
> as scanning the web page crawl dump.
>
> Column families are pre-defined - the "schema" for what it's worth -
> but the 'qualifier' within a family is dynamically determined by the
> client.
>
> In the terminology of the article, HBase would be more 'row oriented',
> but with the column family snag, it isn't that simple.  Since data from
> different families is stored in different files, read efficiency
> depends on which column families a query reads.
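
The point about column families and read cost can be sketched with a toy in-memory model. This is only an illustration, not HBase code or the HBase client API: the `ToyTable` class, its methods, and the sample rows are all hypothetical, and one Python list stands in for the files of one column family.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: families are fixed up front, qualifiers are free-form."""

    def __init__(self, families):
        # Column families are the predefined part of the "schema".
        self.families = frozenset(families)
        # One list per family stands in for that family's files on disk.
        self.stores = {family: [] for family in self.families}

    def put(self, row, family, qualifier, value):
        # The family must already exist; the qualifier can be anything.
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.stores[family].append((row, qualifier, value))

    def scan(self, families):
        """Read only the requested families; report how many cells were touched."""
        cells_read = 0
        result = defaultdict(dict)
        for family in families:
            for row, qualifier, value in self.stores[family]:
                cells_read += 1
                result[row][f"{family}:{qualifier}"] = value
        return dict(result), cells_read


table = ToyTable(["content", "meta"])
table.put("com.example/", "content", "html", "<html>large crawl dump</html>")
table.put("com.example/", "meta", "lang", "en")
table.put("com.example/", "meta", "title", "Example")
table.put("org.example/", "content", "html", "<html>another dump</html>")
table.put("org.example/", "meta", "lang", "fr")

meta_rows, meta_cells = table.scan(["meta"])
all_rows, all_cells = table.scan(["content", "meta"])
```

In this sketch a scan of only the `meta` family never touches the `content` store, which is the same reason scanning metadata is cheaper than scanning the crawl dump.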
>
> -ryan
>
> On Fri, Jul 31, 2009 at 12:02 AM, Angus He<angushe@gmail.com> wrote:
>> Hi Ryan,
>>
>> 1. If that is not the case, what is the purpose of introducing
>> "column family"?
>> Are the contents of different column families stored in different
>> files in HBase?
>>
>> BTW, in the Bigtable paper, we can find the following text:
>> "Access control and both disk and memory accounting are performed at
>> the column-family level."
>>
>> 2. I was wondering whether HBase shares the benefits described in the
>> "Benefits" section of the Wikipedia article. If not, what does
>> "column store" mean for HBase?
>>
>>
>>
>>
>>
>> On Fri, Jul 31, 2009 at 2:30 PM, Ryan Rawson<ryanobjc@gmail.com> wrote:
>>> HBase and Bigtable are referred to as column stores, but we aren't a
>>> 'column oriented DBMS' as described in the Wikipedia article.
>>>
>>> At the storage level, HBase stores key-values, where the key is a
>>> triple of row / column / timestamp.  Files are ordered lists of these
>>> key/values, sorted in that order: rows are stored
>>> together, then sorted by column, then in reverse by timestamp (newest on
>>> top).
>>>
>>> Thus HBase is not a 'column store' in the sense listed in the Wikipedia entry.
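
The ordering described above (row, then column, then newest timestamp first) can be sketched as a plain sort. This is a minimal illustration under assumed names: `KeyValue` here is a hypothetical Python tuple, not HBase's own class, and the sample cells are made up.

```python
from typing import NamedTuple


class KeyValue(NamedTuple):
    # Hypothetical stand-in for a stored cell: the key is the
    # (row, column, timestamp) triple, followed by the value.
    row: str
    column: str
    timestamp: int
    value: bytes


def sort_key(kv: KeyValue):
    # Row ascending, column ascending, timestamp descending (newest first).
    return (kv.row, kv.column, -kv.timestamp)


cells = [
    KeyValue("row2", "lang", 5, b"en"),
    KeyValue("row1", "title", 3, b"Old title"),
    KeyValue("row1", "lang", 7, b"fr"),
    KeyValue("row1", "lang", 4, b"en"),
    KeyValue("row1", "title", 9, b"New title"),
]

# A file is modeled as one ordered list of key/values.
store_file = sorted(cells, key=sort_key)
order = [(kv.row, kv.column, kv.timestamp) for kv in store_file]
```

All cells of `row1` end up adjacent, which is why this layout behaves like a row-oriented store within a file rather than a column-oriented one.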
>>>
>>> On Thu, Jul 30, 2009 at 11:23 PM, Angus He<angushe@gmail.com> wrote:
>>>> Why don't you try to google it first?
>>>> After googling with the keyword "Column-oriented", the first result is
>>>> exactly what you want.
>>>> http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>>
>>>>
>>>>
>>>> 2009/7/31  <y_823910@tsmc.com>:
>>>>> Hi,
>>>>> Can anyone tell me the benefits of the column-oriented data model?
>>>>> Thank you
>>>>>
>>>>> Fleming
>>>>> 宏明
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards
>>>> Angus
>>>>
>>>
>>
>>
>>
>> --
>> Regards
>> Angus
>>
>



-- 
Regards
Angus
