hbase-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject Re: column based or row based storage for HBase?
Date Sun, 12 Aug 2012 10:17:07 GMT
Hi Jason,

This is a very good reference. I read it from beginning to end and learned a
lot. Thanks, and have a good weekend.

regards,
Lin

On Tue, Aug 7, 2012 at 2:00 AM, Jason Frantz <jfrantz@maprtech.com> wrote:

> Lin,
>
> Looks like your questions may already be answered, but you might find the
> following link comparing "traditional" columnar databases against
> HBase/BigTable interesting:
>
>
> http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
>
> -Jason
>
> On Sun, Aug 5, 2012 at 8:03 PM, Lin Ma <linlma@gmail.com> wrote:
>
> > Thank you for the informative reply, Mohit!
> >
> > Some more comments,
> >
> > 1. Actually, my confusion about column-based storage comes from the book
> > "HBase: The Definitive Guide", chapter 1, section "The Dawn of Big Data",
> > which draws a picture showing HBase storing the same column of all the
> > different rows contiguously in physical storage. Any comments?
> >
> > 2. I want to confirm my understanding is correct: supposing I have only
> > one column family with 10 columns, the physical storage is row (with all
> > related columns) after row, rather than storing the 1st column of all
> > rows, then the 2nd column of all rows, etc.?
> >
> > 3. It seems that "column-based storage" has two meanings: (1) a
> > column-oriented database => en.wikipedia.org/wiki/Column-oriented_DBMS,
> > where the same column of different rows is stored together, and (2) a
> > column-oriented architecture, e.g. how HBase is designed, which
> > describes a pattern for storing a large number of sparse columns (with
> > NULLs for free). Any comments?
> >
> > regards,
> > Lin
> >
> > On Mon, Aug 6, 2012 at 12:08 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> >
> > > On Sun, Aug 5, 2012 at 6:04 AM, Lin Ma <linlma@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I am wondering whether HBase uses column-based storage or row-based
> > > > storage?
> > > >
> > > >    - I read some technical documents that mention that an advantage of
> > > >    HBase is using column-based storage to store similar data together
> > > >    to foster compression. That would mean the same columns of different
> > > >    rows are stored together;
> > >
> > >
> > > Probably what you read was in the context of column families. HBase has
> > > a concept of column family similar to Google's Bigtable, and the store
> > > files on disk are per column family. All columns of a given column
> > > family are in one store file, and columns of a different column family
> > > are in a different file.
> > >
> > >
> > > >    - But I also learned that HBase is a sorted key-value map in the
> > > >    underlying HFile. It uses the key to address all related columns for
> > > >    that key (row), so it seems to be row-based storage?
> > > >
> > > HBase stores all of a row's columns in a given column family together,
> > > each represented by a KeyValue (also called a cell in HBase).
> > >
> > >
> > > > I would appreciate it if anyone could clarify my confusion. Any related
> > > > documents or code with more details are welcome.
> > > >
> > > > thanks in advance,
> > > >
> > > > Lin
> > > >
> > >
> >
>
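
The layout discussed in this thread can be illustrated with a small sketch.
This is not HBase code and does not use any HBase API; it is only a toy
model, assuming a cell is a (row key, column family, qualifier, value) tuple
and ignoring timestamps, versions, and the real HFile format. The sample
data and the names store_files and columnar are made up for illustration.
It shows cells grouped into one sorted list per column family, so that all
columns of a row within a family sit next to each other, and contrasts that
with a column-oriented layout where each column's values are kept together.

```python
from collections import defaultdict

# Hypothetical sample cells: (row key, column family, qualifier, value).
# row2 has no info:email cell; an absent column is simply not stored.
cells = [
    ("row2", "info", "name", "Bob"),
    ("row1", "info", "name", "Alice"),
    ("row1", "stats", "visits", "7"),
    ("row1", "info", "email", "alice@example.com"),
    ("row2", "stats", "visits", "3"),
]


def store_files(cells):
    """Toy model of per-family storage: one sorted list of cells per family.

    Within a family, cells are ordered by (row, qualifier), so all columns
    of one row are contiguous before the next row starts.
    """
    files = defaultdict(list)
    for row, family, qualifier, value in cells:
        files[family].append((row, qualifier, value))
    return {family: sorted(kvs) for family, kvs in files.items()}


def columnar(cells, family):
    """For contrast: a column-oriented layout keeps each column's values together."""
    columns = defaultdict(list)
    for row, fam, qualifier, value in sorted(cells):
        if fam == family:
            columns[qualifier].append((row, value))
    return dict(columns)


layout = store_files(cells)
for family, kvs in layout.items():
    print(family, kvs)
print(columnar(cells, "info"))
```

In the first layout the "info" cells for row1 (email, name) come before any
cell of row2, which matches the row-after-row ordering asked about in
question 2 above. In the second, all "name" values are adjacent regardless
of row, which is the sense of "column-oriented" used for traditional
columnar databases.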
