hbase-user mailing list archives

From N Kapshoo <nkaps...@gmail.com>
Subject Re: HBase schema option consideration
Date Thu, 15 Apr 2010 17:52:10 GMT
I guess I should have explained the whole situation.

I want to store this by user. So actually I am looking at a userblog table
like this:

Table: USER
Row ID: UserId
Column Family: blogs
Column : blogId
Value: JSON object
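
A minimal sketch of this per-user layout, assuming a plain Python dict as a stand-in for the table. This is not the HBase client API; the helper names `put_blog` and `get_blogs` and the sample values are hypothetical, for illustration only.

```python
import json

# In-memory stand-in for an HBase table (NOT the HBase client API):
# row key -> column family -> column qualifier -> value.
user_table = {}

def put_blog(table, user_id, blog_id, blog):
    """Store one blog as a JSON string in family 'blogs', qualifier blog_id."""
    row = table.setdefault(user_id, {})
    family = row.setdefault("blogs", {})
    family[blog_id] = json.dumps(blog, sort_keys=True)

def get_blogs(table, user_id):
    """Read the user's row and decode every JSON value in the 'blogs' family."""
    family = table.get(user_id, {}).get("blogs", {})
    return {blog_id: json.loads(value) for blog_id, value in family.items()}

# Hypothetical sample data: two blogs stored under one user row.
put_blog(user_table, "user42", "blog1",
         {"author": "nk", "subject": "Hello", "content": "First post"})
put_blog(user_table, "user42", "blog2",
         {"author": "nk", "subject": "Again", "content": "Second post"})

user_blogs = get_blogs(user_table, "user42")
```

Here one row read returns every blog for the user, but each value has to be decoded as a whole JSON object before any single field can be used.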

But if I were to separate out the blog table, what you are saying makes
perfect sense. I didn't realize that the columns can be split up within the
family.
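
A rough sketch of the separate blog table, again assuming a plain Python dict rather than the HBase client API. The column names follow Mike's suggestion quoted below; the helper names `put_blog_row` and `get_blog_row` and the sample values are hypothetical.

```python
# In-memory stand-in for an HBase table (NOT the HBase client API):
# row key -> column family -> column qualifier -> value.
blog_table = {}

def put_blog_row(table, blog_id, posted_ts, author, subject, content):
    """Write one blog as a row with four columns in the single family 'blogs'."""
    table[blog_id] = {
        "blogs": {
            "blog_posted_ts": posted_ts,
            "blog_author": author,
            "blog_subject": subject,
            "blog_content": content,
        }
    }

def get_blog_row(table, blog_id):
    """Fetch the row once and return all columns of the 'blogs' family."""
    return dict(table[blog_id]["blogs"])

# Hypothetical sample row.
put_blog_row(blog_table, "blog1", "2010-04-15T17:52:10Z",
             "nk", "Hello", "First post")

blog_row = get_blog_row(blog_table, "blog1")
```

Because all four columns sit in one family of one row, a single fetch by row key returns them together, and each field stays individually addressable.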

Thank you!

On Thu, Apr 15, 2010 at 12:40 PM, Michael Segel
<michael_segel@hotmail.com> wrote:

>
>
> Your formatting seems to have gotten messed up.
>
> Why do you want to separate this out into multiple column families instead
> of a single column family in a table?
> Table: BLOGO
> Row ID: blog_id
> Column Family: blogs
> Columns in blogs:
> blog_posted_ts,
> blog_author,
> blog_subject,
> blog_content
>
> Then you get everything in one fetch.
>
>
> Or am I missing something?
>
> -Mike
>
> > Date: Thu, 15 Apr 2010 12:32:48 -0500
> > Subject: HBase schema option consideration
> > From: nkapshoo@gmail.com
> > To: hbase-user@hadoop.apache.org
> >
> > This is an HBase schema design question. Suppose I store blog entry details
> > in an HBase table:
> > blogid, blog_content, blog_author, blog_subject.
> >
> > My query is such that it always retrieves all this data at the same time.
> >
> > So is it a better idea to store all this in a single json/protobuf object or
> > actually separate out the details into column families?
> >
> > Option1:
> >
> > Table          RowKey          Column Family          Value
> > Blogs          BlogId          Details                JSON(Content, Author, Subject)
> >
> > Option2:
> >
> > Table          RowKey          Column Family
> > Blogs          BlogId          Content
> >                                Author
> >                                Subject
> >
> >
> > I was thinking of option1 because it seems it might be faster since all
> > details will be physically stored together. But option2 is what seems to be
> > the trend when I look at other basic HBase schema examples out there.
> >
> > Please let me know your opinions and if I am on the right track...
> >
> > Thanks in advance.
