hbase-user mailing list archives

From Chris Bates <christopher.andrew.ba...@gmail.com>
Subject Re: newbie: need help on understanding HBase
Date Thu, 12 Nov 2009 15:50:09 GMT
Hi Imran,

I'm a new user as well.  I found these presentations helpful in answering
most of your questions:
http://wiki.apache.org/hadoop/HBase/HBasePresentations

There are HBase schema designs in there.

You might also want to read the original BigTable paper and the chapter on
HBase in O'Reilly's Hadoop book.

But to answer one of your questions: "Big Data" usually refers to a dataset
with millions to billions of rows.  But having "Big Data" doesn't mean you
have to use a tool like HBase.  We have some MySQL tables with 100
million rows that work fine.  You have to identify what works best for your
use case and pick the most appropriate tool.
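
As a rough illustration of the data model those presentations and the
BigTable paper describe (rows holding family:qualifier columns, with each
cell keeping several timestamped versions), here is a toy in-memory sketch
in Python. It is not the HBase client API: `ToyTable`, the `fields` and
`comments` families, and the row keys are all made up for the example, and
storing a one-to-many relationship as one column per related row is only one
possible pattern, not a recommendation for your CMS.

```python
import time
from collections import defaultdict


class ToyTable:
    """In-memory toy of the BigTable-style data model (NOT the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> [(timestamp, value), ...], newest first
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, ts=None):
        """Store a new version of a cell, keeping at most max_versions."""
        ts = time.time_ns() if ts is None else ts
        cells = self.rows[row][column]
        cells.append((ts, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)
        del cells[self.max_versions:]  # drop the oldest versions

    def get(self, row, column, versions=1):
        """Return up to `versions` values of a cell, newest first."""
        cells = self.rows.get(row, {}).get(column, [])
        return [value for _, value in cells[:versions]]


table = ToyTable(max_versions=3)

# Versioned field: two writes to the same cell.
table.put("article-1", "fields:title", "Draft title", ts=1)
table.put("article-1", "fields:title", "Final title", ts=2)

# "Dynamic schema": a new qualifier needs no table change.
table.put("article-1", "fields:subtitle", "Added later", ts=3)

# One-to-many (hypothetical pattern): one column per related row,
# with the related row's key used as the qualifier.
table.put("article-1", "comments:comment-7", "", ts=1)
table.put("article-1", "comments:comment-9", "", ts=1)

latest = table.get("article-1", "fields:title")
history = table.get("article-1", "fields:title", versions=2)
related = sorted(
    column.split(":", 1)[1]
    for column in table.rows["article-1"]
    if column.startswith("comments:")
)
```

Here `latest` holds only the newest title, `history` holds both versions
newest first, and `related` lists the keys of the linked comment rows.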

On Thu, Nov 12, 2009 at 9:13 AM, Imran M Yousuf <imyousuf@gmail.com> wrote:

> Hi!
>
> I am absolutely new to HBase. All I have done is read the
> documentation and presentations and get a single instance up and
> running. I am starting on a Content Management System which will be
> used as a backend for multiple web applications of different natures.
> In the CMS:
> * Users can define their own kinds of content, known as content types.
> * Content can have one-to-many, one-to-one and many-to-many relationships
> with other content.
> * Content fields should be versioned
> * Content types can change at runtime, i.e. fields (a.k.a. columns in
> HBase) can be added; removal will not be allowed just yet.
> * Every content type will have a corresponding grammar to validate
> content of its type.
> * It will have authentication and authorization
> * It will have full text search based on Lucene/Katta.
>
> Based on these requirements I have the following questions that I
> would like feedback on:
> * From reading articles and presentations, it looks like HBase is a perfect
> match, as it supports multi-dimensional rows, versioned cells and dynamic
> schema modification. But I could not understand the definition
> of "Big Data": if a content item is roughly 1~100kB
> (field/cell size 0~100kB), is HBase meant for such uses?
> * Since I am not sure how much load the site will have, I am planning
> to set up DN+RS on Rackspace cloud instances with 2GB/80GB HDD, with a
> view to adding more moderate "commodity" hardware progressively as
> revenue and pageviews increase. Any
> comments/suggestions on this strategy?
> * Where can I read up on or check out sample RDBMS schemas converted
> to HBase schemas? Basically, I want to read up on efficient schema design
> for relationships of different cardinalities between objects.
>
> Thank you,
>
> --
> Imran M Yousuf
> Entrepreneur & Software Engineer
> Smart IT Engineering
> Dhaka, Bangladesh
> Email: imran@smartitengineering.com
> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> Mobile: +880-1711402557
>
