hbase-user mailing list archives

From Patrick Angeles <patrickange...@gmail.com>
Subject Re: newbie: need help on understanding HBase
Date Fri, 13 Nov 2009 18:31:52 GMT

'Big Data' refers to the overall data-set size, not the size of an
individual row / field. One of HBase's main strengths is the ability to run
in a clustered environment where you can have, say, 100s of machines acting
as your 'database' server.
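
To make the distinction concrete, here is a rough back-of-the-envelope sketch in Python. The item counts are hypothetical and chosen only for illustration; the ~100 kB item size is the upper bound mentioned in the question below. The helper name `dataset_size_bytes` is likewise made up for this example.

```python
def dataset_size_bytes(num_items: int, avg_item_bytes: int, versions: int = 1) -> int:
    """Rough raw size of a data set: items x average item size x versions kept."""
    return num_items * avg_item_bytes * versions


KB = 1024
GB = 1024 ** 3

# Hypothetical item counts; only the ~100 kB item size comes from the thread.
small_site = dataset_size_bytes(num_items=50_000, avg_item_bytes=100 * KB)
large_site = dataset_size_bytes(num_items=500_000_000, avg_item_bytes=100 * KB)

# Same cell size in both cases; only the overall data-set size differs.
print(f"small site: {small_site / GB:,.1f} GiB")
print(f"large site: {large_site / GB:,.1f} GiB")
```

With the same per-item size, the first total is small enough for a single machine, while the second is the kind of total that calls for a cluster. That overall figure, not the cell size, is what 'Big Data' is about.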

HBase relies on Hadoop to do its work. If you don't think you'll need to
scale, this solution might be overkill. You might have a look at
document-oriented databases (CouchDB, etc.).

On Thu, Nov 12, 2009 at 9:13 AM, Imran M Yousuf <imyousuf@gmail.com> wrote:

> Hi!
> I am absolutely new to HBase. All I have done so far is read the
> documentation and presentations and get a single instance up and
> running. I am starting on a Content Management System which will be
> used as a backend for multiple web applications of different natures.
> In the CMS:
> * Users can define their own content, known as content types.
> * Content can have one-to-many, one-to-one and many-to-many
> relationships with other content.
> * Content fields should be versioned.
> * Content types can change at runtime, i.e. fields (a.k.a. columns in
> HBase) can be added; removal will not be allowed just yet.
> * Every content type will have a corresponding grammar to validate
> content of its type.
> * It will have authentication and authorization.
> * It will have full text search based on Lucene/Katta.
> Based on these requirements I have the following questions that I
> would like feedback on:
> * From reading articles and presentations, it looks like HBase is a
> perfect match, as it supports multi-dimensional rows, versioned cells
> and dynamic schema modification. But I could not understand the
> definition of "Big Data": if a content item is roughly 1~100kB
> (field/cell size 0~100kB), is HBase meant for such uses?
> * Since I am not sure how much load the site will have, I am planning
> to set up DN+RS on Rackspace cloud instances with 2GB/80GB HDD, with
> the view that as revenue and pageviews increase, more moderate
> "commodity" hardware can be added progressively. Any
> comments/suggestions on this strategy?
> * Where can I read up on or check out sample RDBMS schemas converted
> to HBase schemas? Basically, I want to read up on efficient schema
> design for relationships of different cardinality between objects.
> Thank you,
> --
> Imran M Yousuf
> Entrepreneur & Software Engineer
> Smart IT Engineering
> Dhaka, Bangladesh
> Email: imran@smartitengineering.com
> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> Mobile: +880-1711402557
