hbase-user mailing list archives

From Ian Varley <ivar...@salesforce.com>
Subject Re: HBase Stack
Date Mon, 14 Nov 2011 14:40:06 GMT

To add to what Joey said, consider that there are very significant trade-offs you make when
building something on HBase (or any of the new generation of non-relational databases). For
starters, you don't get:

 - A declarative query language like SQL that can build optimal physical access plans from
arbitrarily complex logical queries
 - Secondary indexing (so if you want to look things up by something other than the primary
key, you can't do it without a full table scan)
 - Multi-row or multi-object suspended transactions (so you can't just "roll back" a set of
changes like you can in a relational database, nor can you keep operations isolated from other
concurrent readers until they commit)
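
To make the secondary-indexing point concrete, here is a minimal sketch in Python. It is a toy in-memory model of a table addressed by row key, not the HBase client API, and every name in it (users, email_index, put_user, and so on) is hypothetical. It contrasts a lookup by a non-key column done as a full scan with one done through an index table that the application maintains by hand.

```python
# Toy in-memory model of a table addressed by row key; NOT the HBase API.
# All names below are hypothetical and exist only for illustration.

users = {}        # primary table: row key -> row (dict of columns)
email_index = {}  # hand-maintained "index table": email -> row key


def put_user(row_key, name, email):
    """Write the row and its index entry; the application must do both."""
    users[row_key] = {"name": name, "email": email}
    email_index[email] = row_key


def find_by_email_scan(email):
    """No index: inspect every row until one matches (a full table scan)."""
    for row_key, row in users.items():
        if row["email"] == email:
            return row_key
    return None


def find_by_email_index(email):
    """Hand-built index: a single lookup by key returns the row key."""
    return email_index.get(email)


put_user("u001", "Ann", "ann@example.com")
put_user("u002", "Bob", "bob@example.com")
print(find_by_email_scan("bob@example.com"))
print(find_by_email_index("bob@example.com"))
```

Note that put_user performs two separate writes. Without multi-row transactions, nothing makes that pair atomic, so a failure between them could leave the index and the table out of step; handling that would be the application's job.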

Scalable data storage systems like HBase may eventually make up for these deficiencies, but
that hasn't happened yet. Today, using HBase is only appropriate if you have a really large
amount of data and you can predict and design for pretty much all of your access patterns
up front.
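
As a rough illustration of designing for access patterns up front, the sketch below models rows kept sorted by row key and builds the key from the fields the main query needs. It is plain Python, not HBase code; the key layout "<user>#<zero-padded timestamp>" and the helper names make_key, put, and scan_user are assumptions made for this example.

```python
import bisect

# Toy model: rows kept sorted by row key, as in a key-ordered store.
# The key layout "<user>#<zero-padded timestamp>" is an assumption.

rows = []  # sorted list of (row_key, value) tuples


def make_key(user, ts):
    """Compose a row key that sorts by user first, then by timestamp."""
    return f"{user}#{ts:010d}"


def put(user, ts, value):
    """Insert a row while keeping the list sorted by row key."""
    bisect.insort(rows, (make_key(user, ts), value))


def scan_user(user, start_ts, end_ts):
    """Range scan: values for one user with start_ts <= ts < end_ts."""
    lo = bisect.bisect_left(rows, (make_key(user, start_ts),))
    hi = bisect.bisect_left(rows, (make_key(user, end_ts),))
    return [value for _, value in rows[lo:hi]]


put("ann", 100, "login")
put("bob", 150, "login")
put("ann", 200, "purchase")
put("ann", 300, "logout")
print(scan_user("ann", 100, 300))
```

With this layout, "events for one user in a time window" is a contiguous slice of the sorted keys. A question the key was not designed for, such as "all events at a given time across all users", would have to visit every row, which is the cost of not having anticipated that pattern.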


On Nov 14, 2011, at 8:14 AM, Joey Echeverria wrote:

> I don't think I would try to use a single-node HBase cluster to
> replace a MySQL database. HBase has a sweet spot, both in terms of
> scale and data access patterns. In general, it should not be viewed as
> a drop-in replacement for MySQL. My questions to you would be:
> 1) How much data do you need to store?
> 2) What are your access patterns? Lots of joins, individual row
> lookups, range scans, etc.
> -Joey
> On Mon, Nov 14, 2011 at 1:54 AM, Em <mailformailinglists@yahoo.de> wrote:
>> Hello list,
>> I was asked whether it is a good idea to replace the M in LAMP with
>> HBase as well as the P with Java servlets (i.e. Tomcat) so that you run
>> your webserver, your HBase instance, Hadoop etc. on the same machine.
>> Are the performance differences compared to a LAMP stack large?
>> It is clear that a lot of benefits like redundancy etc. are not
>> available in this setup. However, if the idea and userbase grow you can
>> quickly add these features to the environment by just setting up new
>> machines and connecting them with each other.
>> When I was asked about this I had no answer.
>> Hopefully you can shed some light on this!
>> Kind regards,
>> Em
> -- 
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
