hbase-user mailing list archives

From "Rose, Joseph" <Joseph.R...@childrens.harvard.edu>
Subject Re: Standalone == Dev Only?
Date Fri, 06 Mar 2015 21:50:51 GMT
So, I think Nick, St.Ack and Wilm have all made some excellent points, but
this last email more or less hit it on the head. Like I said, I'm working
with patient data and while the volume is small now, it's not going to
stay that way. And the cell-level security is a *huge* win; I'm sure you
folks have some idea how happy that feature makes me. I'd also rather be
writing coprocessors than triggers or, heaven forbid, PL/SQL.

But there's another, more fundamental thing: we're exploring other DB
architectures because classical RDBMS systems haven't always worked out so
well. In fact, we're having a bit of a hard time with the current project
because we've been constrained (thus far) to a relational system and it
doesn't seem to be a clean fit. A key/val store, on the other hand, will
have enough flexibility to get the job done, I think. It's all being
prototyped now, so we'll see.

I think the final issue with hadoop-common (re: unimplemented sync for
local filesystems) is the one showstopper for us. We have to have assured
durability. I'm willing to devote some cycles to get it done, so maybe I'm
the one that says this problem is worthwhile.
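
To make "assured durability" concrete, here is a minimal sketch of what I
mean on a local filesystem. It uses only plain JDK NIO, not any
hadoop-common API, and the class and method names (DurableAppend,
appendDurably) are made up for illustration. The point is just that a write
only counts as durable once the OS has been asked to flush it to the device:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: not HBase or hadoop-common code.
public class DurableAppend {

    // Appends one record to a local log file and asks the OS to flush
    // both data and metadata to the storage device before returning.
    static void appendDurably(Path log, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(
                    (record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Without this call the bytes may still sit in the OS page
            // cache and be lost on a crash or power failure.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("wal-sketch", ".log");
        appendDurably(log, "row1:cf:q=value");
        System.out.println(Files.readAllLines(log));
    }
}
```

Whether force() really reaches stable storage still depends on the OS and
the hardware, so take this as the shape of the guarantee I'm after rather
than a fix.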

Thanks for chiming in. I'd love to hear more.


On 3/6/15, 3:02 PM, "Wilm Schumacher" <wilm.schumacher@gmail.com> wrote:

>On 06.03.2015 at 19:18, Stack wrote:
>> Why not use an RDBMS then?
>When I first read the hbase documentation I also stumbled over the
>"only use for large datasets" or "standalone only in dev mode" etc. From
>my point of view there are some arguments against RDBMSs and for e.g.
>hbase, even when we are talking about a single-node application.
>* scalability is a future investment. Even if the dataset is small now,
>that doesn't mean it will stay small in the future. Scalability in size and
>computing power is always a good idea.
>* query language: for a user hbase is more of a database library than a
>"DBMS". For me this is a big plus, as it forces the user to do it the
>right way. Just think of SQL-injection. Or CQL-injection for that
>matter. Query languages are like scripting languages: they make easy stuff
>easier and hard stuff harder.
>* fancy features: hbase has fancy features RDBMSs don't have, e.g.
>coprocessors. I know that e.g. mysql has "triggers", but they are not
>nearly as powerful as coprocessors. And don't forget that you have to
>write most of the triggers in this *curse word* SQL language if you don't
>want to use evil hacks.
>* schema-less: another HUGE plus is the possibility to use it without a
>fixed schema. In SQL you would need several tables and do a lot of
>joins. And the output is way harder to get and to parse.
>* ecosystem: when you use hbase you automatically get the whole hadoop,
>or better apache foundation, ecosystem right away. Not only hdfs, but
>mapred, lucene, spark, kafka etc.
>There are only two real arguments against hbase in that scenario:
>* joins etc.: well, in sql that's a matter of minutes. In hbase that
>takes a little more effort. BUT: then it's done the right way ;).
>* RDBMSs are more widely known: well ... that's not the fault of hbase ;).
>Thus, I think that the hbase community should be more self-confident in
>that regard, even and especially for applications in the SQL realm ;).
>Which is a good opportunity to say congratulations on the hbase 1.0
>milestone. And thank you for that.
>Best wishes
