hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Standalone == Dev Only?
Date Fri, 06 Mar 2015 22:21:17 GMT
On Fri, Mar 6, 2015 at 1:50 PM, Rose, Joseph <
Joseph.Rose@childrens.harvard.edu> wrote:

> So, I think Nick, St.Ack and Wilm have all made some excellent points, but
> this last email more or less hit it on the head. Like I said, I'm working
> with patient data and while the volume is small now, it's not going to
> stay that way. And the cell-level security is a *huge* win -- I'm sure you
> folks have some idea how happy that feature makes me. I'd also rather be
> writing coprocessors than triggers or -- heaven forbid -- PL/SQL.
> But there's another, more fundamental thing: we're exploring other DB
> architectures because classical RDBMS systems haven't always worked out so
> well. In fact, we're having a bit of a hard time with the current project
> because we've been constrained (thus far) to a relational system and it
> doesn't seem to be a clean fit. A key/val store, on the other hand, will
> have enough flexibility to get the job done, I think. It's all being
> prototyped now, so we'll see.
Ok. Sounds like you know the +/-s. Was just checking.

> I think the final issue with hadoop-common (re: unimplemented sync for
> local filesystems) is the one showstopper for us. We have to have assured
> durability. I'm willing to devote some cycles to get it done, so maybe I'm
> the one that says this problem is worthwhile.
I remember that was once the case, but looking in the codebase now, sync calls
through to ProtobufLogWriter, which does a 'flush' on the output (though the
comment says this is a noop). The output stream is an instance of
FSDataOutputStream made with a RawLOS. The flush should come out here:

    public void flush() throws IOException { fos.flush(); }

... where fos is an instance of FileOutputStream.

In sync we go on to call hflush, which looks like it calls flush again.

What hadoop/hbase versions are we talking about? HADOOP-8861 added the above
behavior for hadoop 1.2.

Try it I'd say.
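
A minimal sketch of one way to try it outside of HBase, using only the plain
JDK. It is not the hadoop or hbase code path; the class name LocalSyncSketch
and the helper writeAndSync are made up for illustration. It shows the
difference that matters for durability on a local filesystem: flush() on a
java.io.FileOutputStream is inherited from OutputStream and does nothing,
while FileDescriptor.sync() asks the OS to push its buffers out to the device.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of hadoop-common or hbase.
public class LocalSyncSketch {

    // Writes data to a local file and forces it to the device before returning.
    // Returns the resulting file size in bytes.
    static long writeAndSync(Path file, byte[] data) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file.toFile())) {
            fos.write(data);
            // Inherited from OutputStream: a no-op for FileOutputStream.
            fos.flush();
            // Asks the OS to write its buffers for this file to the device.
            fos.getFD().sync();
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("wal-sketch", ".log");
        byte[] edit = "row1/cf:q/value".getBytes(StandardCharsets.UTF_8);
        long size = writeAndSync(tmp, edit);
        System.out.println("synced " + size + " bytes to " + tmp);
        Files.delete(tmp);
    }
}
```

If the flush chain described above ends in a bare fos.flush(), the sync()
call in this sketch is the piece to look for in whatever hadoop version you
are running.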


> Thanks for chiming in. I'd love to hear more.
> -j
> On 3/6/15, 3:02 PM, "Wilm Schumacher" <wilm.schumacher@gmail.com> wrote:
> >Hi,
> >
> >Am 06.03.2015 um 19:18 schrieb Stack:
> >> Why not use an RDBMS then?
> >
> >When I first read the hbase documentation I also stumbled over the
> >"only use for large datasets" or "standalone only in dev mode" etc. From
> >my point of view there are some arguments against RDBMSs and for e.g.
> >hbase, even when we are talking about a single node application.
> >
> >* scalability is a future investment. Even if the dataset is small now,
> >that doesn't mean it will stay small in the future. Scalability in size and
> >computing power is always a good idea.
> >
> >* query language: for a user hbase is more of a database library than a
> >"DBMS". For me this is a big plus, as it forces the user to do it the
> >right way. Just think of SQL-injection. Or CQL-injection for that
> >matter. Query languages are like scripting languages. Makes easy stuff
> >easier and hard stuff harder.
> >
> >* fancy features: hbase has fancy features RDBMSs don't have. E.g.
> >coprocessors. I know that e.g. mysql has "triggers", but they are not
> >nearly as powerful as coprocessors. And don't forget that you have to
> >write most of the triggers in this *curse word* SQ-language if you don't
> >want to use evil hacks.
> >
> >* schema-less: another HUGE plus is the possibility to use it without a
> >fixed schema. In SQL you would need several tables and do a lot of
> >joins. And the output is way harder to get and to parse.
> >
> >* ecosystem: when you use hbase you automatically get the whole hadoop,
> >or better apache foundation, ecosystem right away. Not only hdfs, but
> >mapred, lucene, spark, kafka etc. etc..
> >
> >There are only two real arguments against hbase in that scenario:
> >
> >* joins etc.: well, in sql that's a question of minutes. In hbase that
> >takes a little more effort. BUT: then it's done the right way ;).
> >
> >* RDBMSs are more widely known: well ... that's not the fault of hbase ;).
> >
> >Thus, I think that the hbase community should be more self-reliant for
> >that matter, even and especially for applications in the SQL realm ;).
> >Which is a good opportunity to say congratulations for the hbase 1.0
> >milestone. And thank you for that.
> >
> >Best wishes
> >
> >Wilm
> >
