hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Is the thrift server a likely bottleneck?
Date Thu, 03 Sep 2009 11:23:32 GMT

A BatchMutation covers a single row and multiple columns of that row,
so with a BatchMutation in the HBase Thrift API you cannot batch
inserts across many rows. In the Java API the equivalent of
BatchMutation is Put (which was previously named BatchUpdate, but
people got confused by that name, just like now).
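
To make the row scoping concrete, here is a minimal Python sketch of
the shape described above. It does not use the Thrift-generated
classes: Mutation, RowBatch and group_by_row are hypothetical names
introduced only for illustration, and the column names in the usage
lines are made up. It only shows that writes to several rows have to
be split into one batch per row, since each batch carries exactly one
row key.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Mutation:
    """Hypothetical stand-in for a single column write."""
    column: str  # "family:qualifier"
    value: bytes


@dataclass
class RowBatch:
    """Hypothetical stand-in for a BatchMutation: one row, many columns."""
    row: str
    mutations: List[Mutation] = field(default_factory=list)


def group_by_row(cells: Iterable[Tuple[str, str, bytes]]) -> List[RowBatch]:
    """Split (row, column, value) cells into one batch per row.

    Batches come back in the order each row key was first seen.
    """
    batches: Dict[str, RowBatch] = {}
    for row, column, value in cells:
        if row not in batches:
            batches[row] = RowBatch(row)
        batches[row].mutations.append(Mutation(column, value))
    return list(batches.values())


if __name__ == "__main__":
    cells = [
        ("row1", "info:name", b"alice"),
        ("row1", "info:city", b"paris"),
        ("row2", "info:name", b"bob"),
    ]
    for batch in group_by_row(cells):
        print(batch.row, [m.column for m in batch.mutations])
```

With three cells spread over two rows, the sketch yields two batches,
so a client limited to one batch per call would still make one call
per row.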


On Thu, Sep 3, 2009 at 4:25 AM, Sylvain Hellegouarch <sh@defuze.org> wrote:
>> Thrift spawns as many threads as requests, so running more than one
>> shouldn't benefit you much I think?
> Being a little unaware of Java's cleverness with threads I cannot really
> say but you're probably right.
>> I run 1 thriftserver per regionserver, coexisting, and then use
>> TSocketPool on the client side to spread load around.
>> But generally, YES, the thrift server could be a bottleneck.  The main
>> problem with thrift and performance is you cannot control the scanner
>> caching directly, and you cannot do bulk commits.  Both of those would
>> require some API changes, and while not impossible, they just haven't
>> been prioritized.
> I'm a little confused, then, about the difference between the bulk
> commit you mention and the batch mutation support in the Thrift
> interface. Moreover, the HBase 0.20 API is a bit unclear about when the
> commit happens when using Put. In fact, I'm unclear on the best
> practice for writing lots of rows as efficiently as possible. One by
> one? Batch Mutations?
>> Personally, we use thrift for php scripts, and use the Java API for
>> map-reduces and bulk data operations. Thus achieving the best of both
>> worlds: cross language access from PHP and the faster Java-based API
>> for certain scenarios.
> We will probably be using Pig Latin for the M/R, with a Java adapter to
> fetch rows from HBase. However, we do use Python for writing, and I'm
> willing to use Jython, but that would probably create other dependency
> issues that I'd be happy to avoid if Thrift is good enough :)
> Thanks,
> - Sylvain
> --
> Sylvain Hellegouarch
> http://www.defuze.org
