hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: EC2 + Thrift inserts
Date Sat, 01 May 2010 00:01:28 GMT
Not sure why you are going through thrift if you are already using
java (you want to test thrift's speed because java isn't your main dev
language?) but it will maybe add 1ms or 2, really not that bad. Here
at StumbleUpon we use thrift to get our php website to talk to HBase
and on average we stay under 10ms for random gets. Our machines are
2xi7, 24GB, 4x1TB sata.

My coworker (Stack) pinged the author of the contrib to see if he can
make a patch for your issue.

J-D
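
As a rough illustration of the serialization point in the quoted reply below (a toy model, not anything measured on a real cluster), here is a short Python sketch. The function name and the 0.2 ms per-row cost are made up for the example; the cost is picked only so that the serialized ceiling lands near the 5k rows/sec mentioned in this thread.

```python
import math


def max_rows_per_sec(per_row_seconds, clients, serialized):
    """Upper bound on insert throughput in a toy model.

    per_row_seconds: hypothetical time to process one row insertion.
    clients: number of concurrent inserting clients.
    serialized: True if all inserts pass through one serialized section,
                so only one row is processed at a time.
    """
    if per_row_seconds <= 0 or clients < 1:
        raise ValueError("need per_row_seconds > 0 and clients >= 1")
    # With a single serialized section, extra clients only wait their turn.
    concurrent = 1 if serialized else clients
    return concurrent / per_row_seconds


# Assumed per-row cost (illustrative only, not a measured HBase number).
PER_ROW_SECONDS = 0.0002

for clients in (1, 10, 100):
    capped = max_rows_per_sec(PER_ROW_SECONDS, clients, serialized=True)
    ideal = max_rows_per_sec(PER_ROW_SECONDS, clients, serialized=False)
    print(f"{clients:>3} clients: serialized={capped:.0f}/s, parallel={ideal:.0f}/s")
```

In this model the serialized figure stays flat as clients are added, which would also leave CPU and IO looking idle; only the per-row cost moves the ceiling.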

On Fri, Apr 30, 2010 at 4:51 PM, Chris Tarnas <cft@email.com> wrote:
>
> On Apr 30, 2010, at 4:44 PM, Jean-Daniel Cryans wrote:
>
>> On Fri, Apr 30, 2010 at 4:32 PM, Chris Tarnas <cft@email.com> wrote:
>>>
>>>
>>> I'm also using thrift to connect and am wondering if that itself puts an
>>> overall limit on scaling? It does seem that no matter how many more mappers
>>> and servers I add, even without indexing, I am capped at about 5k rows/sec
>>> total. I'm waiting a bit as the table grows so that it is split across more
>>> regionservers, hopefully that will help, but as far as I can tell I am not
>>> hitting any CPU or IO constraint during my tests.
>>
>> I don't understand the "I'm also using thrift" and "how many more
>> mappers" part, you are using Thrift inside a map? Anyways, more
>> clients won't help since there's a single mega serialization of all
>> the inserts to the index table per region server. It's normal not to
>> see any CPU/mem/IO contention since, in this case, it's all about the
>> speed at which you can process a single row insertion. The rest of the
>> threads just wait...
>>
>
> Sorry - should have been more clear. I'm testing now with normal tables and
> regionservers and I seem to cap out at about 5-7k rows a second for inserts.
> My method for doing inserts is to use map reduce on hadoop to launch many
> insert processes; each process uses the local thrift server on each node to
> connect to hbase. In this case I hope that other threads can insert at the
> same time.
>
> -chris
>
>
>
