hbase-user mailing list archives

From "Michael Dagaev" <michael.dag...@gmail.com>
Subject Re: HTable Pooling Results
Date Tue, 16 Dec 2008 21:19:18 GMT
    When I started testing it was pretty obvious that the synchronized
commit of HTable actually serializes access to the table, so pooling
was a trivial way to work around the problem. It is good enough for now,
but I hope that the RPC limitation will be fixed.

I believe this is not a big deal from the technical perspective, but I
can understand why it is not trivial. It should be fixed in Hadoop.
Hadoop probably does not want to fix it. Therefore, HBase should
consider an alternative to Hadoop RPC, and that is a big deal.

M.

On Tue, Dec 16, 2008 at 10:55 PM, Slava Gorelik <slava.gorelik@gmail.com> wrote:
> BTW, I recall that when we first started to use HBase we tried performance
> testing with multiple threads (a large number of small transactions) and
> there was no improvement; that is the reason why we started to parallelize
> across multiple processes.
> Best Regards.
>
> On Tue, Dec 16, 2008 at 9:34 PM, stack <stack@duboce.net> wrote:
>
>> Your inserts are 'fat' though, aren't they Michael?  Laden with lots of
>> columns?
>> St.Ack
>>
>>
>> Michael Dagaev wrote:
>>
>>> Can you also post your results to the list? We are using Apache
>>> Commons Pool and got a maximum throughput of ~10 inserts per sec. for one
>>> HBase client, which is good enough for now.
>>>
>>> On Tue, Dec 16, 2008 at 9:20 PM, Slava Gorelik <slava.gorelik@gmail.com>
>>> wrote:
>>>
>>>
>>>> Hi. Looks very interesting, I'll try it.
>>>>
>>>> Thank You.
>>>>
>>>>
>>>> On Tue, Dec 16, 2008 at 8:51 PM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>> Hi, all
>>>>>
>>>>>   Looks like the pooling does improve the throughput. I guess the
>>>>> pool size should depend on the number of region servers, i.e. max pool
>>>>> size = k*N, where N is the number of region servers and k > 1.
>>>>> Currently, I am using max size = 20 for a 4-host cluster and the
>>>>> maximum throughput is achieved at ~20 concurrent threads. When we
>>>>> tried to add more threads the throughput did not increase, so the only
>>>>> solution is adding JVMs on the client side.
>>>>>
>>>>> Thank you for your cooperation,
>>>>> M.
>>>>>
>>>>>
>>>>>
>>>>
>>
>
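
For readers following the thread, the pooling idea and the k*N sizing rule discussed above could be sketched roughly as follows. This is a minimal, JDK-only illustration and not the code used in the thread, which relied on Apache Commons Pool. The class HandlePool, its methods, and the integer k are hypothetical names introduced here; the type parameter T stands in for a client handle such as HTable.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical JDK-only pool; T stands in for a client handle such as HTable.
final class HandlePool<T> {
    private final BlockingQueue<T> idle;

    HandlePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pre-create every handle up front
        }
    }

    // Sizing rule from the thread: max pool size = k * N, with k > 1.
    // An integer k is an assumption made here for simplicity.
    static int maxPoolSize(int regionServers, int k) {
        if (regionServers < 1 || k <= 1) {
            throw new IllegalArgumentException("need regionServers >= 1 and k > 1");
        }
        return k * regionServers;
    }

    // Blocks until a handle is free, so at most `size` callers hold one at a time.
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a handle", e);
        }
    }

    // Returns a handle to the pool; each borrowed handle must be given back exactly once.
    void giveBack(T handle) {
        idle.add(handle);
    }

    int idleCount() {
        return idle.size();
    }
}
```

With the numbers from the thread (4 region servers, max size 20), maxPoolSize(4, 5) yields 20. Each worker thread would borrow a handle, perform its commit, and give the handle back in a finally block, so that no two threads contend on the same handle's synchronized commit.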
