hbase-user mailing list archives

From "Ryan J. McDonough" <r...@damnhandy.com>
Subject Re: htable.getScanner() slow?
Date Fri, 29 May 2009 02:22:01 GMT
That's actually really good to know. I just had my bubble burst today
when I found that my test results were so bad due to slow scanner reads.
Now I have to pull 0.20 to see how much faster it is ;)

Ryan-

On May 28, 2009, at 10:11 PM, Ryan Rawson wrote:

> The speed gains will be shocking.  Right now you can expect a 5-100x speed
> increase, and soon it will be more like 10-20-200x.
>
> I found with 0.19 there was a 200ms floor in my tests, and 0.20 so far has
> blown past that.  There is HBASE-1304 still in progress which is showing
> much promise.  Please stay tuned!
>
> These are very exciting times for hbase... Soon HBase will have no SPOF apart
> from HDFS, and will be performant as well.
>
> If you are feeling brave, try hadoop 0.20 and hbase-trunk.  Standard
> developer-preview type caveats apply; support is semi-limited since the bug
> you might have is already being rewritten.
>
> Having said that, I use HBase 0.20-trunk in production.  I'm also a
> committer, so YMMV.
>
> Good luck!
> -ryan
>
> On Thu, May 28, 2009 at 7:06 PM, Xinan Wu <wuxinan@gmail.com> wrote:
>
>> Ryan,
>>
>> Thanks for the reply. I tried tweaking scanner caching but it did not
>> change the speed much. The test I ended up doing was just getScanner()
>> and then immediately scanner.close() without issuing scanner.next()...
>>
>> Anyway, it's good to know HBase 0.20 may improve the speed. Is the slow
>> scanner a known issue with hbase < 0.19 too? (I am using 0.19.2/3, but
>> am just curious...)
>>
>> Xinan
>>
>> On Thu, May 28, 2009 at 6:56 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>>> Hi,
>>>
>>> You should consider setting scanner caching to reduce the number of
>>> server round trips.
>>>
>>> But slow scanners are a known problem with 0.19.  HBase 0.20 aims to fix
>>> this substantially.  Shocking speed gains are hopefully going to be par
>>> for the course.
>>>
>>> -ryan
>>>
>>> On Thu, May 28, 2009 at 6:47 PM, Xinan Wu <wuxinan@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been experimenting with row scanning in hbase recently, following
>>>> advice from
>>>> http://devblog.streamy.com/2009/04/23/hbase-row-key-design-for-paging-limit-offset-queries/
>>>>
>>>> One thing I notice is that the htable.getScanner() call is very slow...
>>>>
>>>> My table schema is very simple: an integer (as 4 binary bytes) as the
>>>> rowKey, and a single column family.
>>>>
>>>> If I store 100 records in the same row with different columns, I can
>>>> get all of them with a single API call, at about 350 requests per
>>>> second (but paging would not be very scalable if the number of records
>>>> gets larger).
>>>>
>>>> If I store 100 records in 100 different rows (with a sort key appended
>>>> to the rowKey), then I can use a scanner to get them (and paging would
>>>> be more scalable). However, the getScanner() call takes about 60 ms to
>>>> return, while subsequent scanner.next() calls are very fast. Overall,
>>>> this gives me only 15 requests per second.
>>>>
>>>> My dev box is Ubuntu 8.04, 2.4GHz quad-core, 4GB memory, a pretty typical one.
>>>>
>>>> Does anyone have experience with slow scanner creation? Any suggestions?
>>>>
>>>> Xinan
>>>>
>>>
>>

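The row layout Xinan describes (a 4-byte integer row key with a sort key appended, so that related records land in adjacent rows) can be sketched in plain Java with no HBase dependency. This is only an illustration under assumptions the thread does not state: the sort key is taken to be an 8-byte non-negative long, ids are non-negative, and the names PagingRowKeys, rowKey, startRow and stopRow are hypothetical. It relies on HBase ordering row keys lexicographically as unsigned bytes.

```java
import java.nio.ByteBuffer;

final class PagingRowKeys {
    /** Row key = 4-byte big-endian id followed by an 8-byte big-endian sort key (assumed layout). */
    static byte[] rowKey(int id, long sortKey) {
        return ByteBuffer.allocate(4 + 8).putInt(id).putLong(sortKey).array();
    }

    /** Inclusive start row for scanning every record of one id: just the id prefix. */
    static byte[] startRow(int id) {
        return ByteBuffer.allocate(4).putInt(id).array();
    }

    /** Exclusive stop row: the next id's prefix. Not valid for Integer.MAX_VALUE (overflow). */
    static byte[] stopRow(int id) {
        return ByteBuffer.allocate(4).putInt(id + 1).array();
    }

    /** Unsigned lexicographic byte comparison, the order in which row keys are sorted. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] first = rowKey(7, 1L);
        byte[] second = rowKey(7, 2L);
        // Records of the same id sort by their sort key, so a scan pages through them in order.
        System.out.println("first sorts before second: " + (compare(first, second) < 0));
        // Every key for id 7 falls inside [startRow(7), stopRow(7)).
        System.out.println("inside scan range: "
                + (compare(startRow(7), first) <= 0 && compare(first, stopRow(7)) < 0));
    }
}
```

Big-endian encoding is what makes numeric order and byte order agree for non-negative values; negative ids or sort keys would sort after positive ones under unsigned comparison, so they would need a different encoding.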

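The figures quoted in the thread can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is a simplified model, not a description of HBase internals: it assumes a single client issuing scans one after another, treats scanner caching as "rows returned per fetch", and uses a made-up 1 ms per-fetch cost. Only the 60 ms open cost and the 100-row scan come from the original message; the class and method names are hypothetical.

```java
final class ScanLatencyEstimate {
    /** Sequential requests per second when each request takes the given number of milliseconds. */
    static double requestsPerSecond(double millisPerRequest) {
        return 1000.0 / millisPerRequest;
    }

    /** Fetch round trips needed to read `rows` rows when each fetch returns up to `caching` rows. */
    static long fetchRoundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    /** Modelled scan time: a fixed cost to open the scanner plus a cost per fetch round trip. */
    static double scanMillis(double openMillis, double fetchMillis, long rows, int caching) {
        return openMillis + fetchMillis * fetchRoundTrips(rows, caching);
    }

    public static void main(String[] args) {
        double openMillis = 60.0;  // getScanner() time reported in the thread
        double fetchMillis = 1.0;  // hypothetical per-fetch cost, for illustration only
        long rows = 100;           // rows per scan, as in the thread
        for (int caching : new int[] {1, 10, 100}) {
            double total = scanMillis(openMillis, fetchMillis, rows, caching);
            System.out.printf("caching=%d: %.1f ms per scan, %.1f scans/s%n",
                    caching, total, requestsPerSecond(total));
        }
    }
}
```

Under this model a 60 ms open cost alone caps a sequential client at roughly 1000 / 60, or about 16.7 scans per second, which is in line with the reported 15 requests per second and with the observation that tuning scanner caching made little difference.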