hbase-user mailing list archives

From Bryan Keller <brya...@gmail.com>
Subject Re: Poor HBase map-reduce scan performance
Date Mon, 01 Jul 2013 04:23:52 GMT
I'll attach my patch to HBASE-8369 tomorrow.

On Jun 28, 2013, at 10:56 AM, lars hofhansl <larsh@apache.org> wrote:

> If we can make a clean patch with minimal impact to existing code I would be supportive
> of a backport to 0.94.
> 
> -- Lars
> 
> 
> 
> ----- Original Message -----
> From: Bryan Keller <bryanck@gmail.com>
> To: user@hbase.apache.org; lars hofhansl <larsh@apache.org>
> Cc: 
> Sent: Tuesday, June 25, 2013 1:56 AM
> Subject: Re: Poor HBase map-reduce scan performance
> 
> I tweaked Enis's snapshot input format and backported it to 0.94.6 and have snapshot
> scanning functional on my system. Performance is dramatically better, as expected I suppose.
> I'm seeing about 3.6x faster performance vs TableInputFormat. Also, HBase doesn't get bogged
> down during a scan as the regionserver is being bypassed. I'm very excited by this. There
> are some issues with file permissions and library dependencies but nothing that can't be worked
> out.
> 
> On Jun 5, 2013, at 6:03 PM, lars hofhansl <larsh@apache.org> wrote:
> 
>> That's exactly the kind of pre-fetching I was investigating a bit ago (made a patch,
>> but ran out of time).
>> This pre-fetching is strictly client only, where the client keeps the server busy
>> while it is processing the previous batch, by filling up a 2nd buffer.
>> 
>> 
>> -- Lars
>> 
>> 
>> 
>> ________________________________
>> From: Sandy Pratt <prattrs@adobe.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org> 
>> Sent: Wednesday, June 5, 2013 10:58 AM
>> Subject: Re: Poor HBase map-reduce scan performance
>> 
>> 
>> Yong,
>> 
>> As a thought experiment, imagine how it impacts the throughput of TCP to
>> keep the window size at 1.  That means there's only one packet in flight
>> at a time, and total throughput is a fraction of what it could be.
>> 
>> That's effectively what happens with RPC.  The server sends a batch, then
>> does nothing while it waits for the client to ask for more.  During that
>> time, the pipe between them is empty.  Increasing the batch size can help
>> a bit, in essence creating a really huge packet, but the problem remains.
>> There will always be stalls in the pipe.
>> 
>> What you want is for the window size to be large enough that the pipe is
>> saturated.  A streaming API accomplishes that by stuffing data down the
>> network pipe as quickly as possible.
>> 
>> Sandy
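
To put rough numbers on the window analogy above, here is a minimal back-of-the-envelope model in Python. It is only a sketch: the function names and every parameter value (server rate, round-trip time, batch sizes) are hypothetical and were not measured on any cluster in this thread.

```python
def stop_and_wait_throughput(batch_size, rtt_s, server_rate):
    """Records/sec when only one batch is in flight at a time.

    Each cycle costs the time to produce and send one batch plus one
    round trip during which the server sits idle waiting for the next request.
    """
    cycle_s = batch_size / server_rate + rtt_s
    return batch_size / cycle_s


def windowed_throughput(batch_size, rtt_s, server_rate, window):
    """Records/sec with `window` batches allowed in flight.

    The result is capped at the rate the server can actually produce.
    """
    single = stop_and_wait_throughput(batch_size, rtt_s, server_rate)
    return min(server_rate, window * single)


# Hypothetical inputs: 200,000 records/sec server rate, 2 ms round trip.
small_batch = stop_and_wait_throughput(100, 0.002, 200_000)
large_batch = stop_and_wait_throughput(5000, 0.002, 200_000)
saturated = windowed_throughput(100, 0.002, 200_000, window=10)
```

With these made-up inputs, a batch of 100 gives about 40,000 records/sec in this model. A larger batch raises that figure but never reaches the server rate, while a deep enough window keeps the pipe full.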
>> 
>> On 6/5/13 7:55 AM, "yonghu" <yongyong313@gmail.com> wrote:
>> 
>>> Can anyone explain why client + rpc + server will decrease the performance
>>> of scanning? I mean the Regionserver and Tasktracker are on the same node when
>>> you use MapReduce to scan the HBase table. So, in my understanding, there
>>> will be no rpc cost.
>>> 
>>> Thanks!
>>> 
>>> Yong
>>> 
>>> 
>>> On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt <prattrs@adobe.com> wrote:
>>> 
>>>> https://issues.apache.org/jira/browse/HBASE-8691
>>>> 
>>>> 
>>>> On 6/4/13 6:11 PM, "Sandy Pratt" <prattrs@adobe.com> wrote:
>>>> 
>>>>> Haven't had a chance to write a JIRA yet, but I thought I'd pop in here
>>>>> with an update in the meantime.
>>>>> 
>>>>> I tried a number of different approaches to eliminate latency and
>>>>> "bubbles" in the scan pipeline, and eventually arrived at adding a
>>>>> streaming scan API to the region server, along with refactoring the scan
>>>>> interface into an event-driven message receiver interface.  In so doing, I
>>>>> was able to take scan speed on my cluster from 59,537 records/sec with the
>>>>> classic scanner to 222,703 records/sec with my new scan API.
>>>>> Needless to say, I'm pleased ;)
>>>>> 
>>>>> More details forthcoming when I get a chance.
>>>>> 
>>>>> Thanks,
>>>>> Sandy
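
The figures above work out to roughly a 3.7x speedup (222,703 / 59,537). As a rough illustration of what an event-driven receiver interface could look like, here is a minimal Python sketch. The names `CountingReceiver`, `on_row`, `on_complete`, and `stream_scan` are hypothetical and are not taken from the HBASE-8691 patch; the point is only that rows are pushed to the receiver as they arrive instead of being pulled one batch per request.

```python
class CountingReceiver:
    """Hypothetical receiver: the scan pushes rows in rather than being polled."""

    def __init__(self):
        self.rows = 0
        self.finished = False

    def on_row(self, row):
        # Called once per row, as soon as the row is available.
        self.rows += 1

    def on_complete(self):
        # Called exactly once, after the last row has been delivered.
        self.finished = True


def stream_scan(source, receiver):
    """Push every row from `source` to `receiver` without waiting to be asked."""
    for row in source:
        receiver.on_row(row)
    receiver.on_complete()


# Stand-in for a region's rows; a real server would stream these over the network.
receiver = CountingReceiver()
stream_scan(range(1000), receiver)
```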
>>>>> 
>>>>> On 5/23/13 3:47 PM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>>>> 
>>>>>> Thanks for the update, Sandy.
>>>>>> 
>>>>>> If you can open a JIRA and attach your producer / consumer scanner there,
>>>>>> that would be great.
>>>>>> 
>>>>>> On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt <prattrs@adobe.com> wrote:
>>>>>> 
>>>>>>> I wrote myself a Scanner wrapper that uses a producer/consumer queue to
>>>>>>> keep the client fed with a full buffer as much as possible.  When scanning
>>>>>>> my table with scanner caching at 100 records, I see about a 24% uplift in
>>>>>>> performance (~35k records/sec with the ClientScanner and ~44k records/sec
>>>>>>> with my P/C scanner).  However, when I set scanner caching to 5000, it's
>>>>>>> more of a wash compared to the standard ClientScanner: ~53k records/sec
>>>>>>> with the ClientScanner and ~60k records/sec with the P/C scanner.
>>>>>>> 
>>>>>>> I'm not sure what to make of those results.  I think next I'll shut down
>>>>>>> HBase and read the HFiles directly, to see if there's a drop-off in
>>>>>>> performance between reading them directly vs. via the RegionServer.
>>>>>>> 
>>>>>>> I still think that to really solve this there needs to be a sliding window
>>>>>>> of records in flight between disk and RS, and between RS and client.  I'm
>>>>>>> thinking there's probably a single batch of records in flight between RS
>>>>>>> and client at the moment.
>>>>>>> 
>>>>>>> Sandy
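
The producer/consumer wrapper described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the actual wrapper from this thread and not the HBase Java client: `prefetching` and `fake_scanner` are hypothetical names, and `fake_scanner` merely stands in for a scanner that returns one batch of `caching` rows per call.

```python
import queue
import threading


def prefetching(batches, depth=2):
    """Yield batches from `batches`, fetching up to `depth` ahead on a background thread."""
    buf = queue.Queue(maxsize=depth)  # bounded, so the producer cannot run away

    def producer():
        try:
            for batch in batches:
                buf.put(("batch", batch))  # blocks while the buffer is full
            buf.put(("done", None))
        except Exception as exc:
            buf.put(("error", exc))  # hand failures over to the consumer

    # Daemon thread: if the consumer stops early, the blocked producer
    # does not keep the process alive.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


def fake_scanner(total_rows, caching):
    """Hypothetical stand-in for a scanner: yields lists of up to `caching` rows."""
    for start in range(0, total_rows, caching):
        yield list(range(start, min(start + caching, total_rows)))


# The consumer processes one batch while the producer fetches the next ones.
rows = [row for batch in prefetching(fake_scanner(250, caching=100)) for row in batch]
```

The bounded queue is the design choice that matters here: `depth` plays the role of the window size, so the fetch of the next batch overlaps with processing of the current one without buffering the whole table in memory.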
>>>>>>> 
>>>>>>> On 5/23/13 8:45 AM, "Bryan Keller" <bryanck@gmail.com> wrote:
>>>>>>> 
>>>>>>>> I am considering scanning a snapshot instead of the table.  I believe this
>>>>>>>> is what the ExportSnapshot class does. If I could use the scanning code
>>>>>>>> from ExportSnapshot then I will be able to scan the HDFS files directly
>>>>>>>> and bypass the regionservers. This could potentially give me a huge boost
>>>>>>>> in performance for full table scans. However, it doesn't really address
>>>>>>>> the poor scan performance against a table.
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> 

