From: Sandy Pratt <prattrs@adobe.com>
To: user@hbase.apache.org
Date: Wed, 5 Jun 2013 11:57:28 -0700
Subject: Re: Poor HBase map-reduce scan performance

That's my understanding of how the current scan API works, yes. The client
calls next() to fetch a batch. While it's waiting for the response from the
server, it blocks. After the server responds to the next() call, it does
nothing for that scanner until the following next() call. That makes for
some significant bubbles in the pipeline, even with larger batch sizes for
next(). Anyone please correct me if I'm wrong.
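As a concrete illustration of the pattern described above, here is a minimal
sketch of the classic synchronous scan loop, assuming the stock 0.94-era
client API (the table name and tuning values are hypothetical):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class ClassicScanLoop {
        public static void main(String[] args) throws Exception {
            HTable table = new HTable(HBaseConfiguration.create(), "mytable");
            Scan scan = new Scan();
            scan.setCaching(5000);      // rows returned per next() RPC
            scan.setCacheBlocks(false); // typical for full-table scans
            ResultScanner scanner = table.getScanner(scan);
            try {
                // Whenever the client-side buffer empties, next() issues a
                // blocking RPC; between those calls the region server does
                // nothing for this scanner. That idle gap is the "bubble"
                // described above.
                for (Result r = scanner.next(); r != null; r = scanner.next()) {
                    // process r
                }
            } finally {
                scanner.close();
                table.close();
            }
        }
    }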
On 6/5/13 11:14 AM, "yonghu" wrote:

>Dear Sandy,
>
>Thanks for your explanation.
>
>However, what I don't get is your term "client": does "client" mean the
>MapReduce job? If I understand you correctly, the map function processes
>the tuples, and during this processing time the region server does
>nothing?
>
>regards!
>
>Yong
>
>
>On Wed, Jun 5, 2013 at 6:12 PM, Ted Yu wrote:
>
>> bq. the Regionserver and Tasktracker are the same node when you use
>> MapReduce to scan the HBase table.
>>
>> The scan performed by the Tasktracker on that node would very likely
>> access data hosted by region servers on other node(s), so there would
>> be RPC involved.
>>
>> There is some discussion on providing shadow reads: writes to a
>> specific region are served solely by one region server, but reads can
>> be served by more than one region server. Of course, consistency is
>> one aspect that must be tackled.
>>
>> Cheers
>>
>> On Wed, Jun 5, 2013 at 7:55 AM, yonghu wrote:
>>
>> > Can anyone explain why the client + RPC + server path decreases scan
>> > performance? I mean, the Regionserver and Tasktracker are on the
>> > same node when you use MapReduce to scan the HBase table. So, in my
>> > understanding, there should be no RPC cost.
>> >
>> > Thanks!
>> >
>> > Yong
>> >
>> >
>> > On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt wrote:
>> >
>> > > https://issues.apache.org/jira/browse/HBASE-8691
>> > >
>> > >
>> > > On 6/4/13 6:11 PM, "Sandy Pratt" wrote:
>> > >
>> > > >Haven't had a chance to write a JIRA yet, but I thought I'd pop
>> > > >in here with an update in the meantime.
>> > > >
>> > > >I tried a number of different approaches to eliminate latency and
>> > > >"bubbles" in the scan pipeline, and eventually arrived at adding
>> > > >a streaming scan API to the region server, along with refactoring
>> > > >the scan interface into an event-driven message receiver
>> > > >interface. In so doing, I was able to take scan speed on my
>> > > >cluster from 59,537 records/sec with the classic scanner to
>> > > >222,703 records/sec with my new scan API. Needless to say, I'm
>> > > >pleased ;)
>> > > >
>> > > >More details forthcoming when I get a chance.
>> > > >
>> > > >Thanks,
>> > > >Sandy
>> > > >
>> > > >On 5/23/13 3:47 PM, "Ted Yu" wrote:
>> > > >
>> > > >>Thanks for the update, Sandy.
>> > > >>
>> > > >>If you can open a JIRA and attach your producer / consumer
>> > > >>scanner there, that would be great.
>> > > >>
>> > > >>On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt wrote:
>> > > >>
>> > > >>> I wrote myself a Scanner wrapper that uses a producer/consumer
>> > > >>> queue to keep the client fed with a full buffer as much as
>> > > >>> possible. When scanning my table with scanner caching at 100
>> > > >>> records, I see about a 24% uplift in performance (~35k
>> > > >>> records/sec with the ClientScanner and ~44k records/sec with
>> > > >>> my P/C scanner). However, when I set scanner caching to 5000,
>> > > >>> it's more of a wash compared to the standard ClientScanner:
>> > > >>> ~53k records/sec with the ClientScanner and ~60k records/sec
>> > > >>> with the P/C scanner.
>> > > >>>
>> > > >>> I'm not sure what to make of those results. I think next I'll
>> > > >>> shut down HBase and read the HFiles directly, to see if
>> > > >>> there's a drop-off in performance between reading them
>> > > >>> directly vs. via the RegionServer.
>> > > >>>
>> > > >>> I still think that to really solve this there needs to be a
>> > > >>> sliding window of records in flight between disk and RS, and
>> > > >>> between RS and client. I'm thinking there's probably a single
>> > > >>> batch of records in flight between RS and client at the
>> > > >>> moment.
>> > > >>>
>> > > >>> Sandy
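The producer/consumer wrapper Sandy describes above was not attached to the
thread; a rough sketch of the idea, with hypothetical names, might look like
this: a background thread drains the underlying scanner into a bounded
queue, so the consumer's per-row work overlaps with the in-flight next()
RPC instead of waiting behind it.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;

    public class PrefetchingScanner {
        // Sentinel marking end-of-scan (Result has a public no-arg
        // constructor in the 0.94-era API).
        private static final Result POISON = new Result();
        private final BlockingQueue<Result> queue =
            new ArrayBlockingQueue<Result>(10000);

        public PrefetchingScanner(final ResultScanner inner) {
            Thread producer = new Thread(new Runnable() {
                public void run() {
                    try {
                        // Keep the buffer full while the consumer works.
                        for (Result r = inner.next(); r != null;
                                r = inner.next()) {
                            queue.put(r); // blocks only when buffer is full
                        }
                    } catch (Exception e) {
                        // real code would surface this to the consumer
                    } finally {
                        try {
                            queue.put(POISON);
                        } catch (InterruptedException ignored) {
                        }
                    }
                }
            });
            producer.setDaemon(true);
            producer.start();
        }

        /** Returns the next row, or null once the scan is exhausted. */
        public Result next() throws InterruptedException {
            Result r = queue.take();
            return r == POISON ? null : r;
        }
    }

Consistent with Sandy's numbers, a wrapper like this helps most when each
next() batch is small relative to per-row processing time; with caching at
5000 the RPC bubble is already well amortized, so the gain shrinks.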
>> > > >>>
>> > > >>> On 5/23/13 8:45 AM, "Bryan Keller" wrote:
>> > > >>>
>> > > >>> >I am considering scanning a snapshot instead of the table. I
>> > > >>> >believe this is what the ExportSnapshot class does. If I
>> > > >>> >could use the scanning code from ExportSnapshot then I would
>> > > >>> >be able to scan the HDFS files directly and bypass the
>> > > >>> >region servers. This could potentially give me a huge boost
>> > > >>> >in performance for full table scans. However, it doesn't
>> > > >>> >really address the poor scan performance against a table.
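For the record, the snapshot-scanning approach Bryan describes later landed
in HBase as TableSnapshotInputFormat (HBASE-8369), which lets a MapReduce
job read a snapshot's HFiles from HDFS without touching the region servers.
A sketch against that later API (the snapshot name, restore path, and
mapper are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class SnapshotScanJob {

        static class RowMapper
                extends TableMapper<ImmutableBytesWritable, Result> {
            @Override
            protected void map(ImmutableBytesWritable key, Result value,
                               Context context) {
                // process the row; no region server is involved
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = new Job(conf, "snapshot-scan");
            job.setJarByClass(SnapshotScanJob.class);

            // Reads the snapshot's HFiles directly from HDFS; the restore
            // directory holds temporary references to those files.
            TableMapReduceUtil.initTableSnapshotMapperJob(
                "mytable-snapshot",                 // hypothetical snapshot
                new Scan(),
                RowMapper.class,
                ImmutableBytesWritable.class,
                Result.class,
                job,
                true,                               // ship HBase jars
                new Path("/tmp/snapshot-restore")); // hypothetical temp dir

            job.setNumReduceTasks(0);
            job.setOutputFormatClass(NullOutputFormat.class);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }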