Date: Wed, 20 Mar 2013 10:56:04 -0700
Subject: Re: Scanner timeout -- any reason not to raise?
From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org

Bryan:
Interesting idea. You can log a JIRA with the following two suggestions.
On Wed, Mar 20, 2013 at 10:39 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:

> I was thinking something like this:
>
> Scan scan = new Scan(startRow, endRow);
> scan.setCaching(someVal); // based on what we expect most rows to take for
>                           // processing time
>
> ResultScanner scanner = table.getScanner(scan);
> for (Result r : scanner) {
>   // usual processing, the time for which we accounted for in our caching
>   // and global lease timeout settings
>   if (someCondition) {
>     // More time-intensive processing necessary on this record, which is
>     // hard to account for in the caching
>     scanner.progress();
>   }
> }
>
> --
>
> I'm not sure how we could expose this in the context of a hadoop job, since
> I don't believe we have access to the underlying scanner, but that would be
> great also.
>
> On Wed, Mar 20, 2013 at 1:11 PM, Ted Yu wrote:
>
> > bq. if HBase provided a way to manually refresh a lease similar to
> > Hadoop's context.progress()
> >
> > Can you outline how the above works for a long scan?
> >
> > bq. Even being able to override the timeout on a per-scan basis would be
> > nice.
> >
> > Agreed.
> >
> > On Wed, Mar 20, 2013 at 10:05 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
> >
> > > Typically it is better to use caching and batch size to limit the
> > > number of rows returned, and thus the amount of processing required
> > > between calls to next() during a scan, but it would be nice if HBase
> > > provided a way to manually refresh a lease similar to Hadoop's
> > > context.progress(). In a cluster that is used for many different
> > > applications, upping the global lease timeout is a heavy-handed
> > > solution. Even being able to override the timeout on a per-scan basis
> > > would be nice.
> > >
> > > Thoughts on that, Ted?
> > >
> > > On Wed, Mar 20, 2013 at 1:00 PM, Ted Yu wrote:
> > >
> > > > In 0.94, there is only one setting.
> > > > See the release notes of HBASE-6170, which is in 0.95.
> > > >
> > > > Looks like this should help (in 0.95):
> > > > https://issues.apache.org/jira/browse/HBASE-2214
> > > > Do HBASE-1996 -- setting size to return in scan rather than count of
> > > > rows -- properly
> > > >
> > > > From your description, you should be able to raise the timeout since
> > > > the writes are relatively fast.
> > > >
> > > > Cheers
> > > >
> > > > On Wed, Mar 20, 2013 at 9:32 AM, Dan Crosta wrote:
> > > >
> > > > > I'm confused -- I only see one setting in CDH manager. What is the
> > > > > name of the other setting?
> > > > >
> > > > > Our load is moderately frequent small writes (in batches of 1000
> > > > > cells at a time, typically split over a few hundred rows -- these
> > > > > complete very fast, and we haven't seen any timeouts there), and
> > > > > infrequent batches of large reads (scans), which is where we do see
> > > > > timeouts. My guess is that the timeout is due more to our
> > > > > application taking some time -- apparently more than 60s -- to
> > > > > process the results of each scan's output than to slowness in HBase
> > > > > itself, which tends to be only moderately loaded (judging by CPU,
> > > > > network, and disk) while we do the reads.
> > > > >
> > > > > Thanks,
> > > > > - Dan
> > > > >
> > > > > On Mar 17, 2013, at 2:20 PM, Ted Yu wrote:
> > > > >
> > > > > > The lease timeout is used by row locking too.
> > > > > > That's the reason behind splitting the setting into two config
> > > > > > parameters.
> > > > > >
> > > > > > How is your load composition? Do you mostly serve reads from
> > > > > > HBase?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Sun, Mar 17, 2013 at 1:56 PM, Dan Crosta wrote:
> > > > > >
> > > > > >> Ah, thanks Ted -- I was wondering what that setting was for.
> > > > > >>
> > > > > >> We are using CDH 4.2.0, which is HBase 0.94.2 (give or take a few
> > > > > >> backports from 0.94.3).
> > > > > >>
> > > > > >> Is there any harm in setting the lease timeout to something
> > > > > >> larger, like 5 or 10 minutes?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> - Dan
> > > > > >>
> > > > > >> On Mar 17, 2013, at 1:46 PM, Ted Yu wrote:
> > > > > >>
> > > > > >>> Which HBase version are you using?
> > > > > >>>
> > > > > >>> In 0.94 and prior, the config param is
> > > > > >>> hbase.regionserver.lease.period
> > > > > >>>
> > > > > >>> In 0.95, it is different. See the release notes of HBASE-6170.
> > > > > >>>
> > > > > >>> On Sun, Mar 17, 2013 at 11:46 AM, Dan Crosta wrote:
> > > > > >>>
> > > > > >>>> We occasionally get scanner timeout errors such as "66698ms
> > > > > >>>> passed since the last invocation, timeout is currently set to
> > > > > >>>> 60000" when iterating a scanner through the Thrift API. Is
> > > > > >>>> there any reason not to raise the timeout to something larger
> > > > > >>>> than the default 60s? Put another way, what resources (and how
> > > > > >>>> much of them) does a scanner take up on a Thrift server or
> > > > > >>>> region server?
> > > > > >>>>
> > > > > >>>> Also, to confirm -- I believe "hbase.rpc.timeout" is the
> > > > > >>>> setting in question here, but someone please correct me if I'm
> > > > > >>>> wrong.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> - Dan
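A minimal, self-contained sketch of the workaround the thread converges on -- tuning per-scan caching (and optionally batch) so the client-side work between next() calls stays under the region server's lease period. It assumes the 0.94-era client API; the table name, row keys, and numeric values are placeholders. Note that the scanner.progress() call proposed above does not exist in 0.94; the knobs actually available are the scan settings shown here plus the server-side hbase.regionserver.lease.period in the region servers' hbase-site.xml.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class LeaseFriendlyScan {

  public static void main(String[] args) throws IOException {
    // Picks up hbase-site.xml from the classpath. The scanner lease itself is
    // governed by hbase.regionserver.lease.period (default 60000 ms in 0.94),
    // which is read by the region servers, so raising it means changing the
    // servers' configuration and restarting them; it cannot be overridden
    // per scan from the client.
    Configuration conf = HBaseConfiguration.create();

    HTable table = new HTable(conf, "my_table");       // placeholder table name
    Scan scan = new Scan(Bytes.toBytes("startRow"),    // placeholder row keys
                         Bytes.toBytes("endRow"));

    // Fewer rows per next() round trip means less client-side processing
    // between calls, keeping each gap comfortably under the lease period.
    scan.setCaching(100);
    // Optionally also cap the number of columns returned per Result,
    // which helps with very wide rows.
    scan.setBatch(1000);

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // Per-row processing goes here. If some rows can take much longer
        // than others, lower the caching value rather than relying on a
        // progress()-style lease refresh, which the 0.94 client does not offer.
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}

As a rough sizing check: with caching set to 100 rows and an average of 100 ms of processing per row, about 10 seconds pass between next() calls, well inside the default 60-second lease.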