Date: Wed, 20 Mar 2013 10:56:04 -0700
Subject: Re: Scanner timeout -- any reason not to raise?
From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org

Bryan:
Interesting idea. You can log a JIRA with the following two suggestions.
On Wed, Mar 20, 2013 at 10:39 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:

> I was thinking something like this:
>
> Scan scan = new Scan(startRow, endRow);
> scan.setCaching(someVal); // based on what we expect most rows to take for
>                           // processing time
>
> ResultScanner scanner = table.getScanner(scan);
> for (Result r : scanner) {
>   // usual processing, the time for which we accounted for in our caching
>   // and global lease timeout settings
>   if (someCondition) {
>     // More time-intensive processing necessary on this record, which is
>     // hard to account for in the caching
>     scanner.progress();
>   }
> }
>
> --
>
> I'm not sure how we could expose this in the context of a hadoop job, since
> I don't believe we have access to the underlying scanner, but that would be
> great also.
>
> On Wed, Mar 20, 2013 at 1:11 PM, Ted Yu wrote:
>
> > bq. if HBase provided a way to manually refresh a lease similar to
> > Hadoop's context.progress()
> >
> > Can you outline how the above works for a long scan?
> >
> > bq. Even being able to override the timeout on a per-scan basis would be
> > nice.
> >
> > Agreed.
> >
> > On Wed, Mar 20, 2013 at 10:05 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
> >
> > > Typically it is better to use caching and batch size to limit the
> > > number of rows returned, and thus the amount of processing required
> > > between calls to next() during a scan, but it would be nice if HBase
> > > provided a way to manually refresh a lease similar to Hadoop's
> > > context.progress(). In a cluster that is used for many different
> > > applications, upping the global lease timeout is a heavy-handed
> > > solution. Even being able to override the timeout on a per-scan basis
> > > would be nice.
> > >
> > > Thoughts on that, Ted?
> > >
> > > On Wed, Mar 20, 2013 at 1:00 PM, Ted Yu wrote:
> > >
> > > > In 0.94, there is only one setting.
> > > > See the release notes of HBASE-6170, which is in 0.95.
> > > >
> > > > Looks like this should help (in 0.95):
> > > > https://issues.apache.org/jira/browse/HBASE-2214
> > > > Do HBASE-1996 -- setting size to return in scan rather than count of
> > > > rows -- properly
> > > >
> > > > From your description, you should be able to raise the timeout since
> > > > the writes are relatively fast.
> > > >
> > > > Cheers
> > > >
> > > > On Wed, Mar 20, 2013 at 9:32 AM, Dan Crosta wrote:
> > > >
> > > > > I'm confused -- I only see one setting in CDH manager. What is the
> > > > > name of the other setting?
> > > > >
> > > > > Our load is moderately frequent small writes (in batches of 1000
> > > > > cells at a time, typically split over a few hundred rows -- these
> > > > > complete very fast, and we haven't seen any timeouts there), and
> > > > > infrequent batches of large reads (scans), which is where we do see
> > > > > timeouts. My guess is that the timeout is due more to our
> > > > > application taking some time -- apparently more than 60s -- to
> > > > > process the results of each scan's output than to slowness in HBase
> > > > > itself, which tends to be only moderately loaded (judging by CPU,
> > > > > network, and disk) while we do the reads.
> > > > >
> > > > > Thanks,
> > > > > - Dan
> > > > >
> > > > > On Mar 17, 2013, at 2:20 PM, Ted Yu wrote:
> > > > >
> > > > > > The lease timeout is used by row locking too.
> > > > > > That's the reason behind splitting the setting into two config
> > > > > > parameters.
> > > > > >
> > > > > > How is your load composition? Do you mostly serve reads from
> > > > > > HBase?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Sun, Mar 17, 2013 at 1:56 PM, Dan Crosta wrote:
> > > > > >
> > > > > >> Ah, thanks Ted -- I was wondering what that setting was for.
> > > > > >>
> > > > > >> We are using CDH 4.2.0, which is HBase 0.94.2 (give or take a few
> > > > > >> backports from 0.94.3).
> > > > > >>
> > > > > >> Is there any harm in setting the lease timeout to something
> > > > > >> larger, like 5 or 10 minutes?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> - Dan
> > > > > >>
> > > > > >> On Mar 17, 2013, at 1:46 PM, Ted Yu wrote:
> > > > > >>
> > > > > >>> Which HBase version are you using?
> > > > > >>>
> > > > > >>> In 0.94 and prior, the config param is
> > > > > >>> hbase.regionserver.lease.period
> > > > > >>>
> > > > > >>> In 0.95, it is different. See the release notes of HBASE-6170.
> > > > > >>>
> > > > > >>> On Sun, Mar 17, 2013 at 11:46 AM, Dan Crosta wrote:
> > > > > >>>
> > > > > >>>> We occasionally get scanner timeout errors such as "66698ms
> > > > > >>>> passed since the last invocation, timeout is currently set to
> > > > > >>>> 60000" when iterating a scanner through the Thrift API. Is
> > > > > >>>> there any reason not to raise the timeout to something larger
> > > > > >>>> than the default 60s? Put another way, what resources (and how
> > > > > >>>> much of them) does a scanner take up on a Thrift server or
> > > > > >>>> region server?
> > > > > >>>>
> > > > > >>>> Also, to confirm -- I believe "hbase.rpc.timeout" is the
> > > > > >>>> setting in question here, but someone please correct me if I'm
> > > > > >>>> wrong.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> - Dan
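A minimal, self-contained sketch of the workaround the thread converges on -- tuning per-scan caching (and optionally batch) so the client-side work between next() calls stays under the region server's lease period. It assumes the 0.94-era client API; the table name, row keys, and numeric values are placeholders. Note that the scanner.progress() call proposed above does not exist in 0.94; the knobs actually available are the scan settings shown here plus the server-side hbase.regionserver.lease.period in the region servers' hbase-site.xml.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class LeaseFriendlyScan {

  public static void main(String[] args) throws IOException {
    // Picks up hbase-site.xml from the classpath. The scanner lease itself is
    // governed by hbase.regionserver.lease.period (default 60000 ms in 0.94),
    // which is read by the region servers, so raising it means changing the
    // servers' configuration and restarting them; it cannot be overridden
    // per scan from the client.
    Configuration conf = HBaseConfiguration.create();

    HTable table = new HTable(conf, "my_table");       // placeholder table name
    Scan scan = new Scan(Bytes.toBytes("startRow"),    // placeholder row keys
                         Bytes.toBytes("endRow"));

    // Fewer rows per next() round trip means less client-side processing
    // between calls, keeping each gap comfortably under the lease period.
    scan.setCaching(100);
    // Optionally also cap the number of columns returned per Result,
    // which helps with very wide rows.
    scan.setBatch(1000);

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // Per-row processing goes here. If some rows can take much longer
        // than others, lower the caching value rather than relying on a
        // progress()-style lease refresh, which the 0.94 client does not offer.
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}

As a rough sizing check: with caching set to 100 rows and an average of 100 ms of processing per row, about 10 seconds pass between next() calls, well inside the default 60-second lease.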