hbase-user mailing list archives

From SiMaYunRui <myl...@hotmail.com>
Subject RE: Stargate perf and troubleshooting tips
Date Mon, 11 Aug 2014 23:33:00 GMT
Some of my query patterns need to return at most 500 rows, selected by various filters
such as SingleColumnValueFilter. Think of my application as an auditing tool: given a time
range, scan all files a specific user viewed. Some of the criteria are stored as qualifiers,
which is why filters are necessary.
The scanner is a perfect match, except that it requires two round trips through the RESTful API
and scanner creation can be slow. What is the purpose of a stateless scanner without filter support?
It seems to me to be just an enhancement to get.
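
For reference, here is a minimal sketch of the request body for the first of those two round trips, i.e. creating a stateful scanner that carries a SingleColumnValueFilter. It assumes the REST gateway accepts a <Scanner> XML document whose <filter> element holds a JSON filter description with base64-encoded family, qualifier and value; the column family "cf", qualifier "user", value "alice" and the helper names are hypothetical, and the exact JSON field names should be checked against the REST documentation for your HBase version.

```python
import base64
import json
from xml.sax.saxutils import escape


def b64(text):
    """Base64-encode a string (assumed encoding for names and values in the filter spec)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_scanner_body(family, qualifier, value, batch=100):
    """Build the XML body for creating a scanner with an equality filter.

    `batch` caps the number of values returned per fetch; it is not a row limit.
    """
    filter_spec = {
        "type": "SingleColumnValueFilter",
        "op": "EQUAL",
        "family": b64(family),
        "qualifier": b64(qualifier),
        "latestVersion": True,
        "ifMissing": True,
        "comparator": {"type": "BinaryComparator", "value": b64(value)},
    }
    # The JSON goes inside the <filter> element, so XML-escape it.
    return '<Scanner batch="%d"><filter>%s</filter></Scanner>' % (
        batch,
        escape(json.dumps(filter_spec)),
    )


# Round trip 1: send this body to /<table>/scanner with Content-Type text/xml
#               and read the scanner URL from the Location response header.
# Round trip 2: GET that URL (repeatedly) to fetch results, then DELETE it
#               to release the scanner.
body = build_scanner_body("cf", "user", "alice")
```

The body is only built here, not sent, so the sketch runs without a cluster.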
BTW, could you please elaborate on which configuration settings I should change, and where, to increase the defaults?
I am a newcomer to HBase and didn't find them through Google. :-)
> > Stargate functions as a client to the HBase cluster. Both the RPC client -
> > here, Stargate - and the RPC server - here, the RegionServer(s) - have
> > configurable intervals for keeping idle connections around. Either the
> > client or server will eventually drop the connection. Who does it depends
> > on the configured timeouts. You can increase the defaults but it would be
> > better not to hold scanner resources open.

> From: apurtell@apache.org
> Date: Mon, 11 Aug 2014 10:28:18 -0700
> Subject: Re: Stargate perf and troubleshooting tips
> To: user@hbase.apache.org
> 
> No, the stateless scanner does not support filters. Is this a requirement
> for your use case?
> 
> 
> On Mon, Aug 11, 2014 at 7:02 AM, SiMaYunRui <mylpis@hotmail.com> wrote:
> 
> > Thanks Andrew. Does the stateless scan support filters? I read the doc you
> > referenced, but it seems that only the following parameters are supported; filter
> > is not part of the list.
> >
> >
> > startrow - The start row for the scan.
> > endrow - The end row for the scan.
> > columns - The columns to scan.
> > starttime, endtime - To only retrieve columns within a specific range of
> > version timestamps, both start and end time must be specified.
> > maxversions - To limit the number of versions of each column to be
> > returned.
> > batchsize - To limit the maximum number of values returned for each call
> > to next().
> > limit - The number of rows to return in the scan operation.
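
As an illustration of the parameter list above, here is a minimal sketch that builds a stateless scan URL. It assumes the documented URL form /<table>/<optional_row_prefix>* with the scan parameters passed as a query string; the host, port, table name "audit" and the helper name are hypothetical.

```python
from urllib.parse import quote, urlencode

# Parameters the stateless scanner is documented to accept (note: no filter).
ALLOWED = {
    "startrow", "endrow", "columns", "starttime",
    "endtime", "maxversions", "batchsize", "limit",
}


def stateless_scan_url(base_url, table, row_prefix="", **params):
    """Build a GET URL for a stateless scan; raise on unsupported parameters."""
    unknown = set(params) - ALLOWED
    if unknown:
        raise ValueError("unsupported parameters: %s" % sorted(unknown))
    # starttime and endtime must be specified together.
    if ("starttime" in params) != ("endtime" in params):
        raise ValueError("starttime and endtime must both be given")
    path = "%s/%s/%s*" % (
        base_url.rstrip("/"),
        quote(table, safe=""),
        quote(row_prefix, safe=""),
    )
    query = urlencode(sorted(params.items()))
    return path + ("?" + query if query else "")


url = stateless_scan_url("http://localhost:8080", "audit", startrow="a", limit=500)
print(url)
```

Passing filter=... raises ValueError here, which mirrors the limitation discussed in this thread.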
> >
> >
> >
> >
> >
> >
> > Sent from Windows Mail
> >
> >
> >
> >
> >
> > From: Andrew Purtell
> > Sent: Friday, August 8, 2014, 4:46
> > To: user@hbase.apache.org
> >
> >
> >
> >
> >
> > On Wed, Aug 6, 2014 at 11:20 PM, SiMaYunRui <mylpis@hotmail.com> wrote:
> >
> > > Further investigation shows that if I repeatedly fetch data very quickly,
> > > the later scanner creations are very fast (< 100ms), but if there is an
> > > interval of more than 1 minute between two fetches, the later one is slow.
> > >
> > > I am certain that it's not caused by the TCP/SSL handshake (Fiddler proves
> > > that). So I believe some resource must be reused somewhere, whether in the
> > > REST service code or the HBase server code.
> >
> >
> > Stargate functions as a client to the HBase cluster. Both the RPC client -
> > here, Stargate - and the RPC server - here, the RegionServer(s) - have
> > configurable intervals for keeping idle connections around. Either the
> > client or server will eventually drop the connection. Who does it depends
> > on the configured timeouts. You can increase the defaults but it would be
> > better not to hold scanner resources open.
> >
> > If you are using version 0.98, we have a new stateless scanning option that
> > avoids the setup costs of old style scanners. See
> >
> > https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/rest/package-summary.html#operation_stateless_scanner
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
> 
> 
> 
> -- 
> Best regards,
> 
>    - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
 		 	   		  