hbase-user mailing list archives

From Virag Kothari <vi...@yahoo-inc.com.INVALID>
Subject Re: Stargate perf and troubleshooting tips
Date Tue, 12 Aug 2014 22:40:59 GMT
Hi,
Filter support has been added to the stateless scanner (HBASE-9345), but it is
not in 0.98. We will add docs soon.

The query parameter for filters is filter='Filter expression'.
The filter language is the same as the one at
http://hbase.apache.org/book/thrift.html.

An example filter query parameter with a PrefixFilter:

https://localhost:8080/ExampleScanner/*?filter=PrefixFilter%20\(\'row\'\)


An example compound filter:
https://localhost:8080/ExampleScanner/*?filter=PrefixFilter%20\(\'row\'\)AND%20QualifierFilter%20\(%3D%2C%20\'binary:b\'\)


Make sure all unsafe ASCII characters are URL-encoded.
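
To make the encoding step concrete, here is a minimal Python sketch that
builds such a URL from a plain filter expression. The helper name
stateless_scan_url is hypothetical, the host and table are taken from the
examples above, and the limit parameter is assumed to combine with filter
in the same query string. Unlike the shell-escaped examples above, it
percent-encodes every reserved character, including the parentheses and
quotes.

```python
from urllib.parse import quote


def stateless_scan_url(base, table, filter_expr, row_spec="*", **params):
    """Build a stateless scanner URL with percent-encoded query values.

    Hypothetical helper for illustration; it only builds the string and
    does not contact the REST server.
    """
    query = {"filter": filter_expr, **params}
    # safe="" makes quote() encode every character except letters,
    # digits, and "_.-~", so spaces, quotes, parentheses, commas,
    # '=' and ':' are all percent-encoded.
    encoded = "&".join(
        f"{key}={quote(str(value), safe='')}" for key, value in query.items()
    )
    return f"{base}/{table}/{row_spec}?{encoded}"


prefix_url = stateless_scan_url(
    "https://localhost:8080", "ExampleScanner", "PrefixFilter ('row')"
)

compound_url = stateless_scan_url(
    "https://localhost:8080",
    "ExampleScanner",
    "PrefixFilter ('row') AND QualifierFilter (=, 'binary:b')",
    limit=500,  # assumed to be accepted alongside filter
)

print(prefix_url)
print(compound_url)
```

Writing the filter as a readable string and encoding it in one place avoids
hand-escaping mistakes when the expression changes.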

Thanks,
Virag



On 8/11/14 4:33 PM, "SiMaYunRui" <mylpis@hotmail.com> wrote:

>Some of my query patterns need to return at most 500 rows given different
>kinds of filters, like SingleColumnValueFilter. Think of my application
>as an auditing tool: given a time range, scan all files a specific user
>viewed. Some of the factors are stored as qualifiers, which is why
>filters are necessary.
>The scanner is a perfect match, except that it requires two round trips
>through the RESTful API and scanner creation might be slow. What's the
>purpose of a stateless scanner without filter support? It seems to me
>just an enhancement to get.
>BTW, could you please elaborate on which configuration settings you mean
>and where I can increase the defaults? I am a newcomer to HBase and
>didn't find them through Google. :-)
>> > Stargate functions as a client to the HBase cluster. Both the RPC
>> > client - here, Stargate - and the RPC server - here, the
>> > RegionServer(s) - have configurable intervals for keeping idle
>> > connections around. Either the client or server will eventually drop
>> > the connection. Who does it depends on the configured timeouts. You
>> > can increase the defaults but it would be better not to hold scanner
>> > resources open.
>
>> From: apurtell@apache.org
>> Date: Mon, 11 Aug 2014 10:28:18 -0700
>> Subject: Re: Stargate perf and troubleshooting tips
>> To: user@hbase.apache.org
>> 
>> No, the stateless scanner does not support filters. Is this a
>> requirement for your use case?
>> 
>> 
>> On Mon, Aug 11, 2014 at 7:02 AM, SiMaYunRui <mylpis@hotmail.com> wrote:
>> 
>> > Thanks Andrew. Does the stateless scan support filters? I read the
>> > doc you referenced, but it seems that only the following parameters
>> > are supported; filter is not part of the list.
>> >
>> >
>> > startrow - The start row for the scan.
>> > endrow - The end row for the scan.
>> > columns - The columns to scan.
>> > starttime, endtime - To only retrieve columns within a specific range
>> > of version timestamps, both start and end time must be specified.
>> > maxversions - To limit the number of versions of each column to be
>> > returned.
>> > batchsize - To limit the maximum number of values returned for each
>> > call to next().
>> > limit - The number of rows to return in the scan operation.
>> >
>> > Sent from Windows Mail
>> >
>> > From: Andrew Purtell
>> > Sent: Friday, August 8, 2014, 4:46
>> > To: user@hbase.apache.org
>> >
>> > On Wed, Aug 6, 2014 at 11:20 PM, SiMaYunRui <mylpis@hotmail.com> wrote:
>> >
>> > > Further investigation shows that if I repeatedly fetch data very
>> > > quickly, the later scanner creations are very fast (< 100ms), but if
>> > > there is a >1 minute interval between two fetches, the later one is
>> > > slow.
>> > >
>> > > I am certain that it's not caused by the TCP/SSL handshake (Fiddler
>> > > proves that). So I believe there must be resource reuse somewhere,
>> > > whether in the REST service code or the HBase server code.
>> >
>> >
>> > Stargate functions as a client to the HBase cluster. Both the RPC
>> > client - here, Stargate - and the RPC server - here, the
>> > RegionServer(s) - have configurable intervals for keeping idle
>> > connections around. Either the client or server will eventually drop
>> > the connection. Who does it depends on the configured timeouts. You
>> > can increase the defaults but it would be better not to hold scanner
>> > resources open.
>> >
>> > If you are using version 0.98, we have a new stateless scanning
>> > option that avoids the setup costs of old style scanners. See
>> >
>> > https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/rest/package-summary.html#operation_stateless_scanner
>> >
>> >
>> >
>> > --
>> > Best regards,
>> >
>> >    - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet
>> > Hein (via Tom White)
>> >
>> 
>> 
>> 
>> -- 
>> Best regards,
>> 
>>    - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
> 		 	   		  
