From: Peter Haidinyak
To: "user@hbase.apache.org"
Date: Thu, 20 Jan 2011 08:41:07 -0800
Subject: RE: Scan (Start Row, End Row) vs Scan (Row)

Question, does HBase stop scanning after it hits the end row? I thought it does.

Thanks

-Pete

-----Original Message-----
From: Jonathan Gray [mailto:jgray@fb.com]
Sent: Thursday, January 20, 2011 8:09 AM
To: user@hbase.apache.org
Subject: RE: Scan (Start Row, End Row) vs Scan (Row)

The best way to do this is as Friso describes, using the existing stopRow parameter in Scan.

There is another way to do it with startRow + a filter. There is a PrefixFilter which could be used here. Looking at the code, it seems as though the PrefixFilter does an early out and stops the scan once passed the prefix.

If not, you can wrap any filter in a WhileMatchFilter. That wrapping filter will make it so once the underlying filter fails once, all further things will fail and the scan will early out.

JG

> -----Original Message-----
> From: Friso van Vollenhoven [mailto:fvanvollenhoven@xebia.com]
> Sent: Thursday, January 20, 2011 12:45 AM
> To:
> Subject: Re: Scan (Start Row, End Row) vs Scan (Row)
>
> Performing a scan with
>
> start row = 20100809041500_abd
> end row = 20100809041500_abe
>
> will give you just that. The end row is exclusive, so it will only return rows
> with VAR1 = abd. You need to compute the 'abe' yourself, though (which is
> basically taking 'abd' and increasing the right most byte by 1 unless it's at max
> byte value, then set it to 0 and increase the byte left to that by 1, etc.). There
> is no scan method that has 'starts with' semantics, AFAIK.
>
> See here:
> http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/client/Scan.html#Scan(byte[], byte[])
>
>
> Friso
>
>
>
>
> On 20 jan 2011, at 09:22, Shuja Rehman wrote:
>
> Hi
> Consider the following scenario.
>
> Row Key Format = DATETIME_VAR1_VAR2 (where var1 and var2 have fixed
> lengths)
>
> and example data could be
>
> 20100809041500_abc_xyz
> 20100809041500_abc_xyw
> 20100809041500_abc_xyc
> *20100809041500_abd_xyz*
> 20100809041500_abd_xyw
> 20100809041500_abf_xyz
> ...
>
> Now if i want to get the rows which only have this row key
> 20100809041500_abd then is there anyway to achieve through scan without
> using filter because if i use filter scan(startrow, filter) where
> startrow="20100809041500_abd" then it will scan whole table from start key
> to end of table. i want to just scan that part of table which i require. So if
> there is any method like this
>
> scan(row) where row ="20100809041500_abd" and it just return the
> following results
>
> 20100809041500_abd_xyz
> 20100809041500_abd_xyw
>
> Kindly let me know whether it is achievable or not?
> thnx
> --
> Regards
> Shuja-ur-Rehman Baig
>
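
For reference, the stop row computation Friso describes can be sketched in plain Java as below. This is only an illustration, not code from the thread or from HBase: the class name StopRowSketch and the method stopRowForPrefix are made up for this example. It also departs from Friso's description in one way: it drops trailing 0xFF bytes instead of zeroing them and carrying, so the result is never longer than the prefix. The main method only simulates the half-open [start, stop) range with a sorted set over the example keys from the thread; it does not talk to HBase. In a real client the two byte arrays would presumably be passed to the Scan(byte[], byte[]) constructor linked above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical helper, not part of the HBase API.
final class StopRowSketch {

    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix. Returns an empty array when no such key exists
     * (empty prefix, or every byte is 0xFF); an empty stop row is
     * conventionally read as "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Skip trailing 0xFF bytes: they cannot be incremented.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Keep the bytes up to the last incrementable one and add 1 to it.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "20100809041500_abd".getBytes(StandardCharsets.US_ASCII);
        String stop = new String(stopRowForPrefix(start), StandardCharsets.US_ASCII);
        System.out.println(stop); // prints 20100809041500_abe

        // Simulation only: a sorted set stands in for the table's row keys.
        // For ASCII keys, String ordering matches byte ordering.
        TreeSet<String> rows = new TreeSet<>(Arrays.asList(
                "20100809041500_abc_xyz", "20100809041500_abc_xyw",
                "20100809041500_abc_xyc", "20100809041500_abd_xyz",
                "20100809041500_abd_xyw", "20100809041500_abf_xyz"));
        // subSet is start-inclusive and stop-exclusive, like the scan range.
        System.out.println(rows.subSet("20100809041500_abd", stop));
        // prints [20100809041500_abd_xyw, 20100809041500_abd_xyz]
    }
}
```

Because var1 and var2 have fixed lengths in Shuja's key format, either variant of the computation selects the same rows here; the two only differ for prefixes that end in 0xFF.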