hbase-user mailing list archives

From Himanshu Vashishtha <hvash...@cs.ualberta.ca>
Subject Re: How to scan rows starting with a particular string?
Date Wed, 27 Apr 2011 06:54:07 GMT
HBase stores row keys as raw byte arrays, and Bytes.toBytes(String) encodes
strings as UTF-8, so row keys can contain non-ASCII characters too (yes, each
such character takes more than 1 byte).
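
To make the consequence for the startrow=abc / endrow=abc~ approach concrete, here is a minimal sketch using only the JDK. It assumes row keys are compared byte by byte as unsigned values and that the stop row is exclusive; the class and method names (RowKeyOrder, inScanRange) are made up for illustration and are not part of HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an HBase class: mimics how a scan range
// [startRow, stopRow) would treat UTF-8 encoded row keys.
class RowKeyOrder {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // True if key falls in [startRow, stopRow) under unsigned
    // lexicographic byte ordering (Arrays.compareUnsigned needs Java 9+).
    static boolean inScanRange(String key, String startRow, String stopRow) {
        byte[] k = utf8(key);
        return Arrays.compareUnsigned(k, utf8(startRow)) >= 0
            && Arrays.compareUnsigned(k, utf8(stopRow)) < 0;
    }

    public static void main(String[] args) {
        // 'z' is 0x7A, below '~' (0x7E), so this key is inside the range.
        System.out.println(inScanRange("abcz", "abc", "abc~"));
        // U+00E9 encodes as 0xC3 0xA9; 0xC3 is above 0x7E, so this key
        // starts with "abc" but falls outside the range.
        System.out.println(inScanRange("abc\u00e9", "abc", "abc~"));
    }
}
```

Under those assumptions, any key whose first byte after the prefix is 0x7E or higher is missed, and that includes every non-ASCII character, since their UTF-8 bytes are all 0x80 or above.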

A relevant thread:
http://search-hadoop.com/m/aJ0702Sq3Ii2/Scan+%2528Start+Row%252C+End+Row%2529+vs+Scan+%2528Row%2529&subj=RE+Scan+Start+Row+End+Row+vs+Scan+Row+

Hope this helps.
Himanshu

On Tue, Apr 26, 2011 at 11:43 PM, Hari Sreekumar
<hsreekumar@clickable.com> wrote:

> Just in case there are other characters that sort after ~. We were using
> 'z' before. Then we realized we had some special characters in keys, so we
> updated it to ~. Does HBase support characters > ~ in row keys?
>
> hari
>
> On Tue, Apr 26, 2011 at 7:35 PM, Suraj Varma <svarma.ng@gmail.com> wrote:
>
> > Why did you feel this is error prone?
> > If you use server side filters by providing start/end Row or use a
> > PrefixFilter, the scans should work well, as it is going to be
> > sequential access. Depending on your data and use case, you may need
> > to tune it further (say by applying additional filters, limit results,
> > etc)... see here for some more tips on speeding up scans:
> > http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A16
> >
> > --Suraj
> >
> >
> > On Tue, Apr 26, 2011 at 2:45 AM, Hari Sreekumar
> > <hsreekumar@clickable.com> wrote:
> > > Hi,
> > >
> > > I need to scan rows whose row keys start with a particular string
> > > (say abc). I am currently doing this by using startrow=abc and
> > > endrow=abc~ (I am appending ~ as it is ASCII 126). It usually works, but
> > > is there a better, less error-prone way? I know we can do this using
> > > filters, but won't that be worse performance-wise?
> > >
> > > Thanks,
> > > Hari
> > >
> >
>
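
One way to avoid guessing a "largest" suffix character for the end row is to derive the stop row from the prefix itself: drop any trailing 0xFF bytes, then add one to the last remaining byte. The sketch below shows that idea with the JDK alone. PrefixStopRow and stopRowForPrefix are hypothetical names, not HBase API, and the result is assumed to be used as an exclusive stop row, with an empty array meaning "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an HBase class.
class PrefixStopRow {
    // Returns the smallest byte string that sorts after every key
    // starting with the given prefix (unsigned byte ordering).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented, so skip them.
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            // Empty or all-0xFF prefix: no upper bound exists.
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++; // safe: this byte is known not to be 0xFF
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(prefix);
        // The start row is the prefix itself; the stop row here is "abd".
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```

With startrow=abc and the computed stop row, every key beginning with abc falls inside the range regardless of which bytes follow the prefix, and the scan remains a plain sequential range scan.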
