hbase-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: ColumnSuffixFilter in HBase
Date Mon, 16 Feb 2015 16:23:28 GMT
Hi Alok,

"xyz$" regex is enough to get the desired result. It will perform better
than "^.*xyz$". "^.*xyz$" is doing some unnecessary work.
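
A small standalone sketch of why the shorter pattern is enough, assuming the comparator searches for the pattern with Matcher.find() rather than requiring a full match (as far as I know RegexStringComparator does this, but please verify against your HBase version). The class and method names here are made up for illustration and are not part of HBase:

```java
import java.util.regex.Pattern;

class SuffixRegexDemo {
    // Unanchored form: with find(), only the end of the input has to match.
    static final Pattern SHORT = Pattern.compile("xyz$");
    // Fully anchored form: the leading ".*" has to consume the rest of the input first.
    static final Pattern LONG = Pattern.compile("^.*xyz$");

    static boolean shortForm(String qualifier) {
        return SHORT.matcher(qualifier).find();
    }

    static boolean longForm(String qualifier) {
        return LONG.matcher(qualifier).find();
    }

    // For contrast: matches() demands that the whole input match the pattern,
    // so the short form would only accept the exact string "xyz".
    static boolean shortFormFullMatch(String qualifier) {
        return SHORT.matcher(qualifier).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abcxyz", "xyz", "xyzabc", "abc", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" short=" + shortForm(s)
                    + " long=" + longForm(s)
                    + " shortFullMatch=" + shortFormFullMatch(s));
        }
    }
}
```

If the comparator used a full match instead, "^.*xyz$" (or ".*xyz") would be required and "xyz$" alone would miss most qualifiers.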

Thanks,
Anil Gupta

On Wed, Feb 11, 2015 at 10:44 AM, Alok Singh <aloksingh@gmail.com> wrote:

> A simple String suffix check will definitely be faster, but I doubt
> you will notice the difference, as most of the time will be spent
> loading every column of every row to do the check.
>
> Just for the fun of it, I did a quick comparison of the regex "^.*xyz$"
> with String.endsWith("xyz"): the regex is about 10x slower (2 million/sec
> vs 20 million/sec).
>
> Alok
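
For anyone who wants to repeat that kind of comparison, here is a rough sketch of how it could be set up. This is not Alok's actual test: the inputs, round counts, and names are assumptions, and a naive timing loop like this is only indicative (a harness such as JMH would give more trustworthy numbers):

```java
import java.util.regex.Pattern;

class SuffixBenchSketch {
    static final Pattern ANCHORED = Pattern.compile("^.*xyz$");

    static boolean viaRegex(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean viaEndsWith(String s) {
        return s.endsWith("xyz");
    }

    // Runs the chosen check over all inputs `rounds` times and returns the number of hits.
    // Returning the count keeps the JIT from discarding the work entirely.
    static long countMatches(String[] inputs, int rounds, boolean useRegex) {
        long hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (String s : inputs) {
                boolean matched = useRegex ? viaRegex(s) : viaEndsWith(s);
                if (matched) {
                    hits++;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical qualifiers; real column names will differ.
        String[] inputs = {"col_abc_xyz", "col_xyz_abc", "xyz", "another_qualifier"};
        int rounds = 500_000;

        // Warm-up passes so both code paths are compiled before timing.
        countMatches(inputs, 50_000, true);
        countMatches(inputs, 50_000, false);

        long t0 = System.nanoTime();
        long regexHits = countMatches(inputs, rounds, true);
        long t1 = System.nanoTime();
        long endsWithHits = countMatches(inputs, rounds, false);
        long t2 = System.nanoTime();

        System.out.println("regex:    " + regexHits + " hits in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("endsWith: " + endsWithHits + " hits in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both paths should report the same hit count; only the elapsed time is expected to differ, and the ratio will vary by JVM and input.
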
>
> On Tue, Feb 10, 2015 at 10:56 PM, anil gupta <anilgupta84@gmail.com>
> wrote:
> > Completely missed out on QualifierFilter. Thanks, Alok.
> > Since QualifierFilter is going to use a regex, I would say that
> > QualifierFilter won't be super efficient. At present, I want to do this
> > quickly.
> > In the near future, if time permits, I will try to write a
> > ColumnSuffixFilter and use it.
> > An observation regarding naming: the "ColumnPrefixFilter" and
> > "QualifierFilter" names are not coherent. IMHO, we should call the
> > column qualifier either "Column" or "Qualifier". Using both
> > interchangeably creates confusion.
> >
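
The core of the ColumnSuffixFilter idea mentioned above is just a byte-wise suffix comparison on the qualifier, with no regex and no String conversion. A minimal sketch of that comparison only, assuming the qualifier is available as a slice of a byte array; this is not an HBase Filter (a real one would also need to plug into the HBase filter API and handle serialization, which is not shown), and the names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

class QualifierSuffixCheck {
    // Returns true when the last suffix.length bytes of buf[offset, offset + length)
    // are equal to suffix.
    static boolean endsWith(byte[] buf, int offset, int length, byte[] suffix) {
        if (suffix.length > length) {
            return false; // the qualifier is shorter than the suffix
        }
        int start = offset + length - suffix.length;
        for (int i = 0; i < suffix.length; i++) {
            if (buf[start + i] != suffix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] suffix = "xyz".getBytes(StandardCharsets.UTF_8);
        byte[] qualifier = "col_abc_xyz".getBytes(StandardCharsets.UTF_8);
        System.out.println(endsWith(qualifier, 0, qualifier.length, suffix));
    }
}
```

The offset/length parameters are there because qualifiers are often a slice of a larger backing array rather than a standalone byte[].
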
> > On Tue, Feb 10, 2015 at 9:57 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> >> Oh, you're right. I read the question too quickly and skipped the column
> >> information... FuzzyRowFilter is only for the key.
> >>
> >> 2015-02-11 0:53 GMT-05:00 Alok Singh <aloksingh@gmail.com>:
> >>
> >> > Doesn't the FuzzyRowFilter only work on rowkeys? Not sure how it would
> >> > apply to column qualifiers.
> >> >
> >> > Alok
> >> >
> >> > On Tue, Feb 10, 2015 at 9:48 PM, Jean-Marc Spaggiari
> >> > <jean-marc@spaggiari.org> wrote:
> >> > > Regexes are more expensive than "basic" filters for sure. But it's a
> >> > > simple solution. You can implement a ColumnSuffixFilter. If your keys
> >> > > are always the same length, you can look at this:
> >> > >
> >> > > https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/filter/FuzzyRowFilter.html
> >> > >
> >> > > JM
> >> > >
> >> > > 2015-02-11 0:44 GMT-05:00 hongbin ma <mahongbin@apache.org>:
> >> > >
> >> > >> Will there be any performance issues? I'm curious whether there's an
> >> > >> efficient way of implementing this kind of filter.
> >> > >>
> >> > >> On Wed, Feb 11, 2015 at 1:39 PM, Alok Singh <aloksingh@gmail.com>
> >> > >> wrote:
> >> > >>
> >> > >> > You could use a QualifierFilter with a RegexStringComparator to do
> >> > >> > the same.
> >> > >> >
> >> > >> > Alok
> >> > >> >
> >> > >> > On Tue, Feb 10, 2015 at 7:23 PM, anil gupta <anilgupta84@gmail.com>
> >> > >> > wrote:
> >> > >> > > Hi,
> >> > >> > >
> >> > >> > > I want to get all the columns of a row that end with "xyz". I know
> >> > >> > > there is ColumnPrefixFilter. Is there any other column filter in
> >> > >> > > HBase that can help me simulate suffix matching?
> >> > >> > >
> >> > >> > > In my opinion, ColumnSuffixFilter would be a good addition to
> >> > >> > > Filters of HBase.
> >> > >> > >
> >> > >> > > --
> >> > >> > > Thanks & Regards,
> >> > >> > > Anil Gupta
> >> > >> >
> >> > >>
> >> >
> >>
> >
> >
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>



-- 
Thanks & Regards,
Anil Gupta
