hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Is it possible to implement a NOT filter in Hbase?
Date Tue, 03 Jan 2017 02:36:06 GMT
There is INCLUDE_AND_SEEK_NEXT_ROW which was not accounted for in the
nested if statement.
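
A minimal sketch of a symmetric negation table that also covers INCLUDE_AND_SEEK_NEXT_ROW. The nested enum is a local stand-in that only mirrors the constant names of HBase's Filter.ReturnCode, so the snippet compiles without HBase on the classpath. Pairing INCLUDE_AND_SEEK_NEXT_ROW with NEXT_ROW and passing SEEK_NEXT_USING_HINT through unchanged are assumptions, not something confirmed in this thread, and the class name ReturnCodeNegation is hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

class ReturnCodeNegation {
    // Local stand-in mirroring the constant names of HBase's Filter.ReturnCode
    // (an assumption made so the sketch is self-contained).
    enum ReturnCode {
        INCLUDE, INCLUDE_AND_NEXT_COL, INCLUDE_AND_SEEK_NEXT_ROW,
        SKIP, NEXT_COL, NEXT_ROW, SEEK_NEXT_USING_HINT
    }

    private static final Map<ReturnCode, ReturnCode> OPPOSITE =
        new EnumMap<>(ReturnCode.class);

    static {
        // Each "include" flavour is paired with the "exclude" code that
        // advances the scanner by the same amount (assumed pairing).
        pair(ReturnCode.INCLUDE, ReturnCode.SKIP);
        pair(ReturnCode.INCLUDE_AND_NEXT_COL, ReturnCode.NEXT_COL);
        pair(ReturnCode.INCLUDE_AND_SEEK_NEXT_ROW, ReturnCode.NEXT_ROW);
    }

    // Registers the mapping in both directions.
    private static void pair(ReturnCode a, ReturnCode b) {
        OPPOSITE.put(a, b);
        OPPOSITE.put(b, a);
    }

    // Codes without a registered opposite are returned unchanged.
    static ReturnCode negate(ReturnCode code) {
        return OPPOSITE.getOrDefault(code, code);
    }

    public static void main(String[] args) {
        for (ReturnCode code : ReturnCode.values()) {
            System.out.println(code + " -> " + negate(code));
        }
    }
}
```

Keeping the table symmetric makes negate(negate(code)) return the original code. As the rest of this thread shows, flipping per-cell codes does not by itself yield a row-level NOT.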

Is it possible for you to come up with a unit test showing what you observe?

If you don't have time, please consider using De Morgan's law.
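
A minimal sketch of the De Morgan rewrite, using plain Java predicates over a row modelled as a column-to-value map rather than HBase filters. The helpers eq and row and the class name DeMorganDemo are hypothetical. In HBase terms the rewritten form would presumably be a FilterList with Operator.MUST_PASS_ONE wrapping SingleColumnValueFilter instances that use CompareOp.NOT_EQUAL; how rows missing a column should be treated (setFilterIfMissing) is a separate decision this sketch does not settle.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class DeMorganDemo {
    // Hypothetical helper: true when the row holds exactly this value in the column.
    static Predicate<Map<String, String>> eq(String column, String value) {
        return r -> value.equals(r.get(column));
    }

    // NOT (a = '123' AND b = '456'), written directly.
    static final Predicate<Map<String, String>> NEGATED_AND =
        eq("a", "123").and(eq("b", "456")).negate();

    // The De Morgan form: (NOT a = '123') OR (NOT b = '456').
    static final Predicate<Map<String, String>> OR_OF_NEGATIONS =
        eq("a", "123").negate().or(eq("b", "456").negate());

    // Hypothetical helper: builds a two-column row.
    static Map<String, String> row(String a, String b) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("a", a);
        r.put("b", b);
        return r;
    }

    public static void main(String[] args) {
        String[][] samples = {{"123", "456"}, {"123", "x"}, {"x", "456"}, {"x", "y"}};
        for (String[] s : samples) {
            Map<String, String> r = row(s[0], s[1]);
            // Both predicates are evaluated on the same row for comparison.
            System.out.println(r + " -> " + NEGATED_AND.test(r)
                + " / " + OR_OF_NEGATIONS.test(r));
        }
    }
}
```

The point of the rewrite is that each negation lands on a single-column comparison, which a NOT_EQUAL comparison can express, so no wrapper around FilterList is needed.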

Cheers

On Mon, Jan 2, 2017 at 5:19 PM, Carl M <soloninguno@hotmail.com> wrote:

> Thanks for your response Ted.
>
>
> I did the change, unfortunately it doesn't make any difference.
>
>
> Best,
>
> ________________________________
> From: Ted Yu <yuzhihong@gmail.com>
> Sent: Monday, January 2, 2017 07:58 p.m.
> To: user@hbase.apache.org
> Subject: Re: Is it possible to implement a NOT filter in Hbase?
>
> In the nested if statement, can you also handle the NEXT_COL return code?
>
> It should be translated to INCLUDE_AND_NEXT_COL.
>
> Cheers
>
> > On Jan 2, 2017, at 1:54 PM, Carl M <soloninguno@hotmail.com> wrote:
> >
> > Sorry for not being clear enough.
> >
> >
> > Maybe I misunderstood your suggestion, but what I've done is implement
> > my own filter, wrapping a FilterList:
> >
> > public class NotFilterList extends Filter {
> >    private FilterList filter;
> >
> >    public NotFilterList(final List<Filter> rowFilters) {
> >        this.filter = new FilterList(rowFilters);
> >    }
> >
> >    public void reset() throws IOException {
> >        this.filter.reset();
> >    }
> >
> >    public boolean filterRowKey(byte[] rowKey, int offset, int length) throws IOException {
> >        return this.filter.filterRowKey(rowKey, offset, length);
> >    }
> >
> >    public boolean filterAllRemaining() throws IOException {
> >        return this.filter.filterAllRemaining();
> >    }
> >
> >    public ReturnCode filterKeyValue(Cell v) throws IOException {
> >        ReturnCode code = this.filter.filterKeyValue(v);
> >
> >        if (code == ReturnCode.INCLUDE)
> >            code = ReturnCode.SKIP;
> >        else if (code == ReturnCode.INCLUDE_AND_NEXT_COL)
> >            code = ReturnCode.NEXT_COL;
> >        else if (code == ReturnCode.NEXT_ROW)
> >            code = ReturnCode.INCLUDE_AND_NEXT_COL;
> >        else if (code == ReturnCode.SKIP) {
> >            code = ReturnCode.INCLUDE;
> >        }
> >
> >        return code;
> >    }
> >
> >    public Cell transformCell(Cell v) throws IOException {
> >        return this.filter.transformCell(v);
> >    }
> >
> >    public KeyValue transform(KeyValue currentKV) throws IOException {
> >        return this.filter.transform(currentKV);
> >    }
> >
> >    public void filterRowCells(List<Cell> kvs) throws IOException {
> >        this.filter.filterRowCells(kvs);
> >    }
> >
> >    public boolean hasFilterRow() {
> >        return this.filter.hasFilterRow();
> >    }
> >
> >    public boolean filterRow() throws IOException {
> >        return this.filter.filterRow();
> >    }
> >
> >    public KeyValue getNextKeyHint(KeyValue currentKV) throws IOException {
> >        return this.filter.getNextKeyHint(currentKV);
> >    }
> >
> >    public Cell getNextCellHint(Cell currentKV) throws IOException {
> >        return this.filter.getNextCellHint(currentKV);
> >    }
> >
> >    public boolean isFamilyEssential(byte[] name) throws IOException {
> >        return this.filter.isFamilyEssential(name);
> >    }
> >
> >    public byte[] toByteArray() throws IOException {
> >        return this.filter.toByteArray();
> >    }
> >
> >    public static NotFilterList parseFrom(final byte [] pbBytes) throws DeserializationException {
> >        FilterList filterList = FilterList.parseFrom(pbBytes);
> >        return new NotFilterList(filterList.getFilters());
> >    }
> >
> >    boolean areSerializedFieldsEqual(Filter other) {
> >        return this.filter.areSerializedFieldsEqual(other);
> >    }
> >
> > }
> >
> >
> > What I mean is that filterKeyValue, the way I have it now, returns the
> > right results, but only the fields that were not originally skipped.
> >
> >
> > So, for example, if I have two rows, each with two fields:
> >
> > Row 1
> >
> > Name: Bill
> >
> > Surname: Gates
> >
> >
> > Row 2
> >
> > Name: Steve
> >
> > Surname: Jobs
> >
> >
> > And I want to query for rows that don't have Name 'Bill':
> >
> > NOT (Name='Bill')
> >
> >
> > What I get as the result from HBase with this NotFilter is
> >
> > Row 2
> >
> > Surname: Jobs
> >
> >
> > I suppose it's related to the cell "Name: Steve" being skipped in the first
> > place (before reversing the ReturnCode).
> >
> >
> > Best,
> >
> >
> >
> > ________________________________
> > From: Ted Yu <yuzhihong@gmail.com>
> > Sent: Monday, January 2, 2017 06:31 p.m.
> > To: user@hbase.apache.org
> > Subject: Re: Is it possible to implement a NOT filter in Hbase?
> >
> > bq. the cell/value that was originally skipped is not returned
> >
> > Can you be a bit more specific (with a concrete example)? The skipped
> > cell would not be returned (as indicated by the ReturnCode).
> >
> > Thanks
> >
> >> On Mon, Jan 2, 2017 at 1:06 PM, Carl M <soloninguno@hotmail.com> wrote:
> >>
> >> Hi Ted,
> >>
> >>
> >> I tried your suggestion; unfortunately it doesn't work as expected. I
> >> don't fully understand FilterList, but if a cell value was skipped and I
> >> reverse the ReturnCode, I get the right row, but the cell/value that was
> >> originally skipped is not returned.
> >>
> >> I also tried reversing only the filterRow() method of FilterList, but I got
> >> the same behaviour (the original cell/value missing).
> >>
> >>
> >> Best,
> >>
> >>
> >> ________________________________
> >> From: Ted Yu <yuzhihong@gmail.com>
> >> Sent: Friday, December 30, 2016 12:56 p.m.
> >> To: user@hbase.apache.org
> >> Subject: Re: Is it possible to implement a NOT filter in Hbase?
> >>
> >> I think the ReturnCode opposite to INCLUDE_AND_NEXT_COL is NEXT_COL:
> >> you're not interested in any version of the current Cell.
> >>
> >> Cheers
> >>
> >>> On Fri, Dec 30, 2016 at 4:53 AM, Carl M <soloninguno@hotmail.com> wrote:
> >>>
> >>> Thanks Ted! Great idea replacing the value in filterKeyValue. Looking at
> >>> the FilterList code, though, I'm not sure whether only INCLUDE/SKIP should
> >>> be replaced, or what the correct replacement for INCLUDE_AND_NEXT_COL
> >>> would be. What do you think? Otherwise, maybe I should try to implement
> >>> De Morgan's law, but I think that would be harder.
> >>>
> >>>
> >>> Best,
> >>>
> >>> ________________________________
> >>> From: Ted Yu <yuzhihong@gmail.com>
> >>> Sent: Thursday, December 29, 2016 06:10 p.m.
> >>> To: user@hbase.apache.org
> >>> Subject: Re: Is it possible to implement a NOT filter in Hbase?
> >>>
> >>> You can try negating the ReturnCode from filterKeyValue() (at the root
> >>> of FilterList):
> >>>
> >>>  abstract public ReturnCode filterKeyValue(final Cell v) throws IOException;
> >>>
> >>> INCLUDE -> SKIP
> >>>
> >>> SKIP -> INCLUDE
> >>>
> >>> Alternatively, you can use De Morgan's law to transform the condition:
> >>>
> >>> NOT (a = '123' AND b = '456') becomes
> >>>
> >>> (NOT a = '123') OR (NOT b = '456')
> >>>
> >>>> On Thu, Dec 29, 2016 at 12:56 PM, Carl M <soloninguno@hotmail.com> wrote:
> >>>
> >>>> Hi guys
> >>>>
> >>>>
> >>>> I'm trying to implement some kind of NOT filter in HBase, but I don't
> >>>> know if it's possible. I've been playing with FilterIfMissing and
> >>>> FilterList.Operator, but without luck.
> >>>>
> >>>>
> >>>> I know how to return rows not having a specific column, but I mean
> >>>> something like returning rows NOT fulfilling a condition, where the
> >>>> condition could be not only a SingleColumnValueFilter but a combined
> >>>> condition built with FilterList. In SQL it would be something like this,
> >>>> for example:
> >>>>
> >>>>
> >>>> SELECT * FROM table WHERE NOT (a = '123' AND b = '456');
> >>>>
> >>>>
> >>>> Thanks in advance,
> >>
>
