hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Hierarchy of filters and filters list
Date Tue, 18 Nov 2014 16:24:00 GMT
Are you able to reproduce this using a unit test?

I will take a closer look. 

Thanks 

On Nov 18, 2014, at 8:06 AM, Shahab Yunus <shahab.yunus@gmail.com> wrote:

> You mean if used independently? Yes, they do.
> 
> Regards,
> Shahab
> 
> On Tue, Nov 18, 2014 at 10:51 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> 
>> Have you verified that at least one of the following (when used alone)
>> returns data?
>> (A and B), (B and C), (D and E)
>> 
>> Thanks
>> 
>> On Mon, Nov 17, 2014 at 9:27 PM, Shahab Yunus <shahab.yunus@gmail.com>
>> wrote:
>> 
>>> I missed a couple of things.
>>> 
>>> 1- I am using SingleColumnValueFilter and the comparator
>>> is BinaryComparator which is passed into it.
>>> 
>>> 2- CDH 5.1.0
>>> (HBase is 0.98.1-cdh5.1.0)
>>> 
>>> Regards,
>>> Shahab
>>> 
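A note on the comparator mentioned above: BinaryComparator orders values by comparing their bytes lexicographically, so range filters on long timestamps depend on the byte encoding preserving numeric order. The sketch below is plain Java with no HBase dependency. It assumes that Bytes.toBytes(long) yields the same big-endian 8-byte layout as ByteBuffer.putLong; the class and method names (LongByteOrder, encode, compareBytes) are made up for illustration.

```java
import java.nio.ByteBuffer;

// Sketch only: models byte-wise ordering of encoded longs without HBase classes.
final class LongByteOrder {

    // Big-endian 8-byte encoding (assumed to match Bytes.toBytes(long)).
    static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Lexicographic comparison that treats every byte as unsigned.
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Compares two longs by their encoded bytes rather than numerically.
    static int compareBytes(long a, long b) {
        return compareUnsignedBytes(encode(a), encode(b));
    }

    public static void main(String[] args) {
        long earlier = 1416182400000L; // 2014-11-17 00:00:00 UTC in epoch millis
        long later = 1416268800000L;   // 2014-11-18 00:00:00 UTC in epoch millis
        // Non-negative values keep their numeric order under byte comparison.
        System.out.println(compareBytes(earlier, later) < 0);
        // A negative value sorts after a positive one, because its sign bit is set.
        System.out.println(compareBytes(-1L, 1L) > 0);
    }
}
```

Under these assumptions, epoch-millisecond timestamps (which are non-negative) compare the same way as bytes and as numbers, so the encoding by itself should not reorder the values being filtered.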
>>> On Tue, Nov 18, 2014 at 12:22 AM, Shahab Yunus <shahab.yunus@gmail.com>
>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I have data where each row has a start and end time stored in UTC (long).
>>>> The table is created through Phoenix and the columns have type
>>>> UNSIGNED_DATE (which according to the Phoenix docs
>>>> <http://phoenix.apache.org/language/datatypes.html#unsigned_date_type>
>>>> does HBase.toBytes(long) underneath for an 8-byte long). I am also
>>>> storing data in this table using regular Bytes.toBytes from the HBase API.
>>>> 
>>>> Now I want to query data given a time range, and get all rows lying
>>>> within or overlapping the search range. Pretty standard scenario.
>>>> 
>>>> For this I create a set of FilterLists; in fact, a hierarchy of
>>>> FilterLists and filters.
>>>> 
>>>> The search time range is denoted by *sd* and *ed*.
>>>> 
>>>> Each row's date columns are denoted by *s* and *e* (signifying the
>>>> start and end datetimes).
>>>> 
>>>> These 4 filterLists are created as per the logic given below:
>>>> 
>>>> filterListLeft (must pass all) = contains 2 filters = (sd <= s and ed >= s)
>>>> 
>>>> filterListRight (must pass all) = contains 2 filters = (sd <= e and ed >= e)
>>>> 
>>>> filterListOverlap (must pass all) = contains 2 filters = (sd <= s and ed >= e)
>>>> 
>>>> filterListWithin (must pass all) = contains 2 filters = (sd >= s and ed <= e)
>>>> 
>>>> 
>>>> Then I add these 4 filterLists into another filterList, and that one is
>>>> must pass one. I realize that some records might satisfy more than one
>>>> of the filters above, but that is OK.
>>>> 
>>>> FilterList parentFilterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
>>>> parentFilterList.addFilter(filterListLeft);
>>>> parentFilterList.addFilter(filterListRight);
>>>> parentFilterList.addFilter(filterListOverlap);
>>>> parentFilterList.addFilter(filterListWithin);
>>>> 
>>>> Note all filters have setFilterIfMissing = true.
>>>> 
>>>> Then I pass parentFilterList to the scanner.
>>>> 
>>>> So it is like: (A and B) or (B and C) or (D and E) or (F and G)
>>>> 
>>>> But what is happening is that I only get data back for the records
>>>> matching filterListWithin. No records that satisfy the other 3
>>>> filterList criteria come back. The data exists and is in valid form for
>>>> the other scenarios. I can also view it through the Phoenix UI tools.
>>>> 
>>>> What am I missing? Could this be a Phoenix issue?
>>>> 
>>>> Thanks, as always.
>>>> 
>>>> Regards,
>>>> Shahab
>> 
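As a side note on the range logic quoted above: the four cases (left, right, overlap, within), OR-ed together, appear to reduce to a single interval-overlap test, provided every row has s <= e and the search range has sd <= ed. The sketch below is plain Java with no HBase classes; the class and method names (IntervalOverlap, fourCases, overlaps, equivalentUpTo) are hypothetical, and the brute-force loop is only a sanity check over small values.

```java
// Sketch only: a plain-Java model of the range logic from the original post.
final class IntervalOverlap {

    // The four cases from the post, OR-ed together.
    // s/e are the row's start/end; sd/ed are the search start/end.
    static boolean fourCases(long s, long e, long sd, long ed) {
        boolean left = sd <= s && ed >= s;    // row start falls inside the search range
        boolean right = sd <= e && ed >= e;   // row end falls inside the search range
        boolean overlap = sd <= s && ed >= e; // row lies entirely inside the search range
        boolean within = sd >= s && ed <= e;  // search range lies entirely inside the row
        return left || right || overlap || within;
    }

    // Single overlap test; assumes s <= e and sd <= ed.
    static boolean overlaps(long s, long e, long sd, long ed) {
        return s <= ed && e >= sd;
    }

    // Compares both predicates on every well-formed pair of intervals in [0, max].
    static boolean equivalentUpTo(int max) {
        for (long s = 0; s <= max; s++) {
            for (long e = s; e <= max; e++) {
                for (long sd = 0; sd <= max; sd++) {
                    for (long ed = sd; ed <= max; ed++) {
                        if (fourCases(s, e, sd, ed) != overlaps(s, e, sd, ed)) {
                            return false;
                        }
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentUpTo(12)); // prints true
    }
}
```

If that equivalence holds for the data, a single must-pass-all list with two conditions (s <= ed and e >= sd) might express the same query with fewer nested filters. That is a suggestion to try, not something confirmed in this thread.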
