hbase-user mailing list archives

From Anoop Sam John <anoo...@huawei.com>
Subject RE: Temporal in Hbase?
Date Thu, 11 Oct 2012 05:10:14 GMT
Hi Shumin,

>start_time < const_et and end_time >= const_st or end_time is null.
Your problem was the "end_time >= const_st or end_time is null" part.

You can handle this with a FilterList that uses only the MUST_PASS_ALL (AND) condition. It
can contain one SingleColumnValueFilter for start_time with your condition and value. Does
"end_time is null" mean that no KV was added to the row for this column, i.e. the column
value is missing for that row? In that case, use a second SingleColumnValueFilter on end_time
with your condition and the value const_st, and call setFilterIfMissing(false) on it. Then a
row in which the column is absent will not be filtered out, which is what you want, right?
(Note that false is already the default value for filterIfMissing.)
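
As a rough illustration of why an AND-only FilterList is enough here, the sketch
below models the two column checks in plain Java. It does not use the HBase
classes: ColumnCheckModel, keep and overlaps are hypothetical names, and storing
the timestamps as long values is an assumption.

```java
import java.util.Map;

/** Plain-Java model of the single-column checks; not the HBase classes. */
final class ColumnCheckModel {
    enum Op { LESS, GREATER_OR_EQUAL }

    // Returns true when the row is kept. A row is modelled as a map from
    // qualifier to value; an absent key stands for "no KV for this column".
    static boolean keep(Map<String, Long> row, String qualifier, Op op,
                        long constant, boolean filterIfMissing) {
        Long value = row.get(qualifier);
        if (value == null) {
            // Missing column: the row is kept unless filterIfMissing is true.
            return !filterIfMissing;
        }
        switch (op) {
            case LESS:
                return value < constant;
            case GREATER_OR_EQUAL:
                return value >= constant;
            default:
                throw new IllegalArgumentException("unsupported op: " + op);
        }
    }

    // Models MUST_PASS_ALL over the two checks. filterIfMissing is false for
    // both, mirroring the default mentioned above, so a row without
    // start_time would also be kept in this model.
    static boolean overlaps(Map<String, Long> row, long constSt, long constEt) {
        return keep(row, "start_time", Op.LESS, constEt, false)
            && keep(row, "end_time", Op.GREATER_OR_EQUAL, constSt, false);
    }
}
```

With const_st = 10 and const_et = 20, a row holding only start_time = 5 is kept
because its missing end_time passes the second check, while a row with
start_time = 5 and end_time = 8 is dropped.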

FYI, a FilterList can contain another FilterList.
So for a query like col1=? AND (col2=? OR col2=?) you can nest FilterLists: an inner
FilterList with MUST_PASS_ONE for col2, and an outer FilterList with MUST_PASS_ALL that
contains the inner FilterList and a SingleColumnValueFilter (SCVF) for col1. I hope I
understood your problem and that this is the answer you are looking for :)
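
The nesting can be pictured with ordinary predicate composition. This is only a
plain-Java model under stated assumptions, not HBase code: NestedFilterModel and
its methods are hypothetical names, rows are modelled as maps of string values,
and a missing column fails the equality check here (unlike the filterIfMissing
default discussed above).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Plain-Java model of nested AND/OR filter lists; not the HBase classes. */
final class NestedFilterModel {
    // Models MUST_PASS_ALL: every member must accept the row.
    static <T> Predicate<T> mustPassAll(List<Predicate<T>> filters) {
        return row -> filters.stream().allMatch(f -> f.test(row));
    }

    // Models MUST_PASS_ONE: at least one member must accept the row.
    static <T> Predicate<T> mustPassOne(List<Predicate<T>> filters) {
        return row -> filters.stream().anyMatch(f -> f.test(row));
    }

    // Models an equality check on one column; a missing column fails it.
    static Predicate<Map<String, String>> columnEquals(String qualifier, String expected) {
        return row -> expected.equals(row.get(qualifier));
    }

    // col1 = a AND (col2 = b OR col2 = c)
    static Predicate<Map<String, String>> query(String a, String b, String c) {
        Predicate<Map<String, String>> inner = mustPassOne(
            Arrays.asList(columnEquals("col2", b), columnEquals("col2", c)));
        return mustPassAll(Arrays.asList(columnEquals("col1", a), inner));
    }
}
```

Because the inner list is itself just another filter, the outer list treats it
exactly like the col1 check, which is the point of the nesting.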

From: Shumin Wu [shumin.wu@gmail.com]
Sent: Thursday, October 11, 2012 4:54 AM
To: user@hbase.apache.org
Subject: Re: Temporal in Hbase?

How could I have missed this reply!!

Hi Anoop,

First, thanks for your reply to my question, and I apologize for not following
up promptly. I have put out a million fires and have now come back to this
issue. Here are my thoughts. Yes, a FilterList with MUST_PASS_ALL works fine
for a simple temporal clause.

However, I have a use case like this: I need to find all data whose time
range overlaps a given time range. Some data are valid until now; they have
an open-ended end timestamp, marked as end_time = null in our database.

To express it formally, for a given time range [const_st, const_et], where
const_st represents the constant start time and const_et the constant end
time, my task is to find all data rows whose start_time and end_time
satisfy this expression:

start_time < const_et and end_time >= const_st or end_time is null.
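
Written out as a plain-Java predicate, the expression could look like the sketch
below. The grouping start_time < const_et AND (end_time >= const_st OR end_time
is null) is an assumption about the intended precedence; TemporalOverlap and
overlaps are hypothetical names, and timestamps are assumed to be long values
with a null Long standing for an open-ended end_time.

```java
/** Plain-Java form of the overlap expression; names are illustrative only. */
final class TemporalOverlap {
    // endTime is a nullable Long: null means the row is still valid (open-ended).
    static boolean overlaps(long startTime, Long endTime, long constSt, long constEt) {
        boolean startsBeforeRangeEnds = startTime < constEt;
        // The null check comes first so the comparison never unboxes a null.
        boolean endsAtOrAfterRangeStart = endTime == null || endTime >= constSt;
        return startsBeforeRangeEnds && endsAtOrAfterRangeStart;
    }
}
```

Under this grouping an open-ended row still has to start before const_et, so a
row with a null end_time that starts after the range is not selected.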

In a FilterList, I can choose either MUST_PASS_ALL or MUST_PASS_ONE, but
neither is applicable to this use case.

It would be nice if there were a temporal filter that allowed me to select
data valid between [const_st, const_et] (with a null end_time automatically
interpreted as valid up to now).

My domain is not the traditional Internet area, but I am sure folks in the
clickstream business have a similar need, and I am wondering how they solve
this problem.

Temporal queries are commonly supported in traditional databases, so maybe
HBase could offer the same? I guess the current version does not have this
support, and I would need to write a custom filter myself. I could be wrong.
Please enlighten me.

Shumin Wu

On Mon, Sep 17, 2012 at 8:16 PM, Anoop Sam John <anoopsj@huawei.com> wrote:

> Hi
> start_time and end_time are 2 qualifiers in your table.
> You can use a FilterList with MUST_PASS_ALL ( AND condition)
> Add SingleColumnValueFilter for each of the qualifier with the value and
> condition..
> -Anoop-
> ________________________________________
> From: Shumin Wu [shumin.wu@gmail.com]
> Sent: Monday, September 17, 2012 9:58 PM
> To: user@hbase.apache.org
> Subject: Temporal in Hbase?
> Hi,
> I have a use case to "filter" out rows using an "as of" predicate. For
> example, given a specific time point T, I would like to find all rows where
> start_time<=T<=end_time.
> Example hbase table schema:
> row_key, col_A, col_B, start_time, end_time
> I am wondering if there is any existing filter that allows me to do this.
> If not, I guess I would have to write my own custom filter.
> Thanks,
> Shumin Wu