hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Would ROWCOL Bloom filter help in Scan
Date Thu, 03 Dec 2015 17:01:32 GMT
On Wed, Dec 2, 2015 at 10:01 PM, Jerry He <jerryjch@gmail.com> wrote:

> Thanks for the response.  You got my question correctly.
> If we are scanning the rows one by one and we have the requested column in
> the column tracker, we have the row+column to look up in the bloom filter,
> don't we? We may not be able to filter out the file scanners upfront, but
> maybe later, at a lower level, we could skip something?
>
>
<I've not looked at the code>You are right. If more than one explicit
column is specified, we could do a bloom check for the second and so on since
we'd have the current row to hand. It could make for a nice speedup for
scans of many explicit columns traversing a dataset that is sparsely
populated.</I've not looked at the code>.

St.Ack
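
To make the idea in this reply concrete, here is a minimal toy sketch in Python. It is not HBase code: `ToyBloom`, `rowcol_key`, and `columns_to_seek` are hypothetical names invented for illustration, and the key encoding is an assumption rather than HBase's actual ROWCOL key layout. It shows only the point made above: once the scanner is positioned on a row, a row+column key can be formed for each explicit column and checked against the bloom, which is not possible upfront for a start/end row range whose rows are unknown.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter standing in for one store file's ROWCOL bloom (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def rowcol_key(row, qualifier):
    # Assumed encoding: length-prefix the row so that (b"ab", b"c") and
    # (b"a", b"bc") do not produce the same key.
    return len(row).to_bytes(2, "big") + row + qualifier


def columns_to_seek(bloom, current_row, wanted_columns):
    """For the row the scan is currently on, keep only columns the file might hold."""
    return [c for c in wanted_columns
            if bloom.might_contain(rowcol_key(current_row, c))]


if __name__ == "__main__":
    bloom = ToyBloom()
    for row, col in [(b"row1", b"a"), (b"row2", b"b")]:
        bloom.add(rowcol_key(row, col))

    wanted = [b"a", b"b", b"c"]
    # b"a" is always reported for row1; other columns appear only as false positives.
    print(columns_to_seek(bloom, b"row1", wanted))
```

In a sparsely populated dataset most row+column checks would come back negative, so a scan of many explicit columns could skip those seeks. Whether the real scanner can do this cheaply would depend on code that, as noted above, has not been looked at here.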



> Jerry
>
> On Mon, Nov 30, 2015 at 10:55 PM, Stack <stack@duboce.net> wrote:
>
> > On Mon, Nov 30, 2015 at 9:56 AM, Jerry He <jerryjch@gmail.com> wrote:
> >
> > > Hi, experts
> > >
> > > HBase supports the ROWCOL bloom filter, where ROW+COL is the bloom key.
> > > Most of the documentation says only GET would benefit, multi-column
> > > GETs included.
> > >
> > > If I do a scan with StartRow and EndRow, and also specify columns,
> > > would the ROWCOL bloom filter provide any benefit in any way?
> > >
> > >
> > If I understand your question properly, the answer is no. While we might
> > have a set of columns to check in the bloom, we'd not know the set of rows
> > between start and end row and so would not be able to formulate a query
> > against the ROW+COL bloom filter.
> >
> > St.Ack
> >
> >
> >
> > > Thank you.
> > >
> > > Jerry
> > >
> >
>
