hbase-user mailing list archives

From Biplob Biswas <revolutioni...@gmail.com>
Subject Re: difference - filterlist with rowfilters vs multiget
Date Thu, 16 Aug 2018 08:39:27 GMT
Thanks Ted for the response. We have to look over the entire table for the
rowkeys, so we can't set a range for the scanner. It looks like we should
generally do multigets when we know the rowkeys and use the filterlist for
all other kinds of filters.
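
To make the difference concrete, here is a toy model in plain Java. It is
not HBase code: a TreeMap stands in for a table sorted by rowkey, and the
class name, method names, and the rowsExamined counter are all made up for
illustration. It only sketches why a scan with row filters touches every
row while a multiget touches just the requested keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: a TreeMap stands in for an HBase table sorted by rowkey.
public class ScanVsMultiGet {
    // Hypothetical counter: how many rows the access pattern had to look at.
    static int rowsExamined = 0;

    // Models Scan + FilterList(MUST_PASS_ONE) of RowFilters with no start/stop row:
    // every row in the table is visited and tested against the wanted keys.
    static List<String> scanWithRowFilters(TreeMap<String, String> table, Set<String> wanted) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            rowsExamined++;
            if (wanted.contains(row.getKey())) {
                out.add(row.getValue());
            }
        }
        return out;
    }

    // Models a multiget: one keyed lookup per requested rowkey.
    // Missing keys are simply skipped in this model.
    static List<String> multiGet(TreeMap<String, String> table, List<String> wanted) {
        List<String> out = new ArrayList<>();
        for (String key : wanted) {
            rowsExamined++;
            String value = table.get(key);
            if (value != null) {
                out.add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            table.put(String.format("row%04d", i), "value" + i);
        }
        List<String> wanted = Arrays.asList("row0007", "row0500", "row0999");

        rowsExamined = 0;
        scanWithRowFilters(table, new HashSet<>(wanted));
        System.out.println("scan examined " + rowsExamined + " rows");

        rowsExamined = 0;
        multiGet(table, wanted);
        System.out.println("multiget examined " + rowsExamined + " rows");
    }
}
```

In this model the scan examines all 1000 rows and the multiget examines 3.
A real cluster adds region lookups, RPC batching, and caching, so treat the
counts as an illustration of the access pattern, not as a benchmark.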

That said, I will check out the source and see whether I can do something to
optimize a filterlist that contains rowkey filters.

Thanks a lot.

Thanks & Regards
Biplob Biswas


On Wed, Aug 15, 2018 at 8:46 PM Manjeet Singh <manjeet.chandhok@gmail.com>
wrote:

> A Scan always comes with a performance cost because it seeks through the
> table regardless of whether you have filters. A Get, on the other hand,
> always gives you better results. It really depends on your requirement: a
> Get helps if you know the rowkey; if you don't, you have to scan. And yes,
> a range-based scan can improve performance.
>
> Thanks
> Manjeet singh
>
> On Wed, 15 Aug 2018, 14:37 Biplob Biswas, <revolutionisme@gmail.com>
> wrote:
>
> > Hi,
> >
> > During our implementation for fetching multiple records from an HBase
> > table, we came across a discussion regarding the best way to get records
> > out.
> >
> > The first implementation is something like:
> >
> >   FilterList filterList = new FilterList(Operator.MUST_PASS_ONE);
> >   for (String rowKey : rowKeys) {
> >     filterList.addFilter(new RowFilter(CompareOp.EQUAL,
> >         new BinaryComparator(Bytes.toBytes(rowKey))));
> >   }
> >
> >   Scan scan = new Scan();
> >   scan.setFilter(filterList);
> >   ResultScanner resultScanner = table.getScanner(scan);
> >
> >
> > and the second implementation is something like this:
> >
> >   List<Get> listGet = rowKeys.stream()
> >       .map(rowKey -> new Get(Bytes.toBytes(rowKey)))
> >       .collect(Collectors.toList());
> >   Result[] results = table.get(listGet);
> >
> >
> > The only difference I see directly is that the filterList would do a full
> > table scan, whereas the multiget would not.
> >
> > But what other benefits does one have over the other? Also, when HBase
> > finds out that all the filters in the filterList are RowFilters, would it
> > perform some kind of optimization and do a multiget rather than a full
> > table scan?
> >
> >
> > Thanks & Regards
> > Biplob Biswas
> >
>
