hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: setFilter for Delete operations?
Date Thu, 25 Dec 2014 04:39:47 GMT
bq. Using a scan for just one known row

Can you batch some deletions in one invocation of the endpoint?

Supporting a filter in the delete path requires a non-trivial amount of work.
So for the time being, please use BulkDeleteEndpoint.

Cheers

On Wed, Dec 24, 2014 at 6:23 PM, Devaraja Swami <devarajaswami@gmail.com>
wrote:

> Thanks for your reply, Ted. I looked into the coprocessor example you
> provided. It will definitely address my specific need. However, two aspects
> of this approach seem less than ideal to me:
> 1. Being a coprocessor service, I believe the endpoint needs to be
> pre-installed on the region servers. This is not possible in typical cases
> where the user does not have influence over the HBase installation or
> administrators.
> 2. In my use case, I already know the row key for which I need the
> specified column qualifier prefixes to be deleted. Using a scan for just
> one known row, as in the coprocessor example, appears to be a bit of an
> overkill...
>
> Overall, the coprocessor approach seems somewhat like using a hammer to
> push in a pushpin. Specifying a filter from the client side is much easier
> and more straightforward, IMHO.
>
>
> On Wed, Dec 24, 2014 at 2:01 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Have you looked at
> > hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example/BulkDeleteEndpoint.java
> > to see if it fits your need?
> >
> > Cheers
> >
> > On Wed, Dec 24, 2014 at 1:34 PM, Devaraja Swami <devarajaswami@gmail.com>
> > wrote:
> >
> > > Are there any plans for including a Filter for Delete?
> > > Currently, the only way seems to be via checkAndDelete in HTable/Table.
> > > This is helpful but does not cover all use cases.
> > >
> > > For example, I use column qualifier prefixes as a sort of poor man's
> > > 2nd level of indexing (i.e., 3 levels of indexing comprising row key -->
> > > column qualifier prefix --> column qualifier suffix). This works well for
> > > Get and Scan, since I can use a prefix column qualifier filter for the
> > > 2nd indexing level.
> > > However, I am not able to specify that an entire set of column
> > > qualifiers sharing the same prefix should be deleted, without doing a
> > > Get first to identify all the full column qualifier values with the same
> > > prefix, and then adding those qualifiers to the Delete. This is obviously
> > > highly inefficient.
> > >
> > > checkAndDelete doesn't help here since it does not support prefix tests.
> > > Moreover, I cannot just add a new column family for every unique column
> > > qualifier prefix I need in my data model. In general, using just one
> > > column family per table seems to be most efficient.
> > >
> > > I can think of other use cases where one would need to delete a lot of
> > > columns that match one of the available HBase filters, but whose exact
> > > column qualifier values are not known at deletion time at the client.
> > >
> > > All these use cases can be taken care of by allowing Delete to support a
> > > setFilter method, exactly as in the case of Get and Scan.
> > >
> >
>
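
For reference, the Get-then-Delete workaround described in the original
question can be modeled roughly as below. This is only a sketch in plain Java
under stated assumptions: the row is a hypothetical in-memory sorted map, and
PrefixDeleteSketch, matchingQualifiers, deleteQualifiers and deleteByPrefix
are made-up names for illustration, not HBase APIs. In a real client the first
step would presumably be a Get with a column-prefix filter and the second a
Delete listing each qualifier, which is why the approach costs two round trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a single row (qualifier -> value).
// Nothing here is an HBase API; it only models the two-step workaround.
final class PrefixDeleteSketch {

    // Step 1 (models the Get with a prefix filter): collect every
    // qualifier in the row that starts with the given prefix.
    static List<String> matchingQualifiers(NavigableMap<String, String> row, String prefix) {
        List<String> matches = new ArrayList<>();
        // Keys are sorted, so all keys sharing the prefix are contiguous:
        // start at the prefix and stop at the first key that does not match.
        for (String qualifier : row.tailMap(prefix, true).keySet()) {
            if (!qualifier.startsWith(prefix)) {
                break;
            }
            matches.add(qualifier);
        }
        return matches;
    }

    // Step 2 (models the Delete with explicit qualifiers): remove each
    // listed qualifier and report how many were actually present.
    static int deleteQualifiers(NavigableMap<String, String> row, List<String> qualifiers) {
        int removed = 0;
        for (String qualifier : qualifiers) {
            if (row.remove(qualifier) != null) {
                removed++;
            }
        }
        return removed;
    }

    // The whole workaround: read first, then delete what was read.
    static int deleteByPrefix(NavigableMap<String, String> row, String prefix) {
        return deleteQualifiers(row, matchingQualifiers(row, prefix));
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("idx1:a", "v1");
        row.put("idx1:b", "v2");
        row.put("idx2:a", "v3");

        int removed = deleteByPrefix(row, "idx1:");
        System.out.println("removed " + removed + ", remaining " + row.keySet());
    }
}
```

A server-side filter on Delete, as requested in the thread, would collapse the
two steps into one call; the sketch only shows what the client has to do
without it.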
