hbase-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18165) Predicate based deletion during major compactions
Date Mon, 05 Jun 2017 21:57:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037672#comment-16037672 ]

Josh Elser commented on HBASE-18165:

bq. Filters or other transformations that run at compaction time are essentially the Accumulo
Iterators idea. We've had this proposed in other contexts. I guess I have the same question
now as then, why not implement support for Accumulo-style Iterators?

The layer of abstraction that Iterators provide is probably my favorite thing about the system.
The implementation isn't without its own warts, but it certainly makes for a nice way to express
operations over "streams" of data.

bq. Would that cover this predicate-based deletion also?

We could certainly do something which looks/feels like a Java 8 streams filter. I would say that
limiting the kinds of things people can do is probably better than allowing arbitrary predicates.
Like coprocessors, Accumulo iterators suffer from people trying to use them as tools for something
they weren't meant to do.
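
To make the "streams filter with limits" idea a bit more concrete, here is a minimal sketch in plain Java. Everything in it (`Cell`, `olderThan`, `DroppingIterator`, `compact`) is a hypothetical name made up for illustration; none of it is an HBase or Accumulo API. It assumes the compaction sees cells as a simple iterator and that users may only pick from a small set of predefined predicates instead of supplying arbitrary code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical sketch only; these types are not HBase or Accumulo APIs. */
final class CompactionFilterSketch {

    /** Simplified stand-in for a cell: row, column, timestamp, value. */
    static final class Cell {
        final String row;
        final String column;
        final long timestamp;
        final String value;

        Cell(String row, String column, long timestamp, String value) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** A deliberately narrow predicate: cells in one column older than a cutoff. */
    static Predicate<Cell> olderThan(String column, long cutoffTimestamp) {
        return c -> c.column.equals(column) && c.timestamp < cutoffTimestamp;
    }

    /** Wraps a source iterator and skips every cell matching the delete predicate. */
    static final class DroppingIterator implements Iterator<Cell> {
        private final Iterator<Cell> source;
        private final Predicate<Cell> delete;
        private Cell next;

        DroppingIterator(Iterator<Cell> source, Predicate<Cell> delete) {
            this.source = source;
            this.delete = delete;
            advance();
        }

        /** Moves to the next surviving cell, or sets next to null at the end. */
        private void advance() {
            next = null;
            while (source.hasNext()) {
                Cell c = source.next();
                if (!delete.test(c)) {
                    next = c;
                    return;
                }
            }
        }

        @Override
        public boolean hasNext() {
            return next != null;
        }

        @Override
        public Cell next() {
            if (next == null) {
                throw new NoSuchElementException();
            }
            Cell out = next;
            advance();
            return out;
        }
    }

    /** Simulates a compaction: reads all input cells and returns the survivors. */
    static List<Cell> compact(List<Cell> input, Predicate<Cell> delete) {
        List<Cell> out = new ArrayList<>();
        Iterator<Cell> it = new DroppingIterator(input.iterator(), delete);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

Calling `compact(cells, olderThan("cf:a", 10))` would keep every cell except those in column `cf:a` with a timestamp below 10. Exposing only factory methods like `olderThan` is one way to limit what people can do.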

For some context, here is Accumulo's "system"-level deleting iterator (in this case, handling delete
tombstones within a row+column): https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/iterators/system/DeletingIterator.java
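
As a rough illustration of what tombstone handling over a sorted stream looks like, here is a simplified sketch. It is not a port of the linked class, and `Entry` and `dropDeleted` are hypothetical names. It assumes entries arrive sorted by row, then column, then timestamp descending, with a tombstone sorting ahead of a value at the same timestamp, so that a tombstone hides everything that follows it within the same row+column.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the Accumulo DeletingIterator. */
final class TombstoneSketch {

    /** Simplified entry; tombstone == true marks a delete marker. */
    static final class Entry {
        final String row;
        final String column;
        final long timestamp;
        final boolean tombstone;
        final String value;

        Entry(String row, String column, long timestamp, boolean tombstone, String value) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
            this.tombstone = tombstone;
            this.value = value;
        }
    }

    private static boolean sameRowAndColumn(Entry a, Entry b) {
        return a.row.equals(b.row) && a.column.equals(b.column);
    }

    /**
     * Assumes input sorted by row, column, then timestamp descending.
     * Entries newer than a tombstone appear before it and survive; the
     * tombstone and all later entries in the same row+column are dropped.
     * Dropping the tombstone itself is only safe when every file is being
     * rewritten (a full/major compaction); this sketch assumes that case.
     */
    static List<Entry> dropDeleted(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            Entry e = sorted.get(i);
            if (!e.tombstone) {
                out.add(e);
                i++;
                continue;
            }
            // Skip the tombstone, then everything it masks.
            i++;
            while (i < sorted.size() && sameRowAndColumn(sorted.get(i), e)) {
                i++;
            }
        }
        return out;
    }
}
```

The point of the sketch is that the delete logic is just one more stage stacked on the stream, which is why a predicate-based delete could plausibly be expressed the same way.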

> Predicate based deletion during major compactions
> -------------------------------------------------
>                 Key: HBASE-18165
>                 URL: https://issues.apache.org/jira/browse/HBASE-18165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
> In many cases it is expensive to place a delete per version, column, or family.
> HBase should have a way to specify a predicate and remove all Cells matching the predicate
> during the next compactions (major and minor).
> Nothing more concrete. The tricky part would be to know when it is safe to remove the
> predicate, i.e. when we can be sure that all Cells matching the predicate actually have been
> removed.
> Could potentially use HBASE-12859 for that.
