hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18165) Predicate based deletion during major compactions
Date Mon, 05 Jun 2017 21:49:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037663#comment-16037663 ]

Andrew Purtell commented on HBASE-18165:


See sections 7.5.1 (Filter) and 7.8 (Compaction-time Iterators). You'd be able to filter out
cells by key predicate. There are some limitations enumerated in the doc that would hold for an
implementation inside HBase too, e.g.

Iterators will not necessarily see all of the Key-Value pairs in every invocation. Because
compactions often do not rewrite all files (only a subset of them), the logic must take this
into consideration.
For example, a Combiner that runs over data during compactions might not see all of the values
for a given Key. The Combiner must recognize this and not perform any function that would be
incorrect due to the missing values.
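
To make the limitation concrete, here is a minimal sketch of a compaction that drops cells by
key predicate. It is a toy model, not HBase or Accumulo code: Cell, compact, and drop_tmp_rows
are hypothetical names made up for illustration, and a "file" is just a list of cells. It shows
that a minor compaction only applies the predicate to the files it rewrites, so matching cells
in the untouched files survive until a later compaction picks those files up.

```python
from typing import Callable, List, NamedTuple


class Cell(NamedTuple):
    # Hypothetical, simplified stand-in for a key-value cell.
    row: str
    column: str
    timestamp: int
    value: int


def compact(files: List[List[Cell]], drop: Callable[[Cell], bool]) -> List[Cell]:
    """Merge the selected files into one sorted file, omitting cells that match `drop`."""
    kept = [cell for f in files for cell in f if not drop(cell)]
    # Sort by row, then column, then newest timestamp first.
    return sorted(kept, key=lambda c: (c.row, c.column, -c.timestamp))


def drop_tmp_rows(cell: Cell) -> bool:
    # Example key predicate (an assumption): delete every row whose key starts with "tmp-".
    return cell.row.startswith("tmp-")


file_a = [Cell("tmp-1", "f:q", 10, 1), Cell("user-1", "f:q", 10, 5)]
file_b = [Cell("tmp-2", "f:q", 20, 2)]
file_c = [Cell("tmp-3", "f:q", 30, 3), Cell("user-1", "f:q", 30, 7)]

# Minor compaction: only file_a and file_b are selected; file_c is not rewritten.
minor_output = compact([file_a, file_b], drop_tmp_rows)
store_after_minor = [minor_output, file_c]

# Cells matching the predicate that are still on disk after the minor compaction.
still_present = [c.row for f in store_after_minor for c in f if drop_tmp_rows(c)]

# Major compaction: every file is rewritten, so the predicate sees every cell.
major_output = compact(store_after_minor, drop_tmp_rows)
```

After the minor compaction, still_present holds the one matching row from file_c; only after
the major compaction is the store free of matching cells. The same reasoning is why a
compaction-time Combiner has to be safe on partial input.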

> Predicate based deletion during major compactions
> -------------------------------------------------
>                 Key: HBASE-18165
>                 URL: https://issues.apache.org/jira/browse/HBASE-18165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
> In many cases it is expensive to place a delete per version, column, or family.
> HBase should have a way to specify a predicate and remove all Cells matching the predicate
> during the next compactions (major and minor).
> Nothing more concrete. The tricky part would be to know when it is safe to remove the
> predicate, i.e. when we can be sure that all Cells matching the predicate actually have been
> removed.
> Could potentially use HBASE-12859 for that.
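
One possible way to reason about when a predicate can be retired, sketched as a toy model. This
is an assumption for illustration, not a design from the issue: PendingPredicate and its methods
are hypothetical names, files are identified by plain strings, and the sketch ignores the
memstore and any cells written after the predicate was registered. The idea is to remember
which files existed at registration time and to retire the predicate once every one of them
has been rewritten by a compaction that applied it.

```python
from typing import Iterable, Set


class PendingPredicate:
    """Tracks which files have not yet been rewritten with the predicate applied."""

    def __init__(self, name: str, files_at_registration: Iterable[str]) -> None:
        self.name = name
        # Files that may still contain cells matching the predicate.
        self.unrewritten: Set[str] = set(files_at_registration)

    def on_compaction(self, input_files: Iterable[str]) -> None:
        """Record that a compaction applying this predicate rewrote `input_files`."""
        self.unrewritten -= set(input_files)

    def safe_to_remove(self) -> bool:
        # Safe only when no file from registration time remains untouched.
        return not self.unrewritten


predicate = PendingPredicate("drop-tmp-rows", ["file_a", "file_b", "file_c"])

# A minor compaction rewrites two of the three files.
predicate.on_compaction(["file_a", "file_b"])
safe_after_minor = predicate.safe_to_remove()

# A later compaction rewrites the remaining original file plus the earlier output.
predicate.on_compaction(["file_c", "file_ab"])
safe_after_second = predicate.safe_to_remove()
```

Here safe_after_minor is False and safe_after_second is True. A major compaction that rewrites
every file clears the pending set in one step, which is why knowing that a major compaction
has completed would be enough to retire the predicate.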

This message was sent by Atlassian JIRA
