hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-18165) Predicate based deletion during major compactions
Date Mon, 05 Jun 2017 21:50:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037663#comment-16037663 ]

Andrew Purtell edited comment on HBASE-18165 at 6/5/17 9:49 PM:
----------------------------------------------------------------

[~davelatham] 
https://accumulo.apache.org/1.7/accumulo_user_manual.html#_iterator_design

See sections 7.5.1 (Filter) and 7.8 (Compaction-time Iterators). You'd be able to filter out
cells by a key predicate. The doc enumerates some limitations that would hold for an
implementation in HBase too, e.g.

{quote}
Iterators will not necessarily see all of the Key-Value pairs in every invocation. Because
compactions often do not rewrite all files (only a subset of them), it is important that the
logic take this into consideration.
[...]
a Combiner that runs over data during compactions might not see all of the values for
a given Key. The Combiner must recognize this and not perform any function that would be incorrect
due to the missing values.
{quote}
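
To make the limitation concrete, here is a minimal Python sketch of a key-predicate filter applied at compaction time. It is only an illustration: `Cell`, `compact`, and `is_expired` are hypothetical names, not HBase or Accumulo APIs, and store files are modeled as plain lists. It shows that a compaction over a subset of files leaves matching cells behind in the files it did not rewrite, and that a combiner over the same subset would see only part of the values.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored cell; not an HBase or Accumulo class."""
    row: str
    qualifier: str
    timestamp: int
    value: int


def compact(files: List[List[Cell]], drop: Callable[[Cell], bool]) -> List[Cell]:
    """Merge the selected files into one, omitting cells that match `drop`."""
    kept = [cell for f in files for cell in f if not drop(cell)]
    return sorted(kept, key=lambda c: (c.row, c.qualifier, -c.timestamp))


def is_expired(cell: Cell) -> bool:
    # Illustrative predicate: anything with a timestamp below 100.
    return cell.timestamp < 100


file_a = [Cell("r1", "q", 50, 1)]
file_b = [Cell("r1", "q", 150, 2)]
file_c = [Cell("r1", "q", 60, 4)]

# A minor compaction selects only file_a and file_b; file_c is never read.
minor = compact([file_a, file_b], is_expired)
store = [minor, file_c]
still_matching = [c for f in store for c in f if is_expired(c)]

# A combiner running over the same selection would see only a partial sum.
partial_sum = sum(c.value for f in (file_a, file_b) for c in f)
full_sum = sum(c.value for f in (file_a, file_b, file_c) for c in f)

# Only a compaction that rewrites every file removes all matching cells.
major = compact(store, is_expired)
```

After the minor compaction, `still_matching` still holds the cell from `file_c`, so the predicate cannot be considered fully applied until every file has been rewritten.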


was (Author: apurtell):
[~davelatham] 
https://accumulo.apache.org/1.7/accumulo_user_manual.html#_iterator_design

See section 7.5.1 (Filter) and 7.8 (Compaction-time Iterators). You'd be able to filter out
by key-predicate. There are some limitations enumerated in the doc that would hold for an
implementation in side of HBase too, e.g.

{quote}
Iterators will not necessarily see all of the Key-Value pairs in every invocation. Because
compactions often do not rewrite all files (only a subset of them), it is important that the
logic take this into consideration.
[...]
a Combiner that runs over data during compactions might not see all of the values for
a given Key. The Combiner must recognize this and not perform any function that would be incorrect
due to the missing values.
{quote}

> Predicate based deletion during major compactions
> -------------------------------------------------
>
>                 Key: HBASE-18165
>                 URL: https://issues.apache.org/jira/browse/HBASE-18165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>
> In many cases it is expensive to place a delete per version, column, or family.
> HBase should have a way to specify a predicate and remove all Cells matching the predicate
> during the next compactions (major and minor).
> Nothing more concrete. The tricky part would be to know when it is safe to remove the
> predicate, i.e. when we can be sure that all Cells matching the predicate have actually been
> removed.
> Could potentially use HBASE-12859 for that.
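
One possible way to reason about the "safe to remove" question, sketched in Python under loud assumptions: the issue itself proposes nothing concrete, so `PendingPredicate` and its methods are hypothetical, file names are made up, and the sketch ignores the memstore and any data written after the predicate was registered. The idea is simply to remember which files existed when the predicate was set and to retire it once every one of them has been rewritten by a compaction that applied it.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PendingPredicate:
    """Hypothetical bookkeeping record; not part of any HBase API."""
    name: str
    # Files present at registration that have not yet been rewritten
    # by a compaction applying this predicate.
    unfiltered_files: Set[str] = field(default_factory=set)

    def on_compaction(self, input_files: Set[str]) -> None:
        """Record that a compaction applying the predicate rewrote `input_files`."""
        self.unfiltered_files -= input_files

    def safe_to_remove(self) -> bool:
        # Safe only once no pre-existing file remains unfiltered.
        return not self.unfiltered_files


pred = PendingPredicate("drop-old-cells", {"hfile-a", "hfile-b", "hfile-c"})

# A minor compaction rewrites two of the three files.
pred.on_compaction({"hfile-a", "hfile-b"})
after_minor = pred.safe_to_remove()

# A later compaction picks up the remaining original file.
pred.on_compaction({"hfile-c", "hfile-ab"})
after_all = pred.safe_to_remove()
```

Under these assumptions the predicate stays in force after the minor compaction and becomes removable only once the last original file has been rewritten.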



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
