accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-2232) Combiners can cause deleted data to come back
Date Tue, 01 Sep 2015 17:32:46 GMT


Keith Turner commented on ACCUMULO-2232:

 I was trying to think of something that could be done for 1.6.4 to improve this situation.
 One thing I thought of is doing the following as a subtask.
 * Log an error if a combiner sees a delete marker (and the option suggested below is not configured).
 Would use some static state to ensure the error message is not logged too often.
 * Add a combiner option to run only on a full major compaction (gives users a possible
 workaround for 1.6.4)

If I hear no objections I'll do this in a few days for 1.6.4
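
A minimal sketch of the two ideas above, written as a standalone Python model rather than against the Accumulo API. The class name `GuardedCombiner`, the `full_majc_only` option, and both method names are hypothetical, and logging exactly once per process is an assumed simplification of "not logged too often".

```python
import logging

log = logging.getLogger("combiner_sketch")


class GuardedCombiner:
    """Toy model of a combiner with a delete-marker warning and a
    'full major compaction only' option (all names are hypothetical)."""

    # Class-level ("static") state shared by every instance, so the
    # error is emitted at most once per process.
    _delete_error_logged = False

    def __init__(self, full_majc_only: bool = False) -> None:
        self.full_majc_only = full_majc_only

    def should_combine(self, full_majc: bool) -> bool:
        # With the option set, combine only during a full major compaction.
        return full_majc or not self.full_majc_only

    def on_delete_marker(self) -> bool:
        """Called when a delete marker is seen. Returns True if an error was logged."""
        if self.full_majc_only or GuardedCombiner._delete_error_logged:
            return False
        GuardedCombiner._delete_error_logged = True
        log.error("Combiner saw a delete marker; deleted data may come back")
        return True
```

With the option left unset, only the first delete marker produces a log entry. With it set, the combiner stays idle during partial compactions and never logs.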

> Combiners can cause deleted data to come back
> ---------------------------------------------
>                 Key: ACCUMULO-2232
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>            Reporter: John Vines
> The case-
> 3 files with-
> * 1 with a key, k, with timestamp 0, value 3
> * 1 with a delete of k with timestamp 1
> * 1 with k with timestamp 2, value 2
> The column of k has a summing combiner set on it. The issue here is that, depending on
how the major compactions play out, differing values will result. If all 3 files compact,
the correct value of 2 will result. However, if 1 & 3 compact first, they will aggregate
to 5. The delete will then fall after the combined value, so the result 5 persists.
> First and foremost, this should be documented. I think to remedy this, combiners should
only be used on full MajC, not on partial ones. This may necessitate a special flag or a new
combiner that implements the proper semantics.
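
The three-file case from the description can be reproduced with a small simulation. This is a sketch under stated assumptions, not Accumulo code. `Entry` and `compact` are hypothetical names. A delete marker is modelled as an entry whose value is `None`, and it hides every entry for the key with an equal or older timestamp. The combined entry is assumed to take the newest timestamp of its inputs, and delete markers are dropped only on a full compaction.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    timestamp: int
    value: Optional[int]  # None marks a delete


def compact(files: List[List[Entry]], full: bool) -> List[Entry]:
    """Merge the files for a single key, applying deletes and then a summing combiner."""
    # Newest first; at equal timestamps a delete sorts before a value.
    entries = sorted(
        (e for f in files for e in f),
        key=lambda e: (-e.timestamp, e.value is not None),
    )
    visible = []
    newest_delete = None
    for e in entries:
        if e.value is None:
            if newest_delete is None:
                newest_delete = e
        elif newest_delete is None:
            visible.append(e)  # not hidden by any delete seen so far
    out = []
    if visible:
        # Summing combiner: one entry carrying the newest timestamp.
        out.append(Entry(visible[0].timestamp, sum(e.value for e in visible)))
    if newest_delete is not None and not full:
        out.append(newest_delete)  # partial compactions must keep deletes
    return out


f1 = [Entry(0, 3)]     # k, timestamp 0, value 3
f2 = [Entry(1, None)]  # delete of k, timestamp 1
f3 = [Entry(2, 2)]     # k, timestamp 2, value 2

all_at_once = compact([f1, f2, f3], full=True)
one_and_three_first = compact([compact([f1, f3], full=False), f2], full=True)
print(all_at_once)          # [Entry(timestamp=2, value=2)]
print(one_and_three_first)  # [Entry(timestamp=2, value=5)]
```

In the second ordering the combined entry carries timestamp 2, so the delete at timestamp 1 no longer hides the value 3 that was folded into it.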

This message was sent by Atlassian JIRA
