accumulo-dev mailing list archives

From Adam Fuchs <>
Subject Re: exception thrown during minor compaction
Date Tue, 01 Sep 2015 14:37:02 GMT
In your case, there shouldn't be a delete marker unless you're explicitly
writing one.

The tricky thing about deletes in a summing combiner is that sums and
deletes together are not commutative, and combiners require associativity
and commutativity. If I have three operations: add 1 to x, delete x, and
then add 1 to x, I might reasonably expect the result of performing these
operations in order to be x = 1. However, if I reorder the first add and
the delete operations, I could instead get x = 2. When using a combiner,
this could happen when the first and last entries are in two files that go
through a non-full major compaction, and the second entry is in a third
file that is not included. For this reason, we shouldn't have general
support for deletes in a SummingCombiner (but maybe we should have better
documentation).
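
To make the reordering concrete, here is a small sketch of the scenario
above. It is a simplified in-memory model, not Accumulo code: the
`compact` and `read` helpers and the tuple layout are assumptions made
for illustration. It assumes a delete marker hides adds with an older
timestamp, and that a combined entry carries the newest timestamp of its
inputs.

```python
# Each entry for key x is (timestamp, kind, value); kind is "add" or "delete".

def compact(entries):
    """Model one compaction over the given entries (hypothetical helper).

    Adds newer than the newest delete marker are summed into a single entry
    stamped with the newest add timestamp. The delete marker is kept, as it
    would be in a non-full compaction.
    """
    newest_delete = max((ts for ts, kind, _ in entries if kind == "delete"),
                        default=None)
    live = [(ts, v) for ts, kind, v in entries
            if kind == "add" and (newest_delete is None or ts > newest_delete)]
    out = []
    if newest_delete is not None:
        out.append((newest_delete, "delete", 0))
    if live:
        out.append((max(ts for ts, _ in live), "add", sum(v for _, v in live)))
    return out

def read(entries):
    """Return the value a scan would see for x, or None if x is deleted."""
    adds = [v for _, kind, v in compact(entries) if kind == "add"]
    return adds[0] if adds else None

first_add = (1, "add", 1)
delete = (2, "delete", 0)
last_add = (3, "add", 1)

# All three entries seen together: the delete hides the first add.
in_order = read([first_add, delete, last_add])      # 1

# Non-full compaction of the two adds only, then read with the delete file.
partial = compact([first_add, last_add])            # [(3, "add", 2)]
reordered = read(partial + [delete])                # 2
```

The partial compaction folds the older add into an entry stamped newer
than the delete, so the delete can no longer hide it.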

There are a couple of alternative implementations to get delete behavior:
1. Use a read-write loop to negate the current value of a key. Read the
current value and write back the same key with the negative of that value.
Make sure to batch this for performance.
2. Write a different iterator that supports deletes, but only operates on
minor compaction and full major compaction scopes.
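
As a rough sketch of option 1, the loop below reads each key's current
sum and queues the negated value, then writes all the queued updates in
one batch. `SummedTable` is a hypothetical in-memory stand-in for a
table with a summing combiner, not the Accumulo client API, and
`negate_keys` is a name made up for this example.

```python
from collections import defaultdict

class SummedTable:
    """Hypothetical stand-in for a table configured with a summing combiner."""

    def __init__(self):
        self._entries = defaultdict(list)

    def write_batch(self, mutations):
        # Apply a batch of (key, value) writes in one call.
        for key, value in mutations:
            self._entries[key].append(value)

    def read(self, key):
        # A scan sees the sum of everything written to the key.
        values = self._entries.get(key)
        return sum(values) if values else None

def negate_keys(table, keys):
    """Read each key and queue the negated sum, then write one batch."""
    batch = []
    for key in keys:
        current = table.read(key)
        if current:  # skip missing keys and keys already at zero
            batch.append((key, -current))
    table.write_batch(batch)

table = SummedTable()
table.write_batch([("x", 1), ("x", 1), ("y", 5)])
negate_keys(table, ["x", "y", "z"])
```

Note that in this model the key ends up with a sum of 0 rather than
disappearing, and a write that lands between the read and the negating
write would survive the "delete".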

There may also be a project that the Accumulo dev community would be
interested in, which would be to add a compaction strategy that makes sure
compactions always include a contiguous range of timestamps. I think this
would remove the requirement for commutativity in iterator operations and
wouldn't introduce performance problems in most cases.
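
One possible reading of "a contiguous range of timestamps" is that no
file left out of a compaction may hold timestamps inside the range
covered by the selected files. The check below sketches that reading
only; the `(name, min_ts, max_ts)` file summary and the function name
are assumptions, not an existing Accumulo interface.

```python
def is_contiguous_selection(selected, all_files):
    """Return True if no excluded file overlaps the selected timestamp range.

    Each file is a (name, min_ts, max_ts) tuple (hypothetical summary).
    """
    if not selected:
        return True
    low = min(min_ts for _, min_ts, _ in selected)
    high = max(max_ts for _, _, max_ts in selected)
    chosen = {name for name, _, _ in selected}
    return not any(
        name not in chosen and max_ts >= low and min_ts <= high
        for name, min_ts, max_ts in all_files
    )

file_a = ("A", 1, 1)   # first add
file_b = ("B", 2, 2)   # delete
file_c = ("C", 3, 3)   # last add
files = [file_a, file_b, file_c]

skips_delete = is_contiguous_selection([file_a, file_c], files)   # False
adjacent = is_contiguous_selection([file_a, file_b], files)       # True
```

Under this reading, the problematic compaction from the example above
(first and last entries without the delete in between) would be
rejected.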


On Tue, Sep 1, 2015 at 9:13 AM, z11373 <> wrote:

> Thanks Eric and Josh.
> There shouldn't be delete marker because my code doesn't perform any delete
> operation, right?
> Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
> then I'd think that's bug :-)
> Thanks,
> Z
