Return-Path:
X-Original-To: apmail-accumulo-notifications-archive@minotaur.apache.org
Delivered-To: apmail-accumulo-notifications-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id BA18F18B98 for ; Tue, 22 Sep 2015 04:39:04 +0000 (UTC)
Received: (qmail 98598 invoked by uid 500); 22 Sep 2015 04:39:04 -0000
Delivered-To: apmail-accumulo-notifications-archive@accumulo.apache.org
Received: (qmail 98554 invoked by uid 500); 22 Sep 2015 04:39:04 -0000
Mailing-List: contact notifications-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: jira@apache.org
Delivered-To: mailing list notifications@accumulo.apache.org
Received: (qmail 98540 invoked by uid 99); 22 Sep 2015 04:39:04 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Sep 2015 04:39:04 +0000
Date: Tue, 22 Sep 2015 04:39:04 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: notifications@accumulo.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (ACCUMULO-2232) Combiners can cause deleted data to come back
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/ACCUMULO-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901923#comment-14901923 ]

ASF GitHub Bot commented on ACCUMULO-2232:
------------------------------------------

Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/47#discussion_r40052565

    --- Diff: core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java ---
    @@ -786,4 +816,107 @@ public void testAdds() {
         assertEquals(LongCombiner.safeAdd(Long.MAX_VALUE - 5, 5), Long.MAX_VALUE);
       }

    +  private TreeMap readAll(SortedKeyValueIterator combiner) throws Exception {
    +    TreeMap ret = new TreeMap();
    +
    +    combiner.seek(new Range(), EMPTY_COL_FAMS, false);
    +
    +    while (combiner.hasTop()) {
    +      ret.put(new Key(combiner.getTopKey()), new Value(combiner.getTopValue()));
    +      combiner.next();
    +    }
    +
    +    return ret;
    +  }
    +
    +  private void runDeleteHandlingTest(TreeMap input, TreeMap expected, DeleteHandlingAction dha, IteratorEnvironment env)
    +      throws Exception {
    +    runDeleteHandlingTest(input, expected, dha, env, null);
    +  }
    +
    +  private void runDeleteHandlingTest(TreeMap input, TreeMap expected, DeleteHandlingAction dha, IteratorEnvironment env,
    +      String expectedLog) throws Exception {
    +    boolean deepCopy = expected == null;
    +
    +    StringWriter writer = new StringWriter();
    +    WriterAppender appender = new WriterAppender(new PatternLayout("%p, %m%n"), writer);
    +    Logger logger = Logger.getLogger(Combiner.class);
    +    boolean additivity = logger.getAdditivity();
    +    try {
    +      logger.addAppender(appender);
    +      logger.setAdditivity(false);
    +
    +      Combiner ai = new SummingCombiner();
    +
    +      IteratorSetting is = new IteratorSetting(1, SummingCombiner.class);
    +      SummingCombiner.setEncodingType(is, LongCombiner.StringEncoder.class);
    +      Combiner.setColumns(is, Collections.singletonList(new IteratorSetting.Column("cf001")));
    +      if (dha != null) {
    +        Combiner.setDeleteHandlingAction(is, dha);
    +      }
    +
    +      ai.init(new SortedMapIterator(input), is.getOptions(), env);
    +
    +      if (deepCopy)
    +        assertEquals(expected, readAll(ai.deepCopy(env)));
    +      assertEquals(expected, readAll(ai));
    +
    +    } finally {
    +      logger.removeAppender(appender);
    --- End diff --

    Glad to see the try/finally logger reset.

> Combiners can cause deleted data to come back
> ---------------------------------------------
>
>                 Key: ACCUMULO-2232
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2232
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>            Reporter: John Vines
>
> The case-
> 3 files with-
> * 1 with a key, k, with timestamp 0, value 3
> * 1 with a delete of k with timestamp 1
> * 1 with k with timestamp 2, value 2
> The column of k has a summing combiner set on it. The issue here is that depending on how the major compactions play out, differing values will result. If all 3 files compact, the correct value of 2 will result. However, if 1 & 3 compact first, they will aggregate to 5. The delete will then fall after the combined value, so the result of 5 persists.
> First and foremost, this should be documented. I think to remedy this, combiners should only be used on full MajC, not on partial ones. This may necessitate a special flag or a new combiner that implements the proper semantics.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
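
For readers following the thread, the compaction scenario from the issue description can be reproduced with a small toy model. This is only a sketch under stated assumptions: `CombinerDeleteSketch`, `Entry`, `compact`, `allAtOnce` and `partialFirst` are hypothetical names invented for illustration and are not Accumulo classes. The model assumes that a delete hides every version of k whose timestamp is not newer than the delete, that the summing combiner emits its result at the newest timestamp it saw, and that only a full compaction drops delete markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the scenario in the issue; none of these types are Accumulo classes.
final class CombinerDeleteSketch {

    // One version of key k: a timestamp and either a value or a delete marker.
    static final class Entry {
        final long ts;
        final long value;
        final boolean delete;

        Entry(long ts, long value, boolean delete) {
            this.ts = ts;
            this.value = value;
            this.delete = delete;
        }
    }

    static Entry put(long ts, long value) { return new Entry(ts, value, false); }
    static Entry delete(long ts) { return new Entry(ts, 0, true); }

    // Compact the given versions of k with a summing combiner applied.
    // Assumption: a delete hides every version that is not newer than it,
    // and only a full compaction may drop the delete marker itself.
    static List<Entry> compact(List<Entry> input, boolean full) {
        boolean sawDelete = false;
        long newestDelete = Long.MIN_VALUE;
        for (Entry e : input) {
            if (e.delete) {
                sawDelete = true;
                newestDelete = Math.max(newestDelete, e.ts);
            }
        }

        boolean sawValue = false;
        long sum = 0;
        long newestTs = Long.MIN_VALUE;
        for (Entry e : input) {
            if (!e.delete && (!sawDelete || e.ts > newestDelete)) {
                sawValue = true;
                sum += e.value;
                newestTs = Math.max(newestTs, e.ts);
            }
        }

        List<Entry> out = new ArrayList<>();
        if (sawValue) out.add(put(newestTs, sum));              // combined value keeps the newest timestamp
        if (sawDelete && !full) out.add(delete(newestDelete));  // partial compactions keep the delete
        return out;
    }

    // All three files compact together in one full compaction.
    static long allAtOnce() {
        List<Entry> all = Arrays.asList(put(0, 3), delete(1), put(2, 2));
        return compact(all, true).get(0).value;
    }

    // Files 1 and 3 compact first; the file holding the delete joins later.
    static long partialFirst() {
        List<Entry> partial = compact(Arrays.asList(put(0, 3), put(2, 2)), false);
        List<Entry> rest = new ArrayList<>(partial);
        rest.add(delete(1));
        return compact(rest, true).get(0).value;
    }

    public static void main(String[] args) {
        System.out.println("all three files at once: " + allAtOnce());
        System.out.println("files 1 and 3 first:     " + partialFirst());
    }
}
```

In this model the two compaction orders yield 2 and 5 respectively, matching the values in the report: once the partial compaction folds the timestamp 0 value into an entry at timestamp 2, the delete at timestamp 1 no longer covers it.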