Date: Thu, 4 Aug 2016 17:17:20 +0000 (UTC)
From: "Christopher Tubbs (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-4398) Possible for client to see TableNotFoundException adding splits on a newly created table

[ https://issues.apache.org/jira/browse/ACCUMULO-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408157#comment-15408157 ]

Christopher Tubbs commented on ACCUMULO-4398:
---------------------------------------------

In
some cases, though I'm not sure all, when a server receives a table operation and the table doesn't exist, it will clear its ZooCache and check again. However, it looks like this isn't good enough, as it may still get old data. It looks like what we should be doing is a sync() after we clear the cache: https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#N1043A

> Possible for client to see TableNotFoundException adding splits on a newly created table
> ----------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-4398
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4398
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, zookeeper
>   Affects Versions: 1.7.1
>            Reporter: Josh Elser
>
> Just came across a really odd scenario. I believe that it's a race condition in the client that stems from our beloved {{ZooCache}}.
> This was observed via a test failure in {{LogicalTimeIT}}:
> {noformat}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 29.249 sec <<< FAILURE! - in org.apache.accumulo.test.functional.LogicalTimeIT
> run(org.apache.accumulo.test.functional.LogicalTimeIT)  Time elapsed: 29.037 sec  <<< ERROR!
> org.apache.accumulo.core.client.TableNotFoundException: Table LogicalTimeIT_run06 does not exist
> 	at org.apache.accumulo.core.client.impl.Tables._getTableId(Tables.java:117)
> 	at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:102)
> 	at org.apache.accumulo.core.client.impl.TableOperationsImpl.addSplits(TableOperationsImpl.java:374)
> 	at org.apache.accumulo.test.functional.LogicalTimeIT.runMergeTest(LogicalTimeIT.java:81)
> 	at org.apache.accumulo.test.functional.LogicalTimeIT.run(LogicalTimeIT.java:56)
> {noformat}
> Ultimately:
> {code}
> conn.tableOperations().create(table, new NewTableConfiguration().setTimeType(TimeType.LOGICAL));
> TreeSet<Text> splitSet = new TreeSet<Text>();
> for (String split : splits) {
>   splitSet.add(new Text(split));
> }
> conn.tableOperations().addSplits(table, splitSet);
> {code}
> The important piece to remember is that a ZooKeeper client, when a watcher is set, will eventually get all updates from that watcher in the order in which they occurred. LogicalTimeIT is repeatedly running the same test over tables of varying characteristics. I think these are the important points.
> Consider the following:
> # Client creates a table T1
> # ZooCache is cleared after FATE op finishes
> # Watcher is set on ZTABLES in ZK
> # Client interacts with T1
> # Client creates T2
> # ZooCache is cleared after FATE op finishes
> # Watcher fires on ZTABLES node in ZK (CHILDREN_CHANGED) and repopulates the child list on the ZTABLES node
> # Client makes call to split T2
> # Code will check if the table exists, but the childrenCache will be repopulated in ZooCache, which will cause the client to think the table doesn't exist
> # Eventually, the watcher would fire and ZTABLES would be updated and everything is fine.
> I believe this is a plausible scenario, though perhaps unlikely.
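
To illustrate the suggestion in the comment at the top of this message (a sync() after clearing the cache), here is a minimal toy model in Java. It is not Accumulo or ZooKeeper code: the classes Follower and TableCache and all of their methods are hypothetical stand-ins. The sync in this sketch is synchronous, whereas ZooKeeper's real sync() is asynchronous and completes through a callback, so actual code would have to wait for that callback before reading.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: this is NOT Accumulo or ZooKeeper code, and every name is hypothetical.
final class StaleReadSketch {

  // A server the client is connected to; its view may lag behind the leader's.
  static final class Follower {
    private final Set<String> leaderChildren;
    private Set<String> view = new HashSet<>();

    Follower(Set<String> leaderChildren) {
      this.leaderChildren = leaderChildren;
    }

    // Stand-in for a sync: bring this server's view up to date with the leader.
    void sync() {
      view = new HashSet<>(leaderChildren);
    }

    // Reads are served from the local, possibly stale, view.
    Set<String> getChildren() {
      return new HashSet<>(view);
    }
  }

  // Stand-in for a client-side cache of the table list.
  static final class TableCache {
    private final Follower follower;
    private Set<String> children; // null means nothing is cached

    TableCache(Follower follower) {
      this.follower = follower;
    }

    void clear() {
      children = null;
    }

    boolean exists(String table) {
      if (children == null) {
        children = follower.getChildren();
      }
      return children.contains(table);
    }

    // Clearing alone re-reads the same lagging view.
    boolean existsAfterClear(String table) {
      clear();
      return exists(table);
    }

    // Clearing, then syncing, then reading sees the leader's latest state.
    boolean existsAfterClearAndSync(String table) {
      clear();
      follower.sync();
      return exists(table);
    }
  }

  public static void main(String[] args) {
    Set<String> leader = new HashSet<>();
    Follower follower = new Follower(leader);
    TableCache cache = new TableCache(follower);

    leader.add("T2"); // table created; the follower has not caught up yet

    System.out.println("clear only:     " + cache.existsAfterClear("T2"));
    System.out.println("clear and sync: " + cache.existsAfterClearAndSync("T2"));
  }
}
```

In this model the clear-only check reports false for T2 and the clear-and-sync check reports true. That matches the concern raised in the comment: clearing the cache only discards the client's copy and does nothing to make the next read fresher than the server it happens to hit.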