accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-4398) Possible for client to see TableNotFoundException adding splits on a newly created table
Date Wed, 03 Aug 2016 17:57:20 GMT
Josh Elser created ACCUMULO-4398:
------------------------------------

             Summary: Possible for client to see TableNotFoundException adding splits on a newly created table
                 Key: ACCUMULO-4398
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4398
             Project: Accumulo
          Issue Type: Bug
          Components: client, zookeeper
            Reporter: Josh Elser


Just came across a really odd scenario. I believe that it's a race condition in the client
that stems from our beloved {{ZooCache}}.

This was observed via a test failure in {{LogicalTimeIT}}:

{noformat}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 29.249 sec <<< FAILURE! - in org.apache.accumulo.test.functional.LogicalTimeIT
run(org.apache.accumulo.test.functional.LogicalTimeIT)  Time elapsed: 29.037 sec  <<< ERROR!
org.apache.accumulo.core.client.TableNotFoundException: Table LogicalTimeIT_run06 does not exist
	at org.apache.accumulo.core.client.impl.Tables._getTableId(Tables.java:117)
	at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:102)
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.addSplits(TableOperationsImpl.java:374)
	at org.apache.accumulo.test.functional.LogicalTimeIT.runMergeTest(LogicalTimeIT.java:81)
	at org.apache.accumulo.test.functional.LogicalTimeIT.run(LogicalTimeIT.java:56)
{noformat}

Ultimately, the failing test code boils down to:

{code}
    conn.tableOperations().create(table, new NewTableConfiguration().setTimeType(TimeType.LOGICAL));
    TreeSet<Text> splitSet = new TreeSet<Text>();
    for (String split : splits) {
      splitSet.add(new Text(split));
    }
    conn.tableOperations().addSplits(table, splitSet);
{code}

The important piece to remember is that a ZooKeeper client, when a watcher is set, will eventually
get all updates from that watcher in the order in which they occurred. Also, LogicalTimeIT
repeatedly runs the same test over tables of varying characteristics. I think these are the two
important points.

Consider the following:

# Client creates a table T1
# ZooCache is cleared after FATE op finishes
# Watcher is set on ZTABLES in ZK
# Client interacts with T1
# Client creates T2
# ZooCache is cleared after FATE op finishes
# Watcher fires on ZTABLES node in ZK (CHILDREN_CHANGED) and repopulates the child list on
the ZTABLES node
# Client makes call to split T2
# Code checks whether the table exists, but by then the childrenCache in ZooCache has been
repopulated, which causes the client to think the table doesn't exist
# Eventually, the watcher fires again, ZTABLES is updated, and everything is fine.

I believe this is a plausible scenario, though perhaps an unlikely one.
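
To make the sequence above concrete, here is a toy replay of the ten steps in plain Java. It is only a sketch of the suspected interleaving, not ZooCache's actual code: {{ChildListCache}}, {{repopulate}}, {{tableExists}} and the {{zookeeper}} set are hypothetical stand-ins, and the sketch simply assumes that the late watcher callback in step 7 installs a child list that predates T2. How such a stale list would arise in the real client is exactly the open question.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy replay of the suspected race. None of these types are Accumulo or ZooKeeper classes. */
public class StaleChildListSketch {

  /** Hypothetical stand-in for a client-side cache of the ZTABLES child list. */
  static class ChildListCache {
    private Set<String> children; // null means nothing is cached

    void clear() {
      children = null;
    }

    /** Models a watcher callback installing a snapshot of the child list. */
    void repopulate(Set<String> snapshot) {
      children = new TreeSet<>(snapshot);
    }

    /** Answers from the cache when populated, otherwise reads through and caches. */
    boolean tableExists(String table, Set<String> zookeeper) {
      if (children == null) {
        children = new TreeSet<>(zookeeper);
      }
      return children.contains(table);
    }
  }

  /** Replays steps 1-10 and returns what the client sees at step 9 and after step 10. */
  static boolean[] replay() {
    Set<String> zookeeper = new TreeSet<>(); // stand-in for the children of ZTABLES
    ChildListCache cache = new ChildListCache();

    zookeeper.add("T1");                          // 1. client creates T1
    cache.clear();                                // 2. cache cleared after FATE op
    cache.tableExists("T1", zookeeper);           // 3-4. child list cached, client uses T1
    Set<String> stale = new TreeSet<>(zookeeper); // ASSUMPTION: snapshot taken before T2
    zookeeper.add("T2");                          // 5. client creates T2
    cache.clear();                                // 6. cache cleared after FATE op
    cache.repopulate(stale);                      // 7. late watcher installs the old list
    boolean atSplit = cache.tableExists("T2", zookeeper); // 8-9. existence check for addSplits
    cache.repopulate(zookeeper);                  // 10. next watcher event brings fresh list
    boolean afterwards = cache.tableExists("T2", zookeeper);
    return new boolean[] {atSplit, afterwards};
  }

  public static void main(String[] args) {
    boolean[] seen = replay();
    System.out.println("T2 visible at addSplits: " + seen[0]);
    System.out.println("T2 visible after next update: " + seen[1]);
  }
}
```

In this model the existence check in step 9 is answered from the stale cached list, so T2 looks missing even though it exists, and the same check succeeds once the next update arrives. That matches the transient TableNotFoundException seen in the test.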



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
