accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4362) TabletStateChangeIteratorIT failure on cloning metadata table
Date Mon, 08 Aug 2016 02:10:20 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411182#comment-15411182 ]

Josh Elser commented on ACCUMULO-4362:
--------------------------------------

I was hoping this was just a dumb bug:

{noformat}
diff --git a/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java b/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
index b38083f..8d9b234 100644
--- a/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
+++ b/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
@@ -744,13 +744,18 @@ public class MetadataTableUtil {
   @VisibleForTesting
   public static int checkClone(String tableName, String srcTableId, String tableId, Connector conn, BatchWriter bw) throws TableNotFoundException,
       MutationsRejectedException {
-    TabletIterator srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new KeyExtent(srcTableId, null, null).toMetadataRange(), true,
-        true);
+    TabletIterator srcIter;
+    if (srcTableId.equals(MetadataTable.ID))
+      srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new Range(), true, true);
+    else
+      srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new KeyExtent(srcTableId, null, null).toMetadataRange(), true, true);
     TabletIterator cloneIter = new TabletIterator(createCloneScanner(tableName, tableId, conn), new KeyExtent(tableId, null, null).toMetadataRange(), true,
         true);

-    if (!cloneIter.hasNext() || !srcIter.hasNext())
-      throw new RuntimeException(" table deleted during clone?  srcTableId = " + srcTableId + " tableId=" + tableId);
+    if (!cloneIter.hasNext())
+      throw new RuntimeException("Destination table deleted during clone?  tableId=" + tableId);
+    if (!srcIter.hasNext())
+      throw new RuntimeException("Source table deleted during clone?  srcTableId = " + srcTableId);

     int rewrites = 0;

@@ -855,7 +860,7 @@ public class MetadataTableUtil {
           // delete what we have cloned and try again
           deleteTable(tableId, false, context, null);

-          log.debug("Tablets merged in table " + srcTableId + " while attempting to clone, trying again");
+          log.debug("Tablets merged in table " + srcTableId + " while attempting to clone, trying again", tde);

           sleepUninterruptibly(100, TimeUnit.MILLISECONDS);
         }
{noformat}

Turns out, this just led me to another issue. I had hoped to get to this yesterday, but I
had other FOSS work to do. With these changes, I'm now seeing the following:

{noformat}
2016-08-07 22:07:52,026 [util.MetadataTableUtil] DEBUG: Tablets merged in table !0 while attempting to clone, trying again
org.apache.accumulo.server.util.TabletIterator$TabletDeletedException: Tablets deleted from src during clone : some split null
	at org.apache.accumulo.server.util.MetadataTableUtil.checkClone(MetadataTableUtil.java:786)
	at org.apache.accumulo.server.util.MetadataTableUtil.cloneTable(MetadataTableUtil.java:847)
	at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:45)
	at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:24)
	at org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
	at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:74)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

I need to step back and figure out how this is actually supposed to work. I don't
understand it at present.

FYI, [~kturner], you may be more familiar with this than I am.

> TabletStateChangeIteratorIT failure on cloning metadata table
> -------------------------------------------------------------
>
>                 Key: ACCUMULO-4362
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4362
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.8.0
>
>
> In the Master log:
> {noformat}
> 2016-07-09 16:22:15,858 [master.MasterClientServiceHandler] ERROR:  table deleted during clone?  srcTableId = !0 tableId=4
> java.lang.RuntimeException:  table deleted during clone?  srcTableId = !0 tableId=4
> 	at org.apache.accumulo.server.util.MetadataTableUtil.checkClone(MetadataTableUtil.java:753)
> 	at org.apache.accumulo.server.util.MetadataTableUtil.cloneTable(MetadataTableUtil.java:842)
> 	at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:45)
> 	at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:24)
> 	at org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
> 	at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:74)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> 	at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:22:15,859 [thrift.ProcessFunction] ERROR: Internal error processing waitForFateOperation
> org.apache.thrift.TException:  table deleted during clone?  srcTableId = !0 tableId=4
> 	at org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:81)
> 	at com.sun.proxy.$Proxy10.waitForFateOperation(Unknown Source)
> 	at org.apache.accumulo.core.master.thrift.FateService$Processor$waitForFateOperation.getResult(FateService.java:481)
> 	at org.apache.accumulo.core.master.thrift.FateService$Processor$waitForFateOperation.getResult(FateService.java:465)
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> 	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> 	at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
> 	at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
> 	at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:106)
> 	at org.apache.thrift.server.Invocation.run(Invocation.java:18)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The test case:
> {noformat}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.544 sec <<< FAILURE! - in org.apache.accumulo.test.functional.TabletStateChangeIteratorIT
> test(org.apache.accumulo.test.functional.TabletStateChangeIteratorIT)  Time elapsed: 7.402 sec  <<< ERROR!
> org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
> 	at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.cloneMetadataTable(TabletStateChangeIteratorIT.java:201)
> 	at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.test(TabletStateChangeIteratorIT.java:103)
> Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForFateOperation
> 	at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.cloneMetadataTable(TabletStateChangeIteratorIT.java:201)
> 	at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.test(TabletStateChangeIteratorIT.java:103)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)