accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-3774) Deadlock after recovering root tablet
Date Tue, 05 May 2015 20:33:59 GMT
Keith Turner created ACCUMULO-3774:
--------------------------------------

             Summary: Deadlock after recovering root tablet
                 Key: ACCUMULO-3774
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3774
             Project: Accumulo
          Issue Type: Bug
         Environment: Hadoop 2.7.0, ZK 3.4.6, Accumulo 83d1b8388ad807d678c9a3a922e5025faa9a5933,
20 node m3.large EC2 cluster
            Reporter: Keith Turner
            Priority: Blocker
             Fix For: 1.7.0


I started CI running against 1.7.0-SNAP.  After CI had run for a while I started agitation, and
then everything froze up.  The node hosting the root tablet was killed, and the root tablet had a
lot of walogs (I will open a separate issue for this).  The root tablet was reloaded on another
machine, but it hung while loading with the following issue: the minor compaction after recovery
was trying to write to the root tablet before the root tablet location was set.

{noformat}
"Minor compacting +r<<" daemon prio=10 tid=0x00000000046cd800 nid=0x3508 in Object.wait() [0x00007fb0ac3b1000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.waitRTE(TabletServerBatchWriter.java:459)
        at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.close(TabletServerBatchWriter.java:352)
        - locked <0x000000078d154840> (a org.apache.accumulo.core.client.impl.TabletServerBatchWriter)
        at org.apache.accumulo.core.client.impl.BatchWriterImpl.close(BatchWriterImpl.java:54)
        at org.apache.accumulo.server.util.MetadataTableUtil.markLogUnused(MetadataTableUtil.java:1131)
        at org.apache.accumulo.tserver.TabletServer.markUnusedWALs(TabletServer.java:3032)
        at org.apache.accumulo.tserver.TabletServer.minorCompactionFinished(TabletServer.java:2917)
        at org.apache.accumulo.tserver.tablet.DatafileManager.bringMinorCompactionOnline(DatafileManager.java:440)
        at org.apache.accumulo.tserver.tablet.Tablet.minorCompact(Tablet.java:956)
        at org.apache.accumulo.tserver.tablet.MinorCompactionTask.run(MinorCompactionTask.java:84)
        at org.apache.accumulo.tserver.tablet.Tablet.minorCompactNow(Tablet.java:1080)
        at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2124)
        at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$3.run(TabletServer.java:1510)
{noformat}
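
To illustrate the ordering problem described above, here is a minimal sketch of the hang. It is a toy model, not Accumulo code: the class `RootTabletLoadSketch` and its methods are hypothetical, the latch stands in for "the root tablet location is set", and it assumes the location is only set by the loading thread after the minor compaction returns. A timeout is used so the sketch terminates; in the stack trace above the wait has no timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the hang; none of these names are Accumulo classes.
class RootTabletLoadSketch {

  // Counted down once the root tablet location has been published.
  private final CountDownLatch locationSet = new CountDownLatch(1);

  // Stand-in for a metadata write to the root tablet: it cannot complete
  // until a location exists. Returns true if it finished within the timeout.
  boolean writeToRootTablet(long timeoutMillis) throws InterruptedException {
    return locationSet.await(timeoutMillis, TimeUnit.MILLISECONDS);
  }

  // Stand-in for publishing the root tablet location.
  void setLocation() {
    locationSet.countDown();
  }

  // Ordering described in the report: the post-recovery write runs first,
  // and the location would only be set afterwards by the same thread.
  boolean loadWriteBeforeLocation(long timeoutMillis) throws InterruptedException {
    boolean wrote = writeToRootTablet(timeoutMillis); // blocks until the timeout
    setLocation(); // never reached in time to help the write above
    return wrote;
  }

  // Contrasting ordering, shown only for comparison: location first, then write.
  boolean loadLocationBeforeWrite(long timeoutMillis) throws InterruptedException {
    setLocation();
    return writeToRootTablet(timeoutMillis);
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("write before location completed: "
        + new RootTabletLoadSketch().loadWriteBeforeLocation(200));
    System.out.println("location before write completed: "
        + new RootTabletLoadSketch().loadLocationBeforeWrite(200));
  }
}
```

The second ordering is included only to contrast with the first; it is not meant to suggest how this issue should be fixed.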



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
