accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Berman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-1651) GC removed WAL that master wasn't done with
Date Fri, 09 Aug 2013 17:16:50 GMT
Michael Berman created ACCUMULO-1651:
----------------------------------------

             Summary: GC removed WAL that master wasn't done with
                 Key: ACCUMULO-1651
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1651
             Project: Accumulo
          Issue Type: Bug
          Components: gc, master
    Affects Versions: 1.6.0
            Reporter: Michael Berman
            Assignee: Michael Berman


I have a master that's spinning trying to recover a walog that doesn't exist in hdfs.  It
looks like the GC cleaned it up.  I was stopping and starting my cluster throughout this period,
and there was at least a few minutes in which every service was talking SSL except the GC,
so the GC couldn't receive thrift messages from other services, but [~vines] says this shouldn't
affect the GC's deletion behavior.


Here are some relevant logs.  Note that the master thinks its logSet includes that file straight
through the time the GC removed it.

GC:
{code}
2013-08-09 11:58:14,835 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(1)] for extent !!R<<
2013-08-09 11:58:14,852 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing WAL for offline
server hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:03:15,467 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(1)] for extent !!R<<
{code}

Master:
{code}
2013-08-09 11:57:45,235 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,238 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,286 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,324 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,939 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,942 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,975 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,612 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,679 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,739 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,764 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,784 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,031 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,046 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:58:56,051 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:59:56,057 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:00:56,062 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:01:56,066 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:02:56,071 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:08:56,103 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:09:56,108 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:10:56,113 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:11:56,118 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:13:19,883 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:14:19,887 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
<master was restarted here>
2013-08-09 12:15:44,459 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:15:44,467 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:15:44,472 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(in : 10s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:15:54,479 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7:
java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,487 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:16:44,488 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,490 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(in : 20s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:17:04,494 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7:
java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
<repeating ad infinitum>
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message