accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1651) GC removed WAL that master wasn't done with
Date Fri, 09 Aug 2013 18:20:49 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735101#comment-13735101 ]

Eric Newton commented on ACCUMULO-1651:
---------------------------------------

This is probably due to the switch to the root table.  I'll bet that confirmDeletes is
not taking the root table's log reference (in ZooKeeper) into account.
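
To make the suspected failure mode concrete, here is a minimal sketch of the check being described. It is written in Python for brevity and is not Accumulo's code: the function name and all three parameters are hypothetical, and it only assumes what the comment above suggests, namely that a WAL should survive if either the metadata table or the root tablet's entry in ZooKeeper still references it.

```python
# Hypothetical sketch, not Accumulo's actual GC code: every name below is
# invented for illustration.

def wals_safe_to_delete(candidates, metadata_refs, root_tablet_refs):
    """Return the candidate WAL paths that no tablet still references.

    candidates       -- WAL paths the GC is considering for removal
    metadata_refs    -- WAL paths referenced from the metadata table
    root_tablet_refs -- WAL paths referenced by the root tablet's entry
                        kept in ZooKeeper
    """
    still_referenced = set(metadata_refs) | set(root_tablet_refs)
    return set(candidates) - still_referenced


wal = ("hdfs://localhost:54310/otherAccumuloInstance/wal/"
       "localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7")

# If the ZooKeeper-held reference is left out, the WAL looks deletable.
assert wals_safe_to_delete([wal], [], []) == {wal}
# Once the root tablet's reference is included, the WAL is kept.
assert wals_safe_to_delete([wal], [], [wal]) == set()
```

If the guess is right, the GC run in the logs below would correspond to the first case: the root tablet's reference was never consulted, so the file was treated as unreferenced.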

                
> GC removed WAL that master wasn't done with
> -------------------------------------------
>
>                 Key: ACCUMULO-1651
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1651
>             Project: Accumulo
>          Issue Type: Bug
>          Components: gc, master
>    Affects Versions: 1.6.0
>            Reporter: Michael Berman
>            Assignee: Michael Berman
>
> I have a master that's spinning trying to recover a walog that doesn't exist in HDFS.  It looks like the GC cleaned it up.  I was stopping and starting my cluster throughout this period, and there were at least a few minutes in which every service was talking SSL except the GC, so the GC couldn't receive thrift messages from other services, but [~vines] says this shouldn't affect the GC's deletion behavior.
> Here are some relevant logs.  Note that the master thinks its logSet includes that file straight through the time the GC removed it.
> GC:
> {code}
> 2013-08-09 11:58:14,835 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (1)] for extent !!R<<
> 2013-08-09 11:58:14,852 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing WAL for offline server hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
> 2013-08-09 12:03:15,467 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (1)] for extent !!R<<
> {code}
> Master:
> {code}
> 2013-08-09 11:57:45,235 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,238 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,286 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,324 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,939 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,942 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:45,975 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:55,612 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:55,679 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:55,739 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:55,764 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:55,784 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:56,031 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:57:56,046 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:58:56,051 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 11:59:56,057 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:00:56,062 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:01:56,066 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:02:56,071 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:08:56,103 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:09:56,108 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:10:56,113 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:11:56,118 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:13:19,883 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:14:19,887 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> <master was restarted here>
> 2013-08-09 12:15:44,459 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:15:44,467 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
> 2013-08-09 12:15:44,472 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (in : 10s) created for localhost+9997, tablet !!R<< holds a reference
> 2013-08-09 12:15:54,479 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7: java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
> 2013-08-09 12:16:44,487 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
> 2013-08-09 12:16:44,488 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
> 2013-08-09 12:16:44,490 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (in : 20s) created for localhost+9997, tablet !!R<< holds a reference
> 2013-08-09 12:17:04,494 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7: java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
> <repeating ad infinitum>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
