hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16056) Procedure v2 - fix master crash for FileNotFound
Date Sat, 18 Jun 2016 00:28:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15337310#comment-15337310 ]

Hudson commented on HBASE-16056:
--------------------------------

SUCCESS: Integrated in HBase-1.1-JDK7 #1733 (See [https://builds.apache.org/job/HBase-1.1-JDK7/1733/])
HBASE-16056 Procedure v2 - fix master crash for FileNotFound (matteo.bertozzi: rev ececf19dbaae38773f4b58454439a0914c4f8375)
* hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/WALProcedureStore.java
* hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/store/wal/TestWALProcedureStore.java


> Procedure v2 - fix master crash for FileNotFound
> ------------------------------------------------
>
>                 Key: HBASE-16056
>                 URL: https://issues.apache.org/jira/browse/HBASE-16056
>             Project: HBase
>          Issue Type: Sub-task
>          Components: proc-v2
>    Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.5
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6
>
>         Attachments: HBASE-16056-v0.patch, HBASE-16056-v1.patch, HBASE-16056-v2.patch
>
>
> [~syuanjiang] and [~tedyu] reported a backup master that was not able to start because of a FileNotFound during proc-v2 lease recovery. (Another restart should have solved the problem.)
> {noformat}
> FileNotFoundException: File does not exist: /hbase/MasterProcWALs/state-000001.log
> namenode.INodeFile.valueOf(INodeFile.java:61)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2877)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:753)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:671)
> {noformat}
> This may happen when the old master is still active (e.g. coming back from a GC pause) and tries to remove files while the backup master tries to become active. The operation is retryable, so the code should be able to handle that.
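
The retry idea described above can be sketched roughly as follows. This is not the committed patch to WALProcedureStore; it is a minimal illustration, assuming that a log file can vanish between listing the directory and recovering its lease. The names LeaseRecoveryRetrySketch, LogLister, LeaseRecoverer and recoverAll are hypothetical stand-ins and not HBase or HDFS APIs.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeaseRecoveryRetrySketch {

  /** Hypothetical stand-in for listing the procedure WAL directory. */
  public interface LogLister {
    List<String> listLogs() throws IOException;
  }

  /** Hypothetical stand-in for recovering the lease of a single log file. */
  public interface LeaseRecoverer {
    void recoverLease(String path) throws IOException;
  }

  /**
   * Lists the logs and recovers the lease of each one. If a file disappears
   * between the listing and the recovery (FileNotFoundException), the
   * directory is listed again and the whole pass is retried, up to
   * maxAttempts times. Returns the logs recovered in the successful pass.
   */
  public static List<String> recoverAll(LogLister lister, LeaseRecoverer recoverer,
      int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    FileNotFoundException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      List<String> recovered = new ArrayList<>();
      try {
        for (String log : lister.listLogs()) {
          recoverer.recoverLease(log);
          recovered.add(log);
        }
        return recovered;
      } catch (FileNotFoundException e) {
        // A file was removed after it was listed, e.g. by the other master.
        // Remember the failure and start over with a fresh listing.
        last = e;
      }
    }
    throw new IOException("lease recovery failed after " + maxAttempts + " attempts", last);
  }

  public static void main(String[] args) throws IOException {
    // Simulated directory: the first log is deleted right after the first listing.
    List<String> dir = new ArrayList<>(Arrays.asList("state-000001.log", "state-000002.log"));
    int[] listCalls = {0};
    LogLister lister = () -> {
      List<String> snapshot = new ArrayList<>(dir);
      if (listCalls[0]++ == 0) {
        dir.remove("state-000001.log");
      }
      return snapshot;
    };
    LeaseRecoverer recoverer = path -> {
      if (!dir.contains(path)) {
        throw new FileNotFoundException("File does not exist: " + path);
      }
    };
    System.out.println(recoverAll(lister, recoverer, 3));
  }
}
```

In this sketch the first pass fails on the deleted file, and the second pass sees only the surviving log and completes. Bounding the number of attempts is a design choice of the sketch, so that a persistent failure still surfaces as an error instead of looping forever.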



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
