hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16056) Procedure v2 - fix master crash for FileNotFound
Date Fri, 17 Jun 2016 22:30:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15337126#comment-15337126 ]

Hudson commented on HBASE-16056:
--------------------------------

FAILURE: Integrated in HBase-1.3 #744 (See [https://builds.apache.org/job/HBase-1.3/744/])
HBASE-16056 Procedure v2 - fix master crash for FileNotFound (matteo.bertozzi: rev a9fe7dcf2c38e783d9651b2e673c6933b5860ce7)
* hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/store/wal/TestWALProcedureStore.java
* hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/WALProcedureStore.java


> Procedure v2 - fix master crash for FileNotFound
> ------------------------------------------------
>
>                 Key: HBASE-16056
>                 URL: https://issues.apache.org/jira/browse/HBASE-16056
>             Project: HBase
>          Issue Type: Sub-task
>          Components: proc-v2
>    Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.5
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6
>
>         Attachments: HBASE-16056-v0.patch, HBASE-16056-v1.patch, HBASE-16056-v2.patch
>
>
> [~syuanjiang] and [~tedyu] reported a backup master that was unable to start because of a
> FileNotFound error during proc-v2 lease recovery. (Another restart should have solved the problem.)
> {noformat}
> FileNotFoundException: File does not exist: /hbase/MasterProcWALs/state-000001.log
>   at namenode.INodeFile.valueOf(INodeFile.java:61)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2877)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:753)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:671)
> {noformat}
> This may happen when the previously active master is still running (e.g. after a GC pause) and
> tries to remove files while the backup master tries to become active. The operation is retryable,
> so the code should be able to handle that.
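
The description above treats a missing log as a retryable condition. As a rough illustration only (this is not the committed patch), the sketch below re-lists the logs and retries when lease recovery hits a FileNotFoundException. The LogStore interface, the FakeStore class, recoverWithRetry, and the maxAttempts parameter are hypothetical names made up for this example; they are not part of WALProcedureStore or the HDFS API.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LeaseRecoverySketch {

  /** Hypothetical stand-in for the filesystem operations used during lease recovery. */
  interface LogStore {
    List<String> listLogs() throws IOException;
    void recoverLease(String path) throws IOException;
  }

  /**
   * Recovers the lease on every listed log. If a log disappears between the
   * listing and the recovery call (for example, removed by the other master),
   * the listing is refreshed and the whole pass is retried.
   */
  static List<String> recoverWithRetry(LogStore store, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    FileNotFoundException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      List<String> logs = store.listLogs();
      try {
        for (String log : logs) {
          store.recoverLease(log);
        }
        return logs; // every listed log was recovered
      } catch (FileNotFoundException e) {
        last = e; // a log vanished concurrently: re-list and try again
      }
    }
    throw last; // still failing after maxAttempts passes
  }

  /** In-memory fake: the first log is "deleted" the first time it is touched. */
  static final class FakeStore implements LogStore {
    final List<String> files =
        new ArrayList<>(Arrays.asList("state-000001.log", "state-000002.log"));
    int listCalls = 0;

    @Override
    public List<String> listLogs() {
      listCalls++;
      return new ArrayList<>(files);
    }

    @Override
    public void recoverLease(String path) throws IOException {
      if (path.equals("state-000001.log")) {
        files.remove(path); // simulate removal by the other master
        throw new FileNotFoundException("File does not exist: " + path);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FakeStore store = new FakeStore();
    System.out.println(recoverWithRetry(store, 3));
  }
}
```

Re-listing before each retry is the point of the design: retrying the same stale file list would fail on the same missing file every time.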



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
