hadoop-mapreduce-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
Date Tue, 19 Apr 2016 20:59:25 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248625#comment-15248625 ]

Haibo Chen commented on MAPREDUCE-6657:
---------------------------------------

Updated the test method according to [~templedf]'s comments and moved it to a new test class,
because it cannot share clusters with the other test methods in TestHistoryFileManager.

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6677.003.patch
>
>
> The job history server tries to create a history directory in HDFS on startup. When the
> NameNode is in safe mode, it keeps retrying for a configurable time period. However, it
> should also keep retrying if the NameNode is in its startup state. Safe mode does not
> happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown when the NN
> is in its internal service startup phase. We should add a check for this specific exception
> in isBecauseSafeMode() to account for that.
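
The check described above could look roughly like the sketch below. It is not the code from the
attached patches: it assumes the retry decision is made by matching text in the exception, and the
class StartupRetrySketch, the Attempt interface, and the methods isRetriable() and
retryUntilReady() are hypothetical names invented for illustration. Only the message text
"NameNode still not started" comes from the issue description. Plain IOException stands in for the
Hadoop exception types so that the example compiles without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class StartupRetrySketch {
    // Assumption: safe mode failures can be recognized by the exception class name.
    static final String SAFE_MODE_MARKER = "SafeModeException";
    // Message text quoted in the issue description.
    static final String NN_NOT_STARTED_MARKER = "NameNode still not started";

    /** One attempt at an operation that may fail while the NameNode is not ready. */
    interface Attempt {
        void run() throws IOException;
    }

    /** Returns true if the failure looks like "NameNode not ready yet". */
    static boolean isRetriable(Throwable ex) {
        String text = String.valueOf(ex);
        return text.contains(SAFE_MODE_MARKER) || text.contains(NN_NOT_STARTED_MARKER);
    }

    /**
     * Runs the attempt until it succeeds or maxWaitMillis has elapsed.
     * Returns true on success and false on timeout or interruption.
     * Failures that are not retriable are rethrown immediately.
     */
    static boolean retryUntilReady(Attempt attempt, long maxWaitMillis, long intervalMillis) {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (true) {
            try {
                attempt.run();
                return true;
            } catch (IOException e) {
                if (!isRetriable(e)) {
                    throw new UncheckedIOException(e);
                }
                if (System.currentTimeMillis() >= deadline) {
                    return false;
                }
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With this shape, a directory creation call that fails twice with "NameNode still not started" and
then succeeds would be attempted three times in total, while an unrelated failure such as a
permission error would surface on the first attempt.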



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
