Date: Wed, 11 May 2016 13:20:12 +0000 (UTC)
From: "Daniel Templeton (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280098#comment-15280098 ]

Daniel Templeton commented on MAPREDUCE-6657:
---------------------------------------------

OK. Latest patch looks good to me.
[~rkanter]?

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> The job history server tries to create a history directory in HDFS on startup. When the NameNode is in safe mode, it keeps retrying for a configurable time period. However, it should also keep retrying if the NameNode is still in its startup phase. The NN does not enter safe mode until it is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown when the NN is in its internal service startup phase. We should add a check for this specific exception in isBecauseSafeMode() to account for that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org
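
For readers following the issue description, the proposed check might look roughly like the sketch below. It is not the attached patch. The two exception classes are local stand-ins so the example compiles without Hadoop on the classpath, and the class name SafeModeCheckSketch, the constant NN_NOT_STARTED, and the cause-chain walk are all assumptions made for illustration. Only the method name isBecauseSafeMode() and the message text "NameNode still not started" come from the issue.

```java
import java.io.IOException;

public class SafeModeCheckSketch {

    /** Hypothetical stand-in for Hadoop's RetriableException (assumption, not the real class). */
    static class RetriableException extends IOException {
        RetriableException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the exception raised while the NameNode is in safe mode. */
    static class SafeModeException extends IOException {
        SafeModeException(String msg) { super(msg); }
    }

    /** Message text quoted in the issue description. */
    static final String NN_NOT_STARTED = "NameNode still not started";

    /**
     * Returns true when the failure means "NameNode not ready yet", i.e. either
     * safe mode or the internal startup phase, so the caller should keep retrying.
     */
    static boolean isBecauseSafeMode(Throwable ex) {
        // Walk the cause chain, since the exception may arrive wrapped.
        for (Throwable t = ex; t != null; t = t.getCause()) {
            if (t instanceof SafeModeException) {
                return true;
            }
            if (t instanceof RetriableException
                    && t.getMessage() != null
                    && t.getMessage().contains(NN_NOT_STARTED)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Safe mode: retry.
        System.out.println(isBecauseSafeMode(new SafeModeException("in safe mode")));        // true
        // NameNode still starting up: retry as well.
        System.out.println(isBecauseSafeMode(new RetriableException(NN_NOT_STARTED)));       // true
        // Unrelated failure: do not retry.
        System.out.println(isBecauseSafeMode(new IOException("permission denied")));         // false
    }
}
```

Matching on both the exception type and the message text keeps other retriable failures from being mistaken for the startup case, at the cost of depending on the exact wording of the message.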