Date: Wed, 13 May 2015 14:21:09 +0000 (UTC)
From: "Hudson (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-6251) JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541976#comment-14541976 ]

Hudson commented on MAPREDUCE-6251:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2124 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2124/])
MAPREDUCE-6251. Added a new config for JobClient to retry JobStatus calls so that they don't fail on history-server backed by DFSes with not so strong guarantees. Contributed by Craig Welch.
(vinodkv: rev f24452d14e9ba48cdb82e5e6e5c10ce5b1407308)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/JobClientUnitTest.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* hadoop-mapreduce-project/CHANGES.txt


> JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mrv2
>   Affects Versions: 2.6.0
>            Reporter: Craig Welch
>            Assignee: Craig Welch
>              Labels: BB2015-05-TBR
>             Fix For: 2.7.1
>
>         Attachments: MAPREDUCE-6251.0.patch, MAPREDUCE-6251.1.patch, MAPREDUCE-6251.2.patch, MAPREDUCE-6251.3.patch, MAPREDUCE-6251.4.patch, MAPREDUCE-6251.6.patch, MAPREDUCE-6251.7.patch, MAPREDUCE-6251.8.patch, MAPREDUCE-6251.8.patch
>
>
> The JobClient is used to get job status information for running and completed jobs. Final state and history for a job are communicated from the application master to the job history server via a distributed file system: the history is uploaded by the application master to the dfs and then scanned/loaded by the jobhistory server. While HDFS has strong consistency guarantees, not all Hadoop DFSes do.
> When used in conjunction with a distributed file system that does not have this guarantee, there will be cases where the history server may not see an uploaded file, resulting in the dreaded "no such job" and a null value for the RunningJob in the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
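
The failure mode described in the issue (a status lookup that returns null until the history file becomes visible to the history server) is the kind of case a bounded, client-side retry loop can paper over. The following is only a minimal sketch of that idea, not the code from the committed patch: the class name RetryingLookup, the method fetchWithRetries, and the parameters maxRetries and retryIntervalMs are all hypothetical names chosen for illustration, and no Hadoop API is used.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of Hadoop): retries a lookup that may return null. */
final class RetryingLookup {
    private RetryingLookup() {}

    /**
     * Calls lookup once, then up to maxRetries more times while it keeps
     * returning null, sleeping retryIntervalMs between attempts.
     * Returns the first non-null result, or null if every attempt failed.
     */
    static <T> T fetchWithRetries(Supplier<T> lookup, int maxRetries, long retryIntervalMs) {
        T result = lookup.get();
        for (int attempt = 0; result == null && attempt < maxRetries; attempt++) {
            try {
                Thread.sleep(retryIntervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop retrying.
                Thread.currentThread().interrupt();
                return result;
            }
            result = lookup.get();
        }
        return result;
    }
}
```

With maxRetries set to 0 the helper degrades to a single lookup, so a caller that does not opt in to retries sees the same behaviour as before. A caller would pass its own status lookup as the Supplier and treat a null return as "no such job" only after the retries are exhausted.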