Date: Tue, 17 Nov 2015 22:39:11 +0000 (UTC)
From: "Daniel Templeton (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-6344) Inconsistent classpath/classloading from DistributedCache archives

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009742#comment-15009742 ]

Daniel Templeton commented on MAPREDUCE-6344:
---------------------------------------------

[~pk020157], in the comment explaining what you're doing, let's leave the JIRA line out of it. Folks can get that from git if they want to know. Other than that, it looks good to me.
I haven't done any testing, however.

> Inconsistent classpath/classloading from DistributedCache archives
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6344
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6344
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.5.0, 2.6.0, 2.7.0, 2.5.1, 2.5.2
>            Reporter: Preston Koprivica
>         Attachments: MAPREDUCE-6344.patch
>
>
> We recently upgraded to MRv2 on YARN and have been noticing very inconsistent classloading between the job submission client and the tasks as they start up.
> I've tracked the issue to this method:
> https://github.com/apache/hadoop/blob/release-2.5.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java#L264
> It appears that the classpath is simply wildcarded. According to the Java SE 7 and 8 docs, the order of enumeration is not specified and may differ from moment to moment [1][2]. This is a problem for applications that rely on strict ordering, which the MRv1 DistributedCache used to provide.
> I'm unable to track down everything that is linked or placed into the $PWD of the container, but assuming we can't account for all of it, a simple solution could be to explicitly enumerate the files in DistributedCache, similar to the "non jar" case [3], and then add the "*" for passivity.
> [1] http://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.htm
> [2] http://docs.oracle.com/javase/8/docs/technotes/tools/windows/classpath.html#A1100762
> [3] https://github.com/apache/hadoop/blob/release-2.5.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java#L270

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
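
For readers following the thread, the approach proposed in the issue description (enumerate the DistributedCache files explicitly, then append the wildcard) could look roughly like the sketch below. It is not the attached patch and does not use any Hadoop API. The class name ExplicitClasspathSketch, the method buildClasspath, the ":" separator, and the sample jar names are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not code from MRApps or the attached patch.
final class ExplicitClasspathSketch {

    // Builds a classpath string that lists each cached file explicitly, in the
    // order given, and then appends "<workDir>/*" as a catch-all.
    // workDir and separator are passed in because their real values
    // (for example "$PWD" and ":") are assumptions here.
    static String buildClasspath(List<String> cacheFileNames, String workDir, String separator) {
        // LinkedHashSet keeps first-seen order and drops duplicate entries.
        Set<String> entries = new LinkedHashSet<>();
        for (String name : cacheFileNames) {
            entries.add(workDir + "/" + name);
        }
        // The wildcard goes last, so anything else in the working directory is
        // still on the classpath but is searched after the explicit entries.
        entries.add(workDir + "/*");
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        List<String> cached = Arrays.asList("override.jar", "base.jar", "override.jar");
        System.out.println(buildClasspath(cached, "$PWD", ":"));
    }
}
```

Because the explicit entries come first and keep the order in which they were supplied, a class present in both override.jar and base.jar would be looked up in override.jar first under the usual first-match classpath search, regardless of how the trailing wildcard happens to be expanded.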