Date: Tue, 21 Apr 2015 10:42:04 +0000 (UTC)
From: "Hudson (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-6238) MR2 can't run local jobs with -libjars command options which is a regression from MR1

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504747#comment-14504747 ]

Hudson commented on MAPREDUCE-6238:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/170/])
MAPREDUCE-6238. MR2 can't run local jobs with -libjars command options which is a regression from MR1 (zxu via rkanter) (rkanter: rev d50e8f09287deeb51012d08e326a2ed71a6da869)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalDistributedCacheManager.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestLocalJobSubmission.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobResourceUploader.java
* hadoop-mapreduce-project/CHANGES.txt


> MR2 can't run local jobs with -libjars command options which is a regression from MR1
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6238
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6238
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>             Fix For: 2.8.0
>
>         Attachments: MAPREDUCE-6238.000.patch
>
>
> MR2 can't run local jobs with the -libjars command option, which is a regression from MR1.
> When running an MR2 job with -jt local and -libjars, the job fails with java.io.FileNotFoundException: File does not exist: hdfs://XXXXXXXXXXXXXXX.jar.
> The same command works in MR1.
> I found two problems:
>
> 1.
> When MR2 runs a local job using LocalJobRunner, JobSubmitter#jtFs is the local filesystem, so copyRemoteFiles returns early from [the middle of the function|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java#L138] because the source and destination filesystems are the same:
> {code}
> if (compareFs(remoteFs, jtFs)) {
>   return originalPath;
> }
> {code}
> The following code at [JobSubmitter.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java#L219] then tries to add the destination file to the DistributedCache, which introduces a bug for local jobs:
> {code}
> Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, replication);
> DistributedCache.addFileToClassPath(
>     new Path(newPath.toUri().getPath()), conf);
> {code}
> Because new Path(newPath.toUri().getPath()) loses the filesystem information (scheme) from newPath, the file added to the DistributedCache is qualified against the default URI filesystem (hdfs), per the code below. This causes the FileNotFoundException when the file is accessed later at [determineTimestampsAndCacheVisibilities|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java#L270]:
> {code}
> public static void addFileToClassPath(Path file, Configuration conf)
>     throws IOException {
>   addFileToClassPath(file, conf, file.getFileSystem(conf));
> }
>
> public static void addFileToClassPath
>     (Path file, Configuration conf, FileSystem fs)
>     throws IOException {
>   String classpath = conf.get(MRJobConfig.CLASSPATH_FILES);
>   conf.set(MRJobConfig.CLASSPATH_FILES, classpath == null ? file.toString()
>       : classpath + "," + file.toString());
>   URI uri = fs.makeQualified(file).toUri();
>   addCacheFile(uri, conf);
> }
> {code}
> Compare this to the following [MR1 code|https://github.com/apache/hadoop/blob/branch-1/src/mapred/org/apache/hadoop/mapred/JobClient.java#L811]:
> {code}
> Path newPath = copyRemoteFiles(fs, libjarsDir, tmp, job, replication);
> DistributedCache.addFileToClassPath(
>     new Path(newPath.toUri().getPath()), job, fs);
> {code}
> MR1 doesn't have this issue because it passes the local filesystem into DistributedCache#addFileToClassPath instead of falling back to the default URI filesystem (hdfs).
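> To make the scheme loss concrete, here is a minimal standalone sketch (not part of the patch; the class name, fs.defaultFS value, and jar path are made up for illustration) showing how new Path(path.toUri().getPath()) changes which filesystem a path resolves against:
> {code}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
>
> public class SchemeLossDemo {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     // Hypothetical default filesystem, as in a typical cluster config.
>     conf.set("fs.defaultFS", "hdfs://namenode:8020");
>
>     // A jar that actually lives on the local filesystem.
>     Path qualified = new Path("file:///tmp/libs/example.jar");
>
>     // The JobSubmitter pattern under discussion: toUri().getPath()
>     // keeps only "/tmp/libs/example.jar" and drops the "file" scheme.
>     Path stripped = new Path(qualified.toUri().getPath());
>
>     // Prints file:/// -- resolved against the local filesystem.
>     System.out.println(qualified.getFileSystem(conf).getUri());
>     // Prints hdfs://namenode:8020 -- re-qualified against the default FS.
>     System.out.println(stripped.getFileSystem(conf).getUri());
>   }
> }
> {code}
> The stripped path is the one handed to the DistributedCache, which is why the job later looks for the jar on hdfs and fails.
>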
> 2.
> Another incompatible change in MR2 is in [LocalDistributedCacheManager#setup|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalDistributedCacheManager.java#L113]:
> {code}
> // Find which resources are to be put on the local classpath
> Map<String, Path> classpaths = new HashMap<String, Path>();
> Path[] archiveClassPaths = DistributedCache.getArchiveClassPaths(conf);
> if (archiveClassPaths != null) {
>   for (Path p : archiveClassPaths) {
>     FileSystem remoteFS = p.getFileSystem(conf);
>     p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
>         remoteFS.getWorkingDirectory()));
>     classpaths.put(p.toUri().getPath().toString(), p);
>   }
> }
> Path[] fileClassPaths = DistributedCache.getFileClassPaths(conf);
> if (fileClassPaths != null) {
>   for (Path p : fileClassPaths) {
>     FileSystem remoteFS = p.getFileSystem(conf);
>     p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
>         remoteFS.getWorkingDirectory()));
>     classpaths.put(p.toUri().getPath().toString(), p);
>   }
> }
> {code}
> The corresponding MR1 code is at [TaskDistributedCacheManager#makeCacheFiles|https://github.com/apache/hadoop/blob/branch-1/src/mapred/org/apache/hadoop/filecache/TaskDistributedCacheManager.java#L119]:
> {code}
> Map<String, Path> classPaths = new HashMap<String, Path>();
> if (paths != null) {
>   for (Path p : paths) {
>     classPaths.put(p.toUri().getPath().toString(), p);
>   }
> }
> {code}
> I think we don't need to call remoteFS.resolvePath to get the class path; we can use the class paths from DistributedCache.getFileClassPaths directly. Also, p.toUri().getPath().toString() removes the filesystem information (scheme), and only the keySet of classpaths is used (the valueSet is not used).
> It is better to do the same in MR2 to stay backward compatible with MR1; see the sketch below.
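> As a minimal sketch of that MR1-style approach (my assumption about the shape of the fix, not the committed patch; the class and method names are hypothetical), setup could key the map on the raw configured paths:
> {code}
> import java.util.HashMap;
> import java.util.Map;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.filecache.DistributedCache;
> import org.apache.hadoop.fs.Path;
>
> // Hypothetical helper mirroring MR1's TaskDistributedCacheManager behavior.
> class LocalClasspathSketch {
>   static Map<String, Path> buildClasspaths(Configuration conf) {
>     Map<String, Path> classpaths = new HashMap<String, Path>();
>     Path[] fileClassPaths = DistributedCache.getFileClassPaths(conf);
>     if (fileClassPaths != null) {
>       for (Path p : fileClassPaths) {
>         // p.toUri().getPath() strips any scheme; only the key set is
>         // consulted later, so the raw configured path is the right key.
>         classpaths.put(p.toUri().getPath().toString(), p);
>       }
>     }
>     // getArchiveClassPaths(conf) would be handled the same way.
>     return classpaths;
>   }
> }
> {code}
> Because no resolvePath call is involved, a scheme-less local path stays local instead of being re-qualified against hdfs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)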