Date: Sun, 30 Jul 2017 18:57:00 +0000 (UTC)
From: "Albert Chu (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Comment Edited] (SPARK-21570) File __spark_libs__XXX.zip does not exist on networked file system w/ yarn

    [ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106607#comment-16106607 ]

Albert Chu edited comment on SPARK-21570 at 7/30/17 6:56 PM:
-------------------------------------------------------------

(Oops, apologies, raced on comments and didn't see your prior one).

{noformat}
// Subdirectory where Spark libraries will be placed.
val LOCALIZED_LIB_DIR = "__spark_libs__"
{noformat}

{noformat}
val jarsArchive = File.createTempFile(LOCALIZED_LIB_DIR, ".zip", new File(Utils.getLocalDir(sparkConf)))
{noformat}

So it looks like Spark does create this file.

was (Author: chu11):
FWIW (and my scala and Spark code knowledge is not elite), in resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

{noformat}
// Subdirectory where Spark libraries will be placed.
val LOCALIZED_LIB_DIR = "__spark_libs__"
{noformat}

{noformat}
val jarsArchive = File.createTempFile(LOCALIZED_LIB_DIR, ".zip", new File(Utils.getLocalDir(sparkConf)))
{noformat}

So it looks like Spark does create this file.

> File __spark_libs__XXX.zip does not exist on networked file system w/ yarn
> --------------------------------------------------------------------------
>
>                 Key: SPARK-21570
>                 URL: https://issues.apache.org/jira/browse/SPARK-21570
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 2.2.0
>            Reporter: Albert Chu
>
> I have a set of scripts that run Spark with data in a networked file system. One of my unit tests to make sure things don't break between Spark releases is to simply run a word count (via org.apache.spark.examples.JavaWordCount) on a file in the networked file system. This test broke with Spark 2.2.0 when I use yarn to launch the job (using the spark standalone scheduler things still work). I'm currently using Hadoop 2.7.0. I get the following error:
> {noformat}
> Diagnostics: File file:/p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip does not exist
> java.io.FileNotFoundException: File file:/p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip does not exist
> 	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
> 	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
> 	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
> 	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
> 	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
> 	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
> While debugging, I sat and watched the directory and did see that /p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip does show up at some point.
> Wondering if it's possible something racy was introduced. Nothing in the Spark 2.2.0 release notes suggests any type of configuration change that needs to be done.
> Thanks
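
For anyone following the createTempFile call quoted in the comment, here is a minimal Scala sketch of how that call names the archive. It is not taken from Client.scala: the object name SparkLibsTempFileSketch, the helper createJarsArchive, and the throwaway directory standing in for Utils.getLocalDir(sparkConf) are all hypothetical. It only illustrates the JDK behaviour on a single host and says nothing about why YARN later reports the file as missing.

```scala
import java.io.File
import java.nio.file.Files

object SparkLibsTempFileSketch {
  // Same prefix as the constant quoted from Client.scala.
  val LOCALIZED_LIB_DIR = "__spark_libs__"

  // Hypothetical helper: mirrors the quoted createTempFile call, but takes
  // the directory as a parameter instead of Utils.getLocalDir(sparkConf).
  def createJarsArchive(localDir: File): File =
    File.createTempFile(LOCALIZED_LIB_DIR, ".zip", localDir)

  def main(args: Array[String]): Unit = {
    // Assumption: a scratch directory stands in for Spark's local dir.
    val localDir = Files.createTempDirectory("spark-local-sketch").toFile
    val jarsArchive = createJarsArchive(localDir)

    // The JDK inserts generated characters between prefix and suffix.
    println(s"created: ${jarsArchive.getAbsolutePath}")
    println(s"exists on this host: ${jarsArchive.exists()}")

    // Clean up the scratch file and directory.
    jarsArchive.delete()
    localDir.delete()
  }
}
```

java.io.File.createTempFile creates an empty file before it returns and generates the characters between the prefix and the suffix, which matches the __spark_libs__695301535722158702.zip shape in the error above.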