From: Christophe Préaud <christophe.preaud@kelkoo.com>
Date: Wed, 15 Oct 2014 09:49:00 +0200
To: user@spark.apache.org
Subject: Re: Spark can't find jars
Hi Jimmy,
Did you try my patch?
The problem on my side was that the hadoop.tmp.dir (in hadoop core-site.xml) was not handled properly by Spark when it is set on multiple partitions/disks, i.e.:

<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/d1/yarn/local,file:/d2/yarn/local,file:/d3/yarn/local,file:/d4/yarn/local,file:/d5/yarn/local,file:/d6/yarn/local,file:/d7/yarn/local</value>
</property>

Hence, you won't be hit by this bug if your hadoop.tmp.dir is set on one partition only.
If your hadoop.tmp.dir is also set on several partitions, I agree that it looks like a bug in Spark.

Christophe.

On 14/10/2014 18:50, Jimmy McErlain wrote:
So the only way that I could make this work was to build a fat jar file as suggested earlier. To me (and I am no expert) it seems like this is a bug. Everything was working for me prior to our upgrade to Spark 1.1 on Hadoop 2.2 but now it seems to not, i.e. packaging my jars locally, then pushing them out to the cluster and pointing them to the corresponding dependent jars.

Sorry I cannot be more help!
J




JIMMY MCERLAIN
DATA SCIENTIST (NERD)
. . . . . . . . . . . . . . . . . .
IF WE CAN'T DOUBLE YOUR SALES, ONE OF US IS IN THE WRONG BUSINESS.
E: jimmy@sellpoints.com
M: 510.303.7751


On Tue, Oct 14, 2014 at 4:59 AM, Christophe Préaud <christophe.preaud@kelkoo.com> wrote:
Hello,

I have already posted a message with the exact same problem, and proposed a patch (the subject is "Application failure in yarn-cluster mode").
Can you test it, and see if it works for you?
I would be glad too if someone can confirm that it is a bug in Spark 1.1.0.

Regards,
Christophe.


On 14/10/2014 03:15, Jimmy McErlain wrote:
BTW this has always worked for me before until we upgraded the cluster to Spark 1.1.1...
J






On Mon, Oct 13, 2014 at 5:39 PM, HARIPRIYA AYYALASOMAYAJULA <aharipriya92@gmail.com> wrote:
Hello,

Can you check if the jar file is available in the target->scala-2.10 folder?

When you use sbt package to make the jar file, that is where the jar file would be located.

The following command works well for me:

spark-submit --class "Classname" --master yarn-cluster jarfile (with complete path)

Can you try checking with this initially and later add other options?
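A sketch of the workflow described above (hypothetical project and jar names; sbt and spark-submit are not actually invoked here, the packaged jar is faked): locate the jar that `sbt package` leaves under target/scala-2.10/ and build the submit command from its full path.

```python
# Illustrative sketch: find the sbt-packaged jar and compose the
# spark-submit command line Haripriya suggests.
import glob
import os
import tempfile

project = tempfile.mkdtemp()
jar_dir = os.path.join(project, "target", "scala-2.10")
os.makedirs(jar_dir)

# Simulate the artifact that `sbt package` would produce.
open(os.path.join(jar_dir, "my-app_2.10-1.0.jar"), "w").close()

jars = glob.glob(os.path.join(jar_dir, "*.jar"))
assert len(jars) == 1, "expected exactly one packaged jar"

# Class name, master, then the complete jar path, as in the command above.
cmd = ["spark-submit", "--class", "Classname",
       "--master", "yarn-cluster", jars[0]]
print(" ".join(cmd))
```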


On Mon, Oct 13, 2014 at 7:36 PM, Jimmy <jimmy@sellpoints.com> wrote:
Having the exact same error with the exact same jar.... Do you work for Altiscale? :)
J

Sent from my iPhone

On Oct 13, 2014, at 5:33 PM, Andy Srine <andy.srine@gmail.com> wrote:

Hi Guys,


Spark rookie here. I am getting a file not found exception on the --jars. This is in yarn-cluster mode and I am running the following command on our recently upgraded Spark 1.1.1 environment.


./bin/spark-submit --verbose --master yarn --deploy-mode cluster --class myEngine --driver-memory 1g --driver-library-path /hadoop/share/hadoop/mapreduce/lib/hadoop-lzo-0.4.18-201406111750.jar --executor-memory 5g --executor-cores 5 --jars /home/andy/spark/lib/joda-convert-1.2.jar --queue default --num-executors 4 /home/andy/spark/lib/my-spark-lib_1.0.jar

This is the error I am hitting; any tips would be much appreciated. The file permissions look fine on my local disk.

14/10/13 22:49:39 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED
14/10/13 22:49:39 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
Exception in thread "Driver" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:162)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 1.0 failed 4 times, most recent failure: Lost task 3.3 in stage 1.0 (TID 12, 122-67.vb2.company.com): java.io.FileNotFoundException: ./joda-convert-1.2.jar (Permission denied)
        java.io.FileOutputStream.open(Native Method)
        java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        com.google.common.io.Files$FileByteSink.openStream(Files.java:223)
        com.google.common.io.Files$FileByteSink.openStream(Files.java:211)

Thanks,
Andy




--
Regards,
Haripriya Ayyalasomayajula





Kelkoo SAS
Société par Actions Simplifiée
Au capital de € 4.168.964,30
Siège social : 8, rue du Sentier 75002 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended solely for their addressees. If you are not the intended recipient of this message, please delete it and notify the sender.



