spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: submit_spark_job_to_YARN
Date Mon, 31 Aug 2015 01:31:30 GMT
This is related:
SPARK-10288 Add a rest client for Spark on Yarn

FYI

On Sun, Aug 30, 2015 at 12:12 PM, Dawid Wysakowicz <
wysakowicz.dawid@gmail.com> wrote:

> Hi Ajay,
>
> In short: no, there is no easy way to do that. But if you'd like to
> explore this topic, a good starting point would be this blog post from
> SequenceIQ: blog
> <http://blog.sequenceiq.com/blog/2014/08/22/spark-submit-in-java/>.
>
> I have heard rumors that there is some work going on to prepare a submit
> API, but I am not a contributor, so I can't say whether that is true or
> how the work is progressing.
> For now the suggested way is to use the provided script: spark-submit.
>
> Regards
> Dawid
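
[For reference, a minimal spark-submit invocation against YARN might look
like the sketch below. The paths, class name, and jar are hypothetical, and
note that Spark locates the cluster via the HADOOP_CONF_DIR environment
variable pointing at the Hadoop configuration directory.]

```shell
# Hypothetical example: submit a word-count jar to YARN in client mode.
# HADOOP_CONF_DIR must point at the cluster's Hadoop conf directory
# (containing yarn-site.xml, core-site.xml, etc.).
export HADOOP_CONF_DIR=/etc/hadoop/conf

spark-submit \
  --class com.example.WordCount \
  --master yarn-client \
  --num-executors 2 \
  --executor-memory 1g \
  target/wordcount.jar hdfs:///user/hadoop/input.txt
```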
>
> 2015-08-30 20:54 GMT+02:00 Ajay Chander <itschevva@gmail.com>:
>
>> Hi David,
>>
>> Thanks for responding! My main intention is to submit a Spark job/jar to
>> the YARN cluster from within my code in Eclipse. Is there any way I could
>> pass my YARN configuration somewhere in the code to submit the jar to the
>> cluster?
>>
>> Thank you,
>> Ajay
>>
>>
>> On Sunday, August 30, 2015, David Mitchell <jdavidmitchell@gmail.com>
>> wrote:
>>
>>> Hi Ajay,
>>>
>>> Are you trying to save to your local file system or to HDFS?
>>>
>>> // This would save to HDFS under "/user/hadoop/counter"
>>> counter.saveAsTextFile("/user/hadoop/counter");
>>>
>>> David
>>>
>>>
>>> On Sun, Aug 30, 2015 at 11:21 AM, Ajay Chander <itschevva@gmail.com>
>>> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> Recently we installed Spark on YARN in a Hortonworks cluster. I am
>>>> running a word-count program in Eclipse with setMaster("local"), and I
>>>> see the results I expect. Now I want to submit the same job to my YARN
>>>> cluster from Eclipse. In Storm I did this with the StormSubmitter class,
>>>> passing the Nimbus and ZooKeeper hosts to a Config object, and I am
>>>> looking for something similar here.
>>>>
>>>> When I went through the documentation online, it said I am supposed to
>>>> "export HADOOP_HOME_DIR=path to the conf dir". So I copied the conf
>>>> folder from one of the Spark gateway nodes to my local Unix box and
>>>> exported that dir:
>>>>
>>>> export HADOOP_HOME_DIR=/Users/user1/Documents/conf/
>>>>
>>>> I added the same line to .bash_profile too. When I do echo
>>>> $HADOOP_HOME_DIR, the path is printed at the command prompt. My
>>>> assumption is that when I change setMaster("local") to
>>>> setMaster("yarn-client"), my program should pick up the resource
>>>> manager, i.e. the YARN cluster info, from the directory I exported, and
>>>> the job should get submitted to the resource manager from my Eclipse.
>>>> But somehow it's not happening. Please tell me if my assumption is
>>>> wrong or if I am missing anything here.
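
[The programmatic submission Ajay describes is roughly what the
org.apache.spark.launcher.SparkLauncher class (added in Spark 1.4) is for:
it builds and launches a spark-submit process from Java code. The sketch
below is illustrative only; the Spark home, jar path, and main class are
hypothetical, and HADOOP_CONF_DIR must be visible to the launched process
so that "yarn-client" can find the resource manager.]

```java
// Hypothetical sketch: launching a Spark job on YARN from Java code
// via SparkLauncher (requires the spark-launcher module on the classpath).
import org.apache.spark.launcher.SparkLauncher;

public class SubmitToYarn {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
            .setSparkHome("/opt/spark")              // where Spark is installed
            .setAppResource("target/wordcount.jar")  // the job jar to ship
            .setMainClass("com.example.WordCount")
            .setMaster("yarn-client")                // cluster info comes from HADOOP_CONF_DIR
            .launch();                               // spawns a spark-submit process
        spark.waitFor();
    }
}
```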
>>>>
>>>> I have attached the word count program that I was using. Any help is
>>>> highly appreciated.
>>>>
>>>> Thank you,
>>>> Ajay
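
[The attached word-count program does not survive in the archive. Purely
for illustration, the core counting logic of such a program, stripped of
Spark and written in plain Java, typically looks like this (the class and
method names are our own):]

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java word count: split on whitespace, tally occurrences in a map.
// This mirrors what the Spark job computes, without any cluster involved.
public class WordCount {
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum); // increment, starting at 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("to be or not to be"));
    }
}
```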
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>>> For additional commands, e-mail: user-help@spark.apache.org
>>>>
>>>
>>>
>>>
>>> --
>>> ### Confidential e-mail, for recipient's (or recipients') eyes only, not
>>> for distribution. ###
>>>
>>
>
