mesos-dev mailing list archives

From: X Brick <ngdoc...@gmail.com>
Subject: Re: Failed to launch spark jobs on mesos due to "hadoop" not found
Date: Tue, 22 Nov 2016 04:09:01 GMT
you can try the `--executor_environment_variables` flag when you start the
mesos agent (mesos-slave), like the following:

```
/usr/local/sbin/mesos-agent --work_dir=/var/lib/mesos/agent  ...
--executor_environment_variables="{\"HADOOP_HOME\":\"/path/to/your/hadoop_home\"}"
```
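If Java is also missing from the executors' environment, the same flag can
carry JAVA_HOME as well. A sketch with placeholder paths you would replace
with your own:

```
/usr/local/sbin/mesos-agent --work_dir=/var/lib/mesos/agent \
  --executor_environment_variables="{\"HADOOP_HOME\":\"/path/to/your/hadoop_home\",\"JAVA_HOME\":\"/path/to/your/jre\"}"
```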

Or you could patch MesosClusterScheduler.scala
<https://github.com/apache/spark/blob/d89bfc92302424406847ac7a9cfca714e6b742fc/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala#L543>
and MesosCoarseGrainedSchedulerBackend.scala
<https://github.com/apache/spark/blob/d89bfc92302424406847ac7a9cfca714e6b742fc/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala>
to set `HADOOP_HOME`, `JAVA_HOME`, or anything else dynamically, but that
requires more hacking.
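If you would rather not patch Spark, passing the variables per job at submit
time may cover the same need. This is only an alternative sketch, not what the
patch above does; it assumes a Spark version whose Mesos backend honors
`spark.executorEnv.*` (and, for cluster mode, `spark.mesos.driverEnv.*`), and
the master URL, paths, and jar name are placeholders:

```
spark-submit \
  --master mesos://<master>:5050 \
  --conf spark.executorEnv.HADOOP_HOME=/path/to/your/hadoop_home \
  --conf spark.mesos.driverEnv.HADOOP_HOME=/path/to/your/hadoop_home \
  --class your.main.Class \
  your-application.jar
```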

BTW, Spark on Mesos always switches the user automatically, and if that user
does not exist on the slave, it may raise errors. Setting "--switch_user=false"
and the env variable "HADOOP_USER_NAME" may be better.
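A minimal sketch of that combination on the agent side, reusing the flag from
above; the user name and paths are placeholders:

```
# keep tasks running as the agent's own user instead of the framework user,
# and tell the Hadoop client which HDFS user to act as (placeholder value)
/usr/local/sbin/mesos-agent --work_dir=/var/lib/mesos/agent \
  --switch_user=false \
  --executor_environment_variables="{\"HADOOP_USER_NAME\":\"your_hdfs_user\"}"
```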

2016-11-13 22:26 GMT+08:00 tommy xiao <xiaods@gmail.com>:

> let me add some hints:
>
> for hadoop, you can read about the proxy user (superuser) concept here:
> https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Superusers.html
>
> for launching spark jobs on a mesos cluster, see this case:
> https://marc.info/?l=mesos-user&m=144228174927503&w=3
>
>
>
>
>
> 2016-11-13 0:17 GMT+08:00 Yu Wei <yu2003w@hotmail.com>:
>
> > A little interesting.
> >
> >
> > ________________________________
> > From: tommy xiao <xiaods@gmail.com>
> > Sent: Friday, November 11, 2016 2:19 PM
> > To: dev
> > Subject: Re: Failed to launch spark jobs on mesos due to "hadoop" not found
> >
> > this is painful. if mesos supported a proxy user feature like hadoop, then
> > everything could be handled easily.
> >
> > 2016-11-11 14:06 GMT+08:00 Yu Wei <yu2003w@hotmail.com>:
> >
> > > I fixed the problem by adding the environment variables HADOOP_HOME and
> > > JAVA_HOME when launching mesos-agent, as below:
> > >
> > > sudo HADOOP_HOME=/home/dcos/repo/bigdata/hadoop-2.7.3/
> > > JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre/
> > > /usr/local/sbin/mesos-agent --work_dir=/var/lib/mesos/agent  ...
> > >
> > > It seems that this is caused by sudo's secure_path.
> > >
> > >
> > > Is there any better solution to fix this problem?
> > >
> > >
> > >
> > > Thanks,
> > >
> > > Jared, (??)
> > > Software developer
> > > Interested in open source software, big data, Linux
> > >
> > > ________________________________
> > > From: tommy xiao <xiaods@gmail.com>
> > > Sent: Friday, November 11, 2016 1:52:54 PM
> > > To: dev
> > > Subject: Re: Failed to launch spark jobs on mesos due to "hadoop" not found
> > >
> > > it seems like it is running in the same user mode.
> > >
> > > 2016-11-10 22:37 GMT+08:00 Joris Van Remoortere <joris@mesosphere.io>:
> > >
> > > > Switch to the user that the agent is running as, and to the directory
> > > > from which it is executing, and ensure that you can find hadoop, for
> > > > example by running `which hadoop`. This may be a PATH or JAVA_HOME issue?
> > > >
> > > > -
> > > > *Joris Van Remoortere*
> > > > Mesosphere
> > > >
> > > > On Thu, Nov 10, 2016 at 6:29 AM, Yu Wei <yu2003w@hotmail.com> wrote:
> > > >
> > > > > Hi Guys,
> > > > >
> > > > > I failed to launch spark jobs on mesos. Actually, I submitted the job
> > > > > to the cluster successfully.
> > > > >
> > > > > But the job failed to run.
> > > > >
> > > > > I1110 18:25:11.095507   301 fetcher.cpp:498] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/1f8e621b-3cbf-4b86-a1c1-9e2cf77265ee-S7\/root","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"hdfs:\/\/192.168.111.74:9090\/bigdata\/package\/spark-examples_2.11-2.0.1.jar"}}],"sandbox_directory":"\/var\/lib\/mesos\/agent\/slaves\/1f8e621b-3cbf-4b86-a1c1-9e2cf77265ee-S7\/frameworks\/1f8e621b-3cbf-4b86-a1c1-9e2cf77265ee-0002\/executors\/driver-20161110182510-0001\/runs\/b561328e-9110-4583-b740-98f9653e7fc2","user":"root"}
> > > > > I1110 18:25:11.099799   301 fetcher.cpp:409] Fetching URI 'hdfs://192.168.111.74:9090/bigdata/package/spark-examples_2.11-2.0.1.jar'
> > > > > I1110 18:25:11.099820   301 fetcher.cpp:250] Fetching directly into the sandbox directory
> > > > > I1110 18:25:11.099862   301 fetcher.cpp:187] Fetching URI 'hdfs://192.168.111.74:9090/bigdata/package/spark-examples_2.11-2.0.1.jar'
> > > > > E1110 18:25:11.101842   301 shell.hpp:106] Command 'hadoop version 2>&1' failed; this is the output:
> > > > > sh: hadoop: command not found
> > > > > Failed to fetch 'hdfs://192.168.111.74:9090/bigdata/package/spark-examples_2.11-2.0.1.jar': Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was either not found or exited with a non-zero exit status: 127
> > > > > Failed to synchronize with agent (it's probably exited)
> > > > >
> > > > >
> > > > > Actually I installed hadoop on each agent node.
> > > > >
> > > > >
> > > > > Any advice?
> > > > >
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jared, (??)
> > > > > Software developer
> > > > > Interested in open source software, big data, Linux
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Deshi Xiao
> > > Twitter: xds2000
> > > E-mail: xiaods(AT)gmail.com
> > >
> >
> >
> >
> > --
> > Deshi Xiao
> > Twitter: xds2000
> > E-mail: xiaods(AT)gmail.com
> >
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>
