mesos-user mailing list archives

From James DeFelice <james.defel...@gmail.com>
Subject Re: TASK_LOST on storm task
Date Fri, 19 Sep 2014 21:12:30 GMT
https://github.com/mesos/storm/pull/11

Looks like some cleanup has been requested... but should work as-is.

On Fri, Sep 19, 2014 at 1:57 AM, Luyi Wang <wangluyi1982@gmail.com> wrote:

> Thanks, James. I will pull the latest change and see what you committed.
> Thanks.
> On Sep 18, 2014 8:21 PM, "James DeFelice" <james.defelice@gmail.com>
> wrote:
>
>> I submitted a pull request that facilitates what you're asking for. It
>> lets you specify a port number for the built-in file server on nimbus. Once
>> you have a predictable URI for that built-in server, you can rebuild the
>> storm tarball with whatever config you want your executors to have and
>> throw it in conf/ so nimbus can serve it up. I've done exactly this and
>> it's been working great for us.
>>
>> --sent from my phone
>> On Sep 18, 2014 6:06 PM, "Luyi Wang" <wangluyi1982@gmail.com> wrote:
>>
>>> Well, after investigating the problem, it turned out to be a
>>> configuration issue.
>>>
>>> My old storm task never ran correctly because it kept trying to connect
>>> to the wrong zookeeper server.
>>>
>>> Here is how I started mesos. I use zookeeper to store configuration,
>>> but this zookeeper is embedded and running standalone on the master
>>> node (192.168.1.11).
>>>
>>> nohup sudo /home/ubuntu/mesos/build/bin/mesos-master.sh
>>> --work_dir=/var/lib/mesos --zk=zk://0.0.0.0:2181/mesos --quorum=1
>>> --log_dir=/var/log/mesos </dev/null >/dev/null 2>&1 &
>>>
>>>
>>> And I started the slave using the following command.
>>>
>>> nohup sudo /home/ubuntu/mesos/build/bin/mesos-slave.sh --master=zk://
>>> 192.168.123.19:2181/mesos --log_dir=/var/log/mesos </dev/null
>>> >/dev/null 2>&1 &
>>>
>>>
>>> With this setup, everything looked fine.
>>>
>>> To set up the storm framework, I changed storm.yaml in the conf
>>> folder as follows.
>>>
>>> mesos.master.url: "zk://192.168.123.19:2181/mesos"
>>> storm.zookeeper.servers:
>>>     - "localhost"
>>> nimbus.host: "localhost"
>>>
>>> and then ran "storm-mesos nimbus" and "storm ui".
>>>
>>> The problem arises with this configuration. For every task,
>>> storm-mesos creates an executor with a storm-mesos environment by
>>> downloading the full tarball, either over HTTP from Mesosphere or from
>>> HDFS. That tarball includes a configuration file that may or may not be
>>> the same as the one used on the master node. In my case, all executors
>>> used the configuration above. Each newly created executor tried to
>>> reach the zookeeper server, but the address it talked to was still
>>> "localhost". A zookeeper server does exist on the slave, but it was
>>> never used for mesos, so the tasks were marked as LOST. To make it work
>>> in my case, the setup should look like this (I assume I also need to
>>> change the nimbus host to the master node).
>>>
>>> mesos.master.url: "zk://192.168.123.19:2181/mesos"
>>> storm.zookeeper.servers:
>>>     - "192.168.123.19"
>>> nimbus.host: "192.168.123.19"
>>>
>>>
>>> After making this change, everything works fine now.
>>> I hope this would help people having the same issues.
>>>
>>> Meanwhile, in my opinion, the approach of downloading the whole tarball
>>> with a fixed configuration from somewhere should be avoided or improved.
>>>
>>> Probably worth discussing.
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>>
>>>
>>> -Luyi.
>>>
>>>
>>>
>>>
>>> On Thu, Sep 18, 2014 at 11:36 AM, Luyi Wang <wangluyi1982@gmail.com>
>>> wrote:
>>>
>>>> I attached nimbus.log and supervisor.log for your reference
>>>>
>>>>
>>>> On Wed, Sep 17, 2014 at 5:30 PM, Benjamin Mahler <
>>>> benjamin.mahler@gmail.com> wrote:
>>>>
>>>>> logs
>>>>
>>>>
>>>>
>>>>
>>>> -Luyi.
>>>>
>>>>
>>>>
>>>>
>>>
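
The root cause described in the thread (executors inheriting a storm.yaml whose hosts point at "localhost") can be caught before the tarball is built. The following is only a rough sketch and is not part of storm-mesos: the helper names is_loopback and loopback_settings are hypothetical, and the config is passed in as an already-parsed dict rather than read from storm.yaml. Only the two keys that were changed in the thread are checked.

```python
import ipaddress

# Names that resolve to the local machine on each slave (assumed list).
LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def is_loopback(host):
    """Return True if host is a loopback name or a loopback/unspecified IP."""
    name = host.strip().lower()
    if name in LOOPBACK_NAMES:
        return True
    try:
        addr = ipaddress.ip_address(name)
    except ValueError:
        # An ordinary hostname; this sketch does not resolve it.
        return False
    return addr.is_loopback or addr.is_unspecified


def loopback_settings(config):
    """List (key, value) pairs that would point an executor at itself."""
    problems = []
    for host in config.get("storm.zookeeper.servers", []):
        if is_loopback(host):
            problems.append(("storm.zookeeper.servers", host))
    nimbus = config.get("nimbus.host")
    if nimbus is not None and is_loopback(nimbus):
        problems.append(("nimbus.host", nimbus))
    return problems


# The two configurations quoted in the thread, as parsed dicts.
broken = {
    "mesos.master.url": "zk://192.168.123.19:2181/mesos",
    "storm.zookeeper.servers": ["localhost"],
    "nimbus.host": "localhost",
}
fixed = {
    "mesos.master.url": "zk://192.168.123.19:2181/mesos",
    "storm.zookeeper.servers": ["192.168.123.19"],
    "nimbus.host": "192.168.123.19",
}

if __name__ == "__main__":
    for key, value in loopback_settings(broken):
        print(f"{key} is set to {value!r}; executors on slaves cannot reach it")
```

Running a check like this against the config that goes into the tarball flags the first configuration and passes the second.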


-- 
James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)
