mesos-dev mailing list archives

From "Tom Arnfeld" <...@duedil.com>
Subject Re: scheduler.killExecutor()
Date Tue, 30 Sep 2014 07:42:42 GMT
Thanks Vinod. I missed that issue when searching!


I did consider sending a shutdown task, though my worry was that there may be cases where
the task might not launch, perhaps due to resource starvation and/or no offers being received.
Presumably it would not be correct to store the original OfferId and launch a new task from
that offer, as it *could* be days old.
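
To make the concern concrete, here is a minimal sketch of the special-task workaround in plain Python. It does not use the Mesos bindings: `ToyExecutor`, `Task`, `request_shutdown` and the `__shutdown__` sentinel name are all hypothetical, and the returned strings only mirror the names of Mesos task states. It assumes the scheduler can launch the sentinel task only when it currently holds an offer.

```python
from dataclasses import dataclass, field

# Hypothetical sentinel name; the executor treats a task with this name
# as a request to shut itself down rather than as real work.
SHUTDOWN_TASK_NAME = "__shutdown__"


@dataclass
class Task:
    task_id: str
    name: str


@dataclass
class ToyExecutor:
    """Stand-in for an executor; not the Mesos Executor interface."""
    running: dict = field(default_factory=dict)
    alive: bool = True

    def launch_task(self, task: Task) -> str:
        if not self.alive:
            raise RuntimeError("executor already shut down")
        if task.name == SHUTDOWN_TASK_NAME:
            # Drop all tracked tasks and stop accepting new ones.
            self.running.clear()
            self.alive = False
            return "TASK_FINISHED"
        self.running[task.task_id] = task
        return "TASK_RUNNING"


def request_shutdown(executor: ToyExecutor, offers: list) -> bool:
    """Send the sentinel task, but only if an offer is available.

    Returns False when there is nothing to launch against, which is the
    case I'm worried about: the executor stays up until a later offer
    arrives.
    """
    if not offers:
        return False
    executor.launch_task(Task("shutdown-1", SHUTDOWN_TASK_NAME))
    return True
```

With no offers the call returns False and the executor keeps running, so the framework would have to retry on every later offer until one arrives.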

On Tue, Sep 30, 2014 at 2:10 AM, Vinod Kone <vinodkone@gmail.com> wrote:

> Adding a shutdownExecutor() driver call has been discussed before.
> https://issues.apache.org/jira/browse/MESOS-330
> As a work around, have you considered sending a special "kill" task as a
> signal to the executor to commit suicide?
> On Mon, Sep 29, 2014 at 5:27 PM, Tom Arnfeld <tom@duedil.com> wrote:
>> Hi,
>>
>> I've been making some modifications to the Hadoop framework recently and
>> have come up against a brick wall. I'm wondering if the concept of killing
>> an executor from a framework has been discussed before?
>>
>> Currently we are launching two tasks for each Hadoop TaskTracker, one that
>> has a bit of CPU and all the memory, and then another with the rest of the
>> CPU. In total this equals the amount of resources we want to give each
>> TaskTracker. This is *kind of* how Spark works, ish.
>>
>> The reason we do this is to be able to free up CPU resources and remove
>> slots from a TaskTracker (killing it half dead) while keeping the executor
>> alive. At some undefined point in the future we then want to kill the
>> executor, which happens by killing the other "control" task.
>>
>> This approach doesn't work very well in practice as a result of
>> https://issues.apache.org/jira/browse/MESOS-1812 which means tasks are not
>> launched in order on the slave, so there is no way to guarantee the control
>> task comes up first, which leads to all sorts of interesting races.
>>
>> Is this a bad road to go down? I can't use framework messages as I don't
>> believe those are a reliable way of sending signals, so not sure where else
>> to turn.
>>
>> Cheers,
>>
>> Tom.
>>