aurora-dev mailing list archives

From Stephan Erb <stephan....@blue-yonder.com>
Subject Re: Mesos and Aurora out of Sync
Date Fri, 19 Sep 2014 10:30:03 GMT
I've filed the bug: https://issues.apache.org/jira/browse/AURORA-728

Regards,
Stephan

On 18.09.2014 17:38, Bill Farner wrote:
> Answering my own question: the GC executor log shows the task ended up in
> LOST, so I'd guess you saw PENDING -> ASSIGNED -> [STARTING ->] LOST, with
> the final transition being the scheduler assuming the task was dead.
> Definitely bug-worthy.
>
> -=Bill
>
> On Thu, Sep 18, 2014 at 8:37 AM, Bill Farner <wfarner@apache.org> wrote:
>
>> For that thermos executor stderr, was its task
>> (1410972813312-www-data-test-ipython-15-1de938e1-5575-4510-985b-bdf7ea8a0f01)
>> transitioned cleanly to FAILED?
>>
>> The error itself indicates that the executor timed out communicating with
>> your ZooKeeper cluster, which is something you should look into.  If the
>> task didn't ~immediately go to FAILED, that's a bug on our side, and I
>> encourage you to file it.
>>
>> -=Bill
>>
>> On Thu, Sep 18, 2014 at 8:33 AM, Bill Farner <wfarner@apache.org> wrote:
>>
>>> Just to rule out the obvious - are GC tasks included in the master's 22
>>> tasks?  Their task IDs would start with 'system-gc-'.
>>>
>>> -=Bill
>>>
>>> On Thu, Sep 18, 2014 at 6:47 AM, Stephan Erb <stephan.erb@blue-yonder.com> wrote:
>>>>   Hi everyone,
>>>>
>>>> On my local test cluster, Mesos and Aurora seem to be out of sync:
>>>>
>>>>     - Mesos status: 22 active tasks by the twitter scheduler
>>>>     - Aurora status: 4 active production tasks, 1 active test task
>>>>     - Slave status: Thermos reports 5 active tasks and 'ps aux' reports
>>>>     5 active processes, i.e., Aurora and Thermos seem to be correct
>>>>
>>>>
>>>> I thought the GC was supposed to reconcile this status? I have attached
>>>> the log file of a recent gc_executor run and the stderr of one of the
>>>> faulty executors. I am omitting the log files of the executors as these
>>>> are large and don't seem to show anything of interest.
>>>>
>>>> Any idea what is wrong here?
>>>>
>>>> Thanks,
>>>> Stephan
>>>>
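
A minimal sketch of the check suggested in the thread above: drop the GC
tasks (task IDs starting with 'system-gc-') from the master's list, then
compare what is left against the tasks Aurora reports. It assumes both lists
of task IDs have already been collected by hand. The function names and the
example IDs are hypothetical and are not part of Mesos or Aurora.

```python
GC_PREFIX = "system-gc-"  # prefix for GC task IDs, as mentioned in the thread


def split_gc_tasks(task_ids):
    """Separate GC executor tasks from regular tasks by task ID prefix."""
    gc = [t for t in task_ids if t.startswith(GC_PREFIX)]
    regular = [t for t in task_ids if not t.startswith(GC_PREFIX)]
    return gc, regular


def diff_task_ids(mesos_ids, aurora_ids):
    """Return (IDs only Mesos knows, IDs only Aurora knows), ignoring GC tasks."""
    _, mesos_regular = split_gc_tasks(mesos_ids)
    mesos_set, aurora_set = set(mesos_regular), set(aurora_ids)
    return sorted(mesos_set - aurora_set), sorted(aurora_set - mesos_set)


if __name__ == "__main__":
    # Hypothetical task IDs, for illustration only.
    mesos = ["system-gc-aaa", "job-1", "job-2", "job-stale"]
    aurora = ["job-1", "job-2"]
    only_mesos, only_aurora = diff_task_ids(mesos, aurora)
    print("only in Mesos:", only_mesos)
    print("only in Aurora:", only_aurora)
```

Any ID that shows up as "only in Mesos" after the GC tasks are filtered out
is a candidate for the kind of stale task described above.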


-- 

Stephan Erb
Software Engineer
*Blue Yonder GmbH*
Ohiostrasse 8
D-76149 Karlsruhe

Tel +49 (0)721 383 117 6243
Fax +49 (0)721 383 117 69

stephan.erb@blue-yonder.com <mailto:stephan.erb@blue-yonder.com>
www.blue-yonder.com <http://www.blue-yonder.com/>
Registergericht Mannheim, HRB 704547
USt-IdNr. DE DE 277 091 535
Geschäftsführer: Jochen Bossert, Uwe Weiss (CEO)

<http://www.datalympics.com>
