mesos-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Whitney Sorenson <wsoren...@hubspot.com>
Subject Re: Task Reconciliation [MESOS-1453]
Date Tue, 30 Sep 2014 13:21:49 GMT
That makes sense and I suppose your first sentence was the real
summary/confirmation of the change in 0.20 that I was looking for.

A quick question - can we trust that the taskStatus objects sent as a
result of calling reconciliation themselves are the exact same object as
was (possibly) sent before, or would they have updated timestamps?



On Tue, Sep 30, 2014 at 4:14 AM, Benjamin Mahler <benjamin.mahler@gmail.com>
wrote:

> We want reconciliation to be a process that eventually terminates.
>
> In <= 0.19.0, the following two cases are conflated through no update
> being sent:
>   (1) No state difference.
>   (2) Master temporarily cannot reply / dropped message.
>
> As a result, a scheduler cannot determine when it is finished reconciling
> (is my state correct? or was my message not processed?).
>
>
> The way we want to steer frameworks to use reconciliation is as follows.
>
> (1) You should only need to reconcile with the master after a
> re-registration occurs (either master failed over, or framework failed
> over). Some frameworks may want to be more defensive against both
> themselves and against Mesos, and may reconcile on a periodic basis (e.g.
> hourly or daily).
>
> (2) Reconciliation is a process which terminates when an update has been
> received for each task. Here is some pseudo-code that demonstrates how a
> scheduler would implement reliable reconciliation:
>
>
>     # Reconciles state against the master.
>     # TODO: If you call this twice, it will start two reconciliation
> cycles, instead of starting a new one.
>     def reconcile():
>       start_time = now()
>       remaining_tasks = [all non terminal tasks]
>       driver.reconcileTasks() # Implicit reconciliation, lets you discover
> unknown tasks.
>       delay(Seconds(30), _reconcile, start_time, remaining_tasks)
>
>     def _reconcile(start_time, remaining_tasks):
>       remaining_tasks = [t for t in remaining_tasks if
> t.latest_update_time() < start_time]
>       if not remaining_tasks.empty():
>         driver.reconcileTasks(remaining_tasks)
>         delay(Seconds(30), _reconcile, start_time, remaining_tasks) #
> TODO: Use backoff instead.
>
>
> The idea is that you reconcile a set of tasks until you receive updates
> for each one, this set will converge to become empty. You would call
> reconcile() when a (re-)registration occurs.
>
> This is the model you should use in 0.20.0, there are some edge cases that
> we'll fix for 0.21.0, but you likely will not notice them:
> https://issues.apache.org/jira/browse/MESOS-1407
>
> Appreciate you starting this thread. Let me know if anything is not clear.
>
> Ben
>
> On Sun, Sep 28, 2014 at 11:58 AM, Whitney Sorenson <wsorenson@hubspot.com>
> wrote:
>
>> I'm trying to understand the changes in
>> https://issues.apache.org/jira/browse/MESOS-1453 and the SchedulerDriver
>> JavaDoc.
>>
>> In the 0.19 behavior, it made sense to me that a framework would hold
>> onto a copy of all the latest task statuses it knew about, and could poll
>> reconcileTasks with these statuses in order to request delivery of any lost
>> messages (covering the case of both the framework being absent for a while
>> or just a general loss of messages.)
>>
>> Is the idea behind the changes in 0.20 that a framework now need only
>> call reconcileTasks once after registering with a master? In that case,
>> what is the use case for having the API still take a list of taskStatus
>> objects - so frameworks can decide that they don't want to know about
>> unknown tasks [1379]? If frameworks should still routinely ask for missing
>> messages - then why bother sending all updates and causing the framework to
>> have to handle the work of routinely ignoring duplicate status updates?
>>
>> Thanks,
>>
>> -Whitney
>>
>>
>

Mime
View raw message