mesos-dev mailing list archives

From "Bill Farner" <b...@twitter.com>
Subject Re: Review Request 12603: Fixed scheduler driver to minimize sending duplicate status updates to the scheduler.
Date Wed, 17 Jul 2013 03:52:12 GMT


> On July 17, 2013, 12:15 a.m., Bill Farner wrote:
> > src/sched/sched.cpp, line 390
> > <https://reviews.apache.org/r/12603/diff/1/?file=322203#file322203line390>
> >
> >     I'm ignorant of the implications of this, but can you confirm/deny the following behavior?
> >     
> >     - Queue holds [U1, U2, U3, U4] which have yet to be processed.
> >     
> >     - Update U1 arrives, this code processes it.
> >     
> >     - Scheduler aborts.
> >     
> >     - New scheduler receives retried [U1, U2, U3, U4] (in any order)
> 
> Vinod Kone wrote:
>     Not sure which queue you are referring to, but I'm assuming you mean the 'uuids' set?
>     
>     An update goes into 'uuids' only after it is processed (i.e., Scheduler::statusUpdate() returns) by the scheduler.
>     
>     In the above scenario if a duplicate U1 is enqueued in the libprocess queue and the scheduler aborts after handling the original U1, the driver would've aborted and we would have never come here.
>     
>     When a new scheduler (and driver) becomes the leader they get updates fresh from mesos.
>     
>     Does that make sense?

I think you explained the behavior for a slightly different scenario than the one I'm attempting to describe.

- The driver has received [U1, U2, U3, U4], but the scheduler implementation has yet to receive/ACK them.

- A duplicate U1 arrives.

- Scheduler aborts.

What happens in that scenario? Based on the wording in the diff, it sounds as though U1 is ACKed to other parts of the system and will not be retried when the new scheduler takes over.
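
To make the scenario concrete, here is a toy model of how I read the explanation so far. It is not the code in sched.cpp: ToyDriver, handle(), replay() and the rule that an aborted driver sends no ACK are all my own assumptions, and whether the real driver behaves this way is exactly what I'm asking.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the scheduler driver. All names here are hypothetical.
struct ToyDriver {
  bool aborted = false;
  std::set<std::string> uuids;         // updates the scheduler finished handling
  std::vector<std::string> delivered;  // updates passed to the scheduler callback
  std::vector<std::string> acked;      // updates acknowledged upstream

  // Hypothetical scheduler callback; it may call abort() on the driver.
  std::function<void(ToyDriver&, const std::string&)> statusUpdate;

  void abort() { aborted = true; }

  void handle(const std::string& uuid) {
    if (aborted) return;                // assumption: an aborted driver drops updates
    if (uuids.count(uuid) > 0) return;  // assumption: a known duplicate is not redelivered
    delivered.push_back(uuid);
    if (statusUpdate) statusUpdate(*this, uuid);
    if (aborted) return;                // assumption: no ACK once the driver has aborted
    uuids.insert(uuid);                 // recorded only after the callback returns
    acked.push_back(uuid);
  }
};

// Replays the scenario: the queue holds U1..U4 followed by a duplicate U1,
// and the scheduler aborts while handling the update named by abortOn.
inline ToyDriver replay(const std::string& abortOn) {
  ToyDriver driver;
  driver.statusUpdate = [abortOn](ToyDriver& d, const std::string& uuid) {
    if (uuid == abortOn) d.abort();
  };
  const std::vector<std::string> queue = {"U1", "U2", "U3", "U4", "U1"};
  for (const std::string& uuid : queue) {
    driver.handle(uuid);
  }
  // An update can only be ACKed after it was delivered.
  assert(driver.acked.size() <= driver.delivered.size());
  return driver;
}
```

Under these assumptions, replay("U1") ACKs nothing, so a new scheduler would still see all four updates retried. If the real driver instead ACKs U1 before or regardless of the abort, U1 would be lost to the new scheduler, which is my concern.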


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12603/#review23212
-----------------------------------------------------------


On July 17, 2013, 1:34 a.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12603/
> -----------------------------------------------------------
> 
> (Updated July 17, 2013, 1:34 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-551
>     https://issues.apache.org/jira/browse/MESOS-551
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/sched/sched.cpp 7ea82e547c612159c9fa24fb6d62e3d2b5f11982 
>   src/tests/status_update_manager_tests.cpp 42395324dfe49659bee2229c6573ffef0874d923
> 
> Diff: https://reviews.apache.org/r/12603/diff/
> 
> 
> Testing
> -------
> 
> make check (OSX and Linux)
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

