reef-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matteo Interlandi <>
Subject Re: REEF scheduler: Tasks groups and fault tolerance
Date Wed, 12 Jul 2017 16:52:03 GMT
Thanks Saikat. I am going to push everything I have in a branch. I have 3-4
example applications, the design should be clearer from them. I will also
try to write a diagram by the next week.

On Wed, Jul 12, 2017 at 9:26 AM, Saikat Kanjilal <> wrote:

> Hi Matteo,
> This is great, I'm trying to wrap my head around this and give more
> concrete feedback but am looking for 2 additional things, is it possible
>  to see like an interaction diagram between all the components you identify
> above and an additional explanation around where it fits within the current
> workflow, being somewhat new to reef this would help me understand the
> overall workflow and help drive this proposal forward.   Also should we
> create a JIRA around this?
> Thanks in advance for doing this.
> On Tue, Jul 11, 2017 at 8:25 PM, Matteo Interlandi <
> >
> wrote:
> > Hi,
> >
> > I am working a scheduling layer on REEF for managing group of tasks
> > (TaskSets) and related fault-tolerance policies and mechanisms. I would
> > like to share the design I have in mind with you and gather some
> feedbacks.
> >
> > The idea is to have the usual two level scheduling (physical and logical
> > task scheduler). At the logical level I have a *service* provider and a
> set
> > of *subscriptions*. Subscriptions contains a pipeline of *operators*. For
> > the moment I am only considering group communication operators plus
> > *iterate*. Each operator (as in group communication) takes as a
> parameter a
> > *topology*, plus a user defined *failure state machine*, and a
> *checkpoint
> > policy*. The failure machine implements different response on the event
> of
> > a failure based on how many data points are lost as a consequence of a
> > failure (as a form or ratio between lost and initial data points). Lost
> > data points are computed by looking at the topology: for instance in case
> > of a broadcast under the tree topology, if a receiver task fails, one
> data
> > point is lost; conversely if a non-leaf node goes down all data points
> > under its branch are lost.
> >
> > Tasks are grouped into *task sets*. Each task set subscribe to one or
> more
> > subscriptions, based on the type of operators it wants to implement. When
> > available, tasks are added to the task sets and try to be added to each
> > subscriptions the task set is subscribed to. A subscription may or may
> not
> > add a task based on its semantics. Each task configuration is the union
> > between the task set conf, each subscription the task was successfully
> > subscribed too (which also contains the conf for each operator in the
> > subscription) plus the configuration of the service.
> >
> > When a failure occurs during the execution of an operator, the task set
> > forward it to subscription containing the failing operation, which update
> > the failure rate on its related failure state machine and eventually
> > generate a failure response. For the moment I am considering as failure
> > responses / failure states: continue, reconfigure (e.g., reconfigure the
> > topology so that computation can be continue), continue and reschedule a
> > task, stop the world and wait for a new task to be active. (New failure
> > machine implementations can be provided by the users by extending the
> > IFailureStateMachine interface). The failure is then propagated to the
> next
> > operator in the pipeline and up to the service (i.e., multiple operators
> > may fails in different subscriptions therefore the final decision have to
> > be taken at the service level). The service decides which action to take
> > based on its own failure state machine. The action is implemented by
> > related mechanism at the task set level.
> >
> > I have two goals in mind when developing this design:
> > - have a design expressible enough to be applicable to different
> scenarios
> > (e.g., IMRU and parameter servers).
> > - have a design still flexible enough so that any failure response
> > mechanism could be implemented (from do nothing, to recompute the state
> > using lineage information from previous operators).
> >
> > Any feedback is welcome!
> >
> > Best,
> > Matteo
> >

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message