reef-dev mailing list archives

From "Julia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1378) Evaluator Manager for IMRU
Date Tue, 03 May 2016 17:50:13 GMT

    [ https://issues.apache.org/jira/browse/REEF-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269184#comment-15269184 ]

Julia commented on REEF-1378:
-----------------------------

If the numbers of failed evaluators and failed contexts were always equal, then yes, we could
keep only the ActiveContext collection. But they are not.
* Theoretically, one evaluator may have multiple active contexts. We do recovery based on failed
evaluators, not failed contexts.
* After all the evaluators have been requested, contexts are still on the way. We cannot rely on
the context count to determine whether evaluators are missing.
* If a context fails during data loading while its evaluator is still healthy, in the next phase
we might only need to resubmit a context instead of an entire evaluator.
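
A minimal sketch of the three points above, written in Python rather than the C# of the
attachment. The class, field, and method names are hypothetical and are not taken from
EvaluatorManager.cs or from any REEF API. It only illustrates why evaluators and contexts are
tracked in separate collections.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluatorTracker:
    """Hypothetical tracker; every name here is illustrative, not a REEF API."""

    requested: int                                  # number of evaluators asked for
    allocated: set = field(default_factory=set)     # ids of allocated evaluators
    failed: set = field(default_factory=set)        # ids of failed evaluators
    contexts: dict = field(default_factory=dict)    # context id -> evaluator id

    def missing_evaluators(self) -> int:
        # Counted from evaluators only: contexts may still be on the way,
        # so the context count says nothing about missing evaluators.
        return self.requested - len(self.allocated)

    def contexts_on(self, evaluator_id: str) -> list:
        # One evaluator may host several active contexts.
        return sorted(c for c, e in self.contexts.items() if e == evaluator_id)

    def evaluators_without_context(self) -> list:
        # Healthy evaluators with no active context: candidates for
        # resubmitting only a context rather than a whole evaluator.
        with_context = set(self.contexts.values())
        return sorted(self.allocated - self.failed - with_context)
```

With three allocated evaluators and two contexts that both live on the same evaluator,
missing_evaluators() is 0 even though the context count is 2, and
evaluators_without_context() lists the two evaluators that still need a context.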

Adding a couple of collections is only for tracking, logging, and ease of management. It won't
add any more system state. When we receive an IAllocatedEvaluator or IActiveContext event, we
need to take some action: do bookkeeping, validate, check how many are missing, etc. Keeping
both the evaluator and context collections would also allow us to do double validation. When an
event happens, we decide whether the state should change based on the data we have collected
and the event type. It does not always need to change.
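
The event handling described above could look roughly like the following Python sketch. It is
an illustration under assumptions, not the attached implementation: the handler names, the two
state names, and the rule for when the state advances are all hypothetical.

```python
class Bookkeeper:
    """Hypothetical event bookkeeping; not the attached EvaluatorManager.cs."""

    def __init__(self, expected: int):
        self.expected = expected          # number of evaluators requested
        self.evaluators = set()           # ids of allocated evaluators
        self.contexts = {}                # context id -> evaluator id
        self.state = "WaitingForEvaluators"   # assumed state name

    def on_allocated_evaluator(self, evaluator_id: str) -> None:
        if evaluator_id in self.evaluators:
            raise ValueError(f"evaluator {evaluator_id} allocated twice")
        self.evaluators.add(evaluator_id)
        self._maybe_advance()

    def on_active_context(self, context_id: str, evaluator_id: str) -> None:
        # Double validation: a context must belong to a known evaluator.
        if evaluator_id not in self.evaluators:
            raise ValueError(f"context {context_id} on unknown evaluator")
        self.contexts[context_id] = evaluator_id
        self._maybe_advance()

    def missing(self) -> int:
        return self.expected - len(self.evaluators)

    def _maybe_advance(self) -> None:
        # Most events only update the collections. The state changes only
        # once every expected evaluator has at least one active context.
        ready = set(self.contexts.values())
        if len(self.evaluators) == self.expected and ready == self.evaluators:
            self.state = "Ready"          # assumed state name
```

In this sketch the collections are pure bookkeeping: the only state is the single state field,
and most events leave it unchanged.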

Look at today's ServiceAndContextConfigurationProvider: even with very limited fault tolerance,
we already have collections for _submittedEvaluators and _contextLoadedEvaluators. I am
consolidating all of those into one class.

> Evaluator Manager for IMRU
> --------------------------
>
>                 Key: REEF-1378
>                 URL: https://issues.apache.org/jira/browse/REEF-1378
>             Project: REEF
>          Issue Type: Task
>          Components: IMRU, REEF.NET
>            Reporter: Julia
>            Assignee: Julia
>         Attachments: EvaluatorManager.cs
>
>
> Bookkeeping for tracking allocated Evaluators and failed Evaluators. Provides methods to add/remove
evaluators and request evaluators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
