reef-dev mailing list archives

From "Mariia Mykhailova (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (REEF-1641) Remove batch id and change the way to request master/slave evaluators
Date Mon, 17 Oct 2016 18:26:58 GMT

     [ https://issues.apache.org/jira/browse/REEF-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mariia Mykhailova updated REEF-1641:
------------------------------------
    Assignee: Julia

> Remove batch id and change the way to request master/slave evaluators
> ----------------------------------------------------------------------
>
>                 Key: REEF-1641
>                 URL: https://issues.apache.org/jira/browse/REEF-1641
>             Project: REEF
>          Issue Type: Improvement
>            Reporter: Julia
>            Assignee: Julia
>              Labels: FT
>             Fix For: 0.16
>
>
> Currently in IMRU, we have different specs for master vs. mapper evaluators. We use EvaluatorBatchId
to distinguish which one we requested when receiving an IAllocatedEvaluator. However,
EvaluatorBatchId may not be supported in some environments, such as HDI. So we need another way to
decide which evaluator is for the master.
> There are multiple options:
> a). Match the specification. The issue is that when we ask for evaluators with specific cores
and memory, we may not get the exact specification but rounded-off resources instead.
> b). Use the same spec for both master and mappers. Choose the first one for master if
no master exists yet. If mappers are more memory-heavy than the update function,
allocating some extra memory for one update evaluator is OK. But in scenarios where
it's the other way around, giving all mappers extra memory can be a huge waste.
> c). Request the first evaluator for master. After receiving it, record that one as master,
then request the mappers. The issue is: what if the master fails during the WaitingForEvaluator phase?
One solution is to assume that if the master fails at any stage, we fail the system.
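
Option (c) can be sketched as a small driver-side state machine. This is only an illustration in Python, not REEF code: `EvaluatorRoles`, `Phase`, the `request` callback, and the evaluator ids are all hypothetical names, and handling of mapper failures is left out on purpose.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Optional


class Phase(Enum):
    WAITING_FOR_MASTER = auto()
    WAITING_FOR_MAPPERS = auto()
    READY = auto()
    FAILED = auto()


@dataclass
class EvaluatorRoles:
    """Hypothetical bookkeeping for option (c); names are not from REEF."""
    num_mappers: int
    # request(role, count) is supplied by the caller and stands in for
    # whatever submits an evaluator request with that role's spec.
    request: Callable[[str, int], None]
    phase: Phase = Phase.WAITING_FOR_MASTER
    master_id: Optional[str] = None
    mapper_ids: List[str] = field(default_factory=list)

    def start(self) -> None:
        # Phase 1: ask for exactly one evaluator, using the master spec.
        self.request("master", 1)

    def on_allocated(self, evaluator_id: str) -> str:
        # The role is decided by the current phase, not by a batch id
        # or by comparing the allocated resources against a spec.
        if self.phase is Phase.WAITING_FOR_MASTER:
            self.master_id = evaluator_id
            self.phase = Phase.WAITING_FOR_MAPPERS
            # Phase 2: only now request the mappers.
            self.request("mapper", self.num_mappers)
            return "master"
        if self.phase is Phase.WAITING_FOR_MAPPERS:
            self.mapper_ids.append(evaluator_id)
            if len(self.mapper_ids) == self.num_mappers:
                self.phase = Phase.READY
            return "mapper"
        raise RuntimeError(f"unexpected allocation in phase {self.phase.name}")

    def on_failed(self, evaluator_id: str) -> None:
        # Rule proposed in the issue: a master failure at any stage
        # fails the whole system. Mapper failures are out of scope here.
        if evaluator_id == self.master_id:
            self.phase = Phase.FAILED
```

Because the master request is the only one outstanding during the first phase, the first allocation is unambiguous. The cost is an extra round trip before the mapper requests go out.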



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
