reef-dev mailing list archives

From "Markus Weimer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1412) Improve efficiency of DFSEvaluatorLogOverwriteWriter
Date Thu, 02 Jun 2016 00:25:59 GMT

    [ https://issues.apache.org/jira/browse/REEF-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311465#comment-15311465
] 

Markus Weimer commented on REEF-1412:
-------------------------------------

Do we need to support {{FileSystem}} implementations that do not support append? If not, this
can all be greatly simplified.
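
For illustration only: if every supported file system could append, the writer might reduce to something like the sketch below. It uses java.nio.file on a local disk as a stand-in, not the Hadoop {{FileSystem}} API, and the names AppendLogSketch and appendEntry are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only log writer; a local-disk stand-in, not REEF code. */
public final class AppendLogSketch {
  private AppendLogSketch() {
  }

  /** Appends one entry as its own line, creating the log file if it is missing. */
  public static void appendEntry(final Path log, final String entry) throws IOException {
    final byte[] line = (entry + "\n").getBytes(StandardCharsets.UTF_8);
    // CREATE makes the file on first use; APPEND leaves existing entries untouched.
    Files.write(log, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
  }
}
```

With append available, nothing is read back, deleted, or renamed on the write path.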

> Improve efficiency of DFSEvaluatorLogOverwriteWriter
> ----------------------------------------------------
>
>                 Key: REEF-1412
>                 URL: https://issues.apache.org/jira/browse/REEF-1412
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF Driver, REEF.NET Driver
>            Reporter: Andrew Chung
>            Assignee: Andrew Chung
>
> {{DFSEvaluatorLogOverwriteWriter}} currently reads from the original EvaluatorChangesLog before overwriting it. This can become a problem when the job has been running for a long time. We can make this more efficient by simply recording the Evaluators it expects in a log.
> Another problem that needs to be addressed is that it currently deletes the old EvaluatorChangesLog prior to writing the new one. Instead, we should use a two-file approach, where a tmp file is created prior to deleting the old one.
> *Scenario Analysis*
> Read Scenario:
> On read, the more recent of EvaluatorChangesLog and EvaluatorChangesLog.tmp should always be read.
> Write Scenario:
> * If both EvaluatorChangesLog and EvaluatorChangesLog.tmp exist, and EvaluatorChangesLog.tmp is older, we write to EvaluatorChangesLog.tmp, delete EvaluatorChangesLog, and rename EvaluatorChangesLog.tmp to EvaluatorChangesLog.
> * If both EvaluatorChangesLog and EvaluatorChangesLog.tmp exist, and EvaluatorChangesLog is older, we write to EvaluatorChangesLog and call it done.
> * If EvaluatorChangesLog exists but not EvaluatorChangesLog.tmp, we write to EvaluatorChangesLog.tmp, delete EvaluatorChangesLog, and rename EvaluatorChangesLog.tmp to EvaluatorChangesLog.
> * If EvaluatorChangesLog.tmp exists but not EvaluatorChangesLog, we write to EvaluatorChangesLog and call it done.
> * If neither exists, we write to EvaluatorChangesLog and call it done.
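
The read and write scenarios above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the REEF implementation: it uses java.nio.file on a local disk instead of the Hadoop {{FileSystem}} API, compares file modification times to decide which file is older, and the class name TwoFileLogSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of the two-file scheme; a local-disk stand-in, not REEF code. */
public final class TwoFileLogSketch {
  private final Path log;
  private final Path tmp;

  public TwoFileLogSketch(final Path dir) {
    this.log = dir.resolve("EvaluatorChangesLog");
    this.tmp = dir.resolve("EvaluatorChangesLog.tmp");
  }

  /** Returns true if a was last modified strictly before b (assumption: mtime decides age). */
  private static boolean isOlder(final Path a, final Path b) throws IOException {
    return Files.getLastModifiedTime(a).compareTo(Files.getLastModifiedTime(b)) < 0;
  }

  /** Write scenario: the old copy is never deleted before the new copy is fully written. */
  public void write(final String contents) throws IOException {
    final byte[] bytes = contents.getBytes(StandardCharsets.UTF_8);
    final boolean logExists = Files.exists(log);
    final boolean tmpExists = Files.exists(tmp);

    // Go through the tmp file when the log exists and tmp is missing or older.
    if (logExists && (!tmpExists || isOlder(tmp, log))) {
      Files.write(tmp, bytes);
      Files.delete(log);
      Files.move(tmp, log);
    } else {
      // Remaining cases: the log is older, only tmp exists, or neither exists.
      Files.write(log, bytes);
    }
  }

  /** Read scenario: return the contents of whichever file is more recent. */
  public String read() throws IOException {
    final boolean logExists = Files.exists(log);
    final boolean tmpExists = Files.exists(tmp);
    final Path newest;
    if (logExists && tmpExists) {
      newest = isOlder(log, tmp) ? tmp : log;
    } else {
      // If neither file exists, reading the log path throws NoSuchFileException.
      newest = tmpExists ? tmp : log;
    }
    return new String(Files.readAllBytes(newest), StandardCharsets.UTF_8);
  }
}
```

In every branch the writer targets the older (or missing) file, so a crash in the middle of a write leaves the most recent complete copy readable.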



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
