reef-dev mailing list archives

From "Andrew Chung (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (REEF-1412) Improve efficiency of DFSEvaluatorLogOverwriteWriter
Date Thu, 02 Jun 2016 18:47:59 GMT

     [ https://issues.apache.org/jira/browse/REEF-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Chung updated REEF-1412:
-------------------------------
    Description: 
{{DFSEvaluatorLogOverwriteWriter}} currently reads from the original EvaluatorChangesLog before
overwriting it. This can become a problem when the job has been running for a long time. We
can make this more efficient by simply recording the Evaluators it expects in a log.

Another problem that needs to be addressed is that it currently deletes the old EvaluatorChangesLog
prior to writing the new one. Instead, we should use a two-file approach, where an alternative
file, EvaluatorChangesLog.alt, is used in tandem with EvaluatorChangesLog.
On read, the newer of EvaluatorChangesLog and EvaluatorChangesLog.alt should always
be read.
On write, we always overwrite the older of EvaluatorChangesLog and EvaluatorChangesLog.alt.
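
The alternating two-file scheme above can be sketched as follows. This is a minimal illustration on a local filesystem in Python, not the REEF implementation: the function names ({{older_and_newer}}, {{read_log}}, {{write_log}}) are hypothetical, "newest" is assumed to mean the latest modification time, and a real DFS client would use its own file API and timestamps.

```python
import os
from typing import Optional, Tuple

# File names taken from the issue description.
PRIMARY = "EvaluatorChangesLog"
ALT = "EvaluatorChangesLog.alt"


def _mtime_key(path: str) -> int:
    """Modification time in nanoseconds; a missing file counts as oldest."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return -1


def older_and_newer(directory: str) -> Tuple[str, str]:
    """Return (older_path, newer_path) for the two log files.

    Assumption: "newest" means latest modification time. sorted() is
    stable, so on a tie (including "neither file exists") PRIMARY is
    treated as the older file.
    """
    paths = [os.path.join(directory, PRIMARY), os.path.join(directory, ALT)]
    older, newer = sorted(paths, key=_mtime_key)
    return older, newer


def read_log(directory: str) -> Optional[str]:
    """Read the newer of the two files, or None if neither exists."""
    _, newer = older_and_newer(directory)
    if not os.path.exists(newer):
        return None
    with open(newer, "r", encoding="utf-8") as f:
        return f.read()


def write_log(directory: str, contents: str) -> str:
    """Overwrite the older of the two files; return the path written."""
    older, _ = older_and_newer(directory)
    with open(older, "w", encoding="utf-8") as f:
        f.write(contents)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before returning
    return older
```

Because a write only ever touches the older file, the newer file stays intact until the write has completed. The sketch does not detect a partially written file on read; that would need an extra validity check (for example a checksum or trailer), which the issue does not specify.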




  was:
{{DFSEvaluatorLogOverwriteWriter}} currently reads from the original EvaluatorChangesLog before
overwriting it. This can become a problem when the job has been running for a long time. We
can make this more efficient by simply recording the Evaluators it expects in a log.

Another problem that needs to be addressed is that it currently deletes the old EvaluatorChangesLog
prior to writing the new one. Instead, we should use a two-file approach, where a tmp file
is created prior to deleting the old one.

*Scenario Analysis*

Read Scenario:
On read, the more recent of EvaluatorChangesLog and EvaluatorChangesLog.tmp should always be read.

Write Scenario:
* If both EvaluatorChangesLog and EvaluatorChangesLog.tmp exist, and EvaluatorChangesLog.tmp
is older, we write to EvaluatorChangesLog.tmp, delete EvaluatorChangesLog, and rename EvaluatorChangesLog.tmp
to EvaluatorChangesLog.
* If both EvaluatorChangesLog and EvaluatorChangesLog.tmp exist, and EvaluatorChangesLog is
older, we write to EvaluatorChangesLog and call it done.
* If EvaluatorChangesLog exists but not EvaluatorChangesLog.tmp, we write to EvaluatorChangesLog.tmp,
delete EvaluatorChangesLog, and rename EvaluatorChangesLog.tmp to EvaluatorChangesLog.
* If EvaluatorChangesLog.tmp exists but not EvaluatorChangesLog, we write to EvaluatorChangesLog
and call it done.
* If neither exists, we write to EvaluatorChangesLog and call it done.





> Improve efficiency of DFSEvaluatorLogOverwriteWriter
> ----------------------------------------------------
>
>                 Key: REEF-1412
>                 URL: https://issues.apache.org/jira/browse/REEF-1412
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF Driver, REEF.NET Driver
>            Reporter: Andrew Chung
>            Assignee: Andrew Chung
>
> {{DFSEvaluatorLogOverwriteWriter}} currently reads from the original EvaluatorChangesLog before
overwriting it. This can become a problem when the job has been running for a long time. We
can make this more efficient by simply recording the Evaluators it expects in a log.
> Another problem that needs to be addressed is that it currently deletes the old EvaluatorChangesLog
prior to writing the new one. Instead, we should use a two-file approach, where an alternative
file, EvaluatorChangesLog.alt, is used in tandem with EvaluatorChangesLog.
> On read, the newer of EvaluatorChangesLog and EvaluatorChangesLog.alt should always
be read.
> On write, we always overwrite the older of EvaluatorChangesLog and EvaluatorChangesLog.alt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
