hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1185) FileSystemRMStateStore can leave partial files that prevent subsequent recovery
Date Thu, 17 Oct 2013 02:54:44 GMT

    [ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797558#comment-13797558 ]

Jian He commented on YARN-1185:
-------------------------------

The test case should also assert at the end that the corrupted application/attempt
is not loaded back into RMState and no longer exists in the FileSystem.

> FileSystemRMStateStore can leave partial files that prevent subsequent recovery
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1185
>                 URL: https://issues.apache.org/jira/browse/YARN-1185
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Omkar Vinit Joshi
>         Attachments: YARN-1185.1.patch
>
>
> FileSystemRMStateStore writes directly to the destination file when storing state. However,
> if the RM were to crash in the middle of the write, the recovery method could encounter a
> partially-written file and either outright crash during recovery or silently load incomplete
> state.
> To avoid this, the data should be written to a temporary file and renamed to the destination
> file afterwards.
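
For illustration, the write-to-temporary-then-rename approach described above, together
with a recovery pass that discards leftover partial files, could look roughly like the
sketch below. This is not the code from YARN-1185.1.patch: it uses java.nio.file on a
local file system rather than Hadoop's FileSystem API, and the class name
AtomicStateStoreSketch, the methods storeState and loadState, and the ".tmp" suffix are
all hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the actual FileSystemRMStateStore implementation.
final class AtomicStateStoreSketch {

    // Assumed suffix that marks a file whose write has not completed.
    static final String TMP_SUFFIX = ".tmp";

    // Write the data to "<name>.tmp" first, then rename it to "<name>".
    // A crash during the write leaves only the temporary file behind,
    // so the destination is either absent or complete.
    static void storeState(Path dir, String name, byte[] data) throws IOException {
        Path tmp = dir.resolve(name + TMP_SUFFIX);
        Path dest = dir.resolve(name);
        Files.write(tmp, data);
        // ATOMIC_MOVE fails with an exception if the file system cannot
        // rename atomically. Whether an existing destination is replaced
        // is implementation specific.
        Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
    }

    // Load every completed file in the directory. Leftover temporary
    // files are treated as partial writes: they are not loaded, and
    // they are deleted once the directory scan has finished.
    static Map<String, byte[]> loadState(Path dir) throws IOException {
        Map<String, byte[]> state = new TreeMap<>();
        List<Path> partial = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String fileName = entry.getFileName().toString();
                if (fileName.endsWith(TMP_SUFFIX)) {
                    partial.add(entry);
                } else {
                    state.put(fileName, Files.readAllBytes(entry));
                }
            }
        }
        for (Path p : partial) {
            Files.delete(p);
        }
        return state;
    }
}
```

A test along the lines suggested in the comment above would store one complete entry,
place a stray ".tmp" file in the directory to simulate a crash mid-write, call
loadState, and then check both that the partial entry is missing from the returned map
and that its file is gone from the directory. The sketch does not force data to disk
before the rename, which a real store would likely also need to consider.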



--
This message was sent by Atlassian JIRA
(v6.1#6144)
