hadoop-yarn-issues mailing list archives

From "Tsuyoshi OZAWA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
Date Thu, 26 Jun 2014 20:55:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045164#comment-14045164 ]

Tsuyoshi OZAWA commented on YARN-2052:

Thank you for the review, Jian! I updated the patch with the following changes:

* Updated the structure graph in ZKRMStateStore to reflect the new epoch node.
* Replaced {{new RMEpochPBImpl()}} with {{RMEpoch.newInstance()}}.
* Removed {{this.containerIdCounter.incrementAndGet();}} from {{recoverContainer()}}.
* Updated the {{ContainerId.toString()}} format. The new format is {{container_appId_appAttemptId_timestamp_containerId_epoch_sequenceNumber}}.
I appended {{_epoch_sequenceNumber}} to the end of the current format so that
ConverterUtils#toContainerId stays backward compatible between the versions before and after this JIRA.
* Added a comment to {{getId()}}.
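
As a rough illustration of why the {{_epoch_sequenceNumber}} suffix helps, here is a minimal sketch. It is not the code in the attached patch: the class {{EpochIdSketch}}, its {{next()}} method, and the base string are hypothetical names made up for this example. It assumes only that the epoch is a number that grows on every RM restart and that the sequence counter lives in memory and starts again from zero after a restart.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: an id suffix built from a restart epoch and an in-memory counter. */
final class EpochIdSketch {
    // Assumed to be read from the state store and bumped on each RM restart.
    private final long epoch;
    // In-memory only, so it restarts from zero whenever the RM restarts.
    private final AtomicLong sequence = new AtomicLong(0);

    EpochIdSketch(long epoch) {
        this.epoch = epoch;
    }

    /** Appends "_epoch_sequenceNumber" to the given base id string. */
    String next(String baseId) {
        return baseId + "_" + epoch + "_" + sequence.incrementAndGet();
    }
}
```

With epoch 0, two calls to {{next("container_base")}} give {{container_base_0_1}} and {{container_base_0_2}}. A fresh instance with epoch 1 gives {{container_base_1_1}}, so the id does not collide with the earlier ones even though the counter was reset.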

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2052.1.patch, YARN-2052.2.patch, YARN-2052.3.patch, YARN-2052.4.patch,
YARN-2052.5.patch, YARN-2052.6.patch, YARN-2052.7.patch
> Container ids are made unique by using the app identifier and appending a monotonically
increasing sequence number to it. Since container creation is a high-churn activity, the RM
does not store the sequence number per app. So after a restart it does not know what the new
sequence number should be for new allocations.

