hadoop-yarn-issues mailing list archives

From "Tsuyoshi OZAWA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
Date Wed, 18 Jun 2014 02:58:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034746#comment-14034746 ]

Tsuyoshi OZAWA commented on YARN-2052:

We should make it a long in the same release as the epoch number addition so that we don't
have to worry about that.

+1 to doing this in the same release. We plan to make that improvement in a separate JIRA.
That is fine, but I think it is important that we decide how to behave when the overflow
happens. We have two options: abort the RM for now, or start apps from a clean state after
the restart. We plan to make the id a long right after this JIRA, so for simplicity we can
take the abort approach to prevent unexpected behavior. [~bikassaha], [~jianhe], what do you
think about this?
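
To make the two options concrete, here is a minimal Java sketch of a bounded counter with a selectable overflow policy. It is only an illustration: the class name, the enum, and the 10-bit field width are hypothetical and are not taken from any YARN patch, and the comment above does not spell out which counter overflows.

```java
// Hypothetical sketch of the two overflow behaviors discussed above.
// None of these names come from the YARN code base.
final class OverflowPolicySketch {
    enum OnOverflow { ABORT, CLEAN_START }

    // Assumed width of the counter field (10 bits); purely illustrative.
    static final int MAX_VALUE = (1 << 10) - 1;

    /** Returns the next counter value, applying the policy once the field is exhausted. */
    static int next(int current, OnOverflow policy) {
        if (current < MAX_VALUE) {
            return current + 1;
        }
        switch (policy) {
            case ABORT:
                // Option 1: fail fast rather than risk reusing an id.
                throw new IllegalStateException("counter overflow at " + current);
            case CLEAN_START:
                // Option 2: wrap to zero; the caller would also have to discard
                // recovered app state so that old ids cannot collide with new ones.
                return 0;
            default:
                throw new AssertionError(policy);
        }
    }
}
```

The abort branch is the simpler of the two, since it needs no extra cleanup logic; the clean-start branch is only safe if recovered state is dropped at the same time.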

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2052.1.patch, YARN-2052.2.patch, YARN-2052.3.patch
> Container ids are made unique by using the app identifier and appending a monotonically
> increasing sequence number to it. Since container creation is a high-churn activity, the RM
> does not store the sequence number per app. So after restart it does not know what the new
> sequence number should be for new allocations.
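
The problem described above, and the epoch idea mentioned in the comment, can be sketched in Java as follows. This is a hedged illustration, not the actual YARN implementation: the class name and the split of a 32-bit id into 10 epoch bits and 22 sequence bits are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the real YARN ContainerId or RM classes.
final class ContainerIdSketch {
    // Assumed layout: upper 10 bits hold the RM epoch, lower 22 bits the sequence number.
    static final int SEQ_BITS = 22;
    static final int SEQ_MASK = (1 << SEQ_BITS) - 1;

    private final int epoch;
    // Kept in memory only, so it starts from zero again after every restart.
    private final AtomicInteger nextSeq = new AtomicInteger(0);

    ContainerIdSketch(int epoch) {
        this.epoch = epoch;
    }

    /** Builds an id from the current epoch and the next in-memory sequence number. */
    int newContainerId() {
        int seq = nextSeq.incrementAndGet();
        return (epoch << SEQ_BITS) | (seq & SEQ_MASK);
    }

    static int epochOf(int id) {
        return id >>> SEQ_BITS;
    }

    static int sequenceOf(int id) {
        return id & SEQ_MASK;
    }
}
```

Without the epoch, an instance created after a restart would hand out sequence number 1 again and collide with a container from before the restart. With a different epoch per restart, the two ids differ even though the sequence numbers match. The narrow fields in this layout are also why overflow handling and a wider (long) id come up in the discussion.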
