hadoop-yarn-issues mailing list archives

From "Anubhav Dhoot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2229) ContainerId can overflow with RM restart
Date Fri, 25 Jul 2014 17:35:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074635#comment-14074635 ]

Anubhav Dhoot commented on YARN-2229:

We cannot simply add a field that old code does not know about. Old code would ignore the
new field and silently work with a wrong id. And because of the way we construct containerIds,
we do need to add the new field (details in YARN-2052).

The only way I see it working (without a cluster shutdown) is if we support deserializing
both the older format and the newer format. When serializing, we can choose to emit the new
field based on a condition (a flag or the version number of the daemon).
So the first rolling upgrade will not turn on the condition, but will ensure all the code
supports deserializing the newer field if it exists. In the next rolling upgrade we can turn
on the condition to serialize the new field.
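
A minimal Java sketch of the two-phase idea above. It is not the YARN code: WireContainerId,
ContainerIdCodec, the extendedId field and the emitNewField flag are hypothetical names, and
the record stands in for a serialized message with one legacy field and one optional field.
It also assumes the legacy 32-bit value is read as unsigned.

```java
// Hypothetical wire record: a legacy 32-bit field plus an optional new field.
final class WireContainerId {
    final int id;          // legacy field, always present
    final Long extendedId; // new optional field; null when the sender did not emit it

    WireContainerId(int id, Long extendedId) {
        this.id = id;
        this.extendedId = extendedId;
    }
}

// Hypothetical codec. Every upgraded daemon can read both formats; only daemons
// with emitNewField == true (second rolling upgrade) write the new field.
final class ContainerIdCodec {
    private final boolean emitNewField;

    ContainerIdCodec(boolean emitNewField) {
        this.emitNewField = emitNewField;
    }

    WireContainerId serialize(long containerId) {
        if (emitNewField) {
            // New format: low 32 bits stay in the legacy field, full value in the new one.
            return new WireContainerId((int) containerId, containerId);
        }
        if ((containerId >>> 32) != 0) {
            // Refuse to truncate silently while the condition is still off.
            throw new IllegalStateException("id does not fit the legacy format: " + containerId);
        }
        return new WireContainerId((int) containerId, null);
    }

    long deserialize(WireContainerId wire) {
        // Prefer the new field when present, otherwise fall back to the legacy field.
        return wire.extendedId != null ? wire.extendedId : Integer.toUnsignedLong(wire.id);
    }
}
```

In this sketch a reader that looks only at the legacy field of a new-format message gets the
truncated value, which is the silent wrong-id case described at the top of this comment.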

The RM can ensure that NMs are upgraded to a specific version (one that supports deserializing
the new field) before allowing the flag to be turned on. That will take care of the case where
someone does not follow the approach above.
Any problems with this approach?
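
The RM-side check could look roughly like the Java sketch below. UpgradeGate and its methods
are hypothetical names, and the sketch assumes purely numeric dotted version strings such as
"2.6.0" (a suffix like "-SNAPSHOT" is not handled).

```java
// Hypothetical gate: the flag may be turned on only when every registered NM
// reports at least the version that can deserialize the new field.
final class UpgradeGate {

    // Compares dotted numeric versions component by component; missing components count as 0.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns true only if no NM is older than minVersion.
    static boolean canEnableNewField(String[] nmVersions, String minVersion) {
        for (String v : nmVersions) {
            if (compareVersions(v, minVersion) < 0) {
                return false;
            }
        }
        return true;
    }
}
```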

> ContainerId can overflow with RM restart
> ----------------------------------------
>                 Key: YARN-2229
>                 URL: https://issues.apache.org/jira/browse/YARN-2229
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2229.1.patch, YARN-2229.10.patch, YARN-2229.10.patch, YARN-2229.2.patch,
>                      YARN-2229.2.patch, YARN-2229.3.patch, YARN-2229.4.patch, YARN-2229.5.patch, YARN-2229.6.patch,
>                      YARN-2229.7.patch, YARN-2229.8.patch, YARN-2229.9.patch
> On YARN-2052, we changed the containerId format: the upper 10 bits are for the epoch, the lower 22 bits
> are for the sequence number of ids. This is for preserving the semantics of {{ContainerId#getId()}},
> {{ContainerId#toString()}}, {{ContainerId#compareTo()}}, {{ContainerId#equals}}, and {{ConverterUtils#toContainerId}}.
> One concern is that the epoch can overflow after the RM restarts 1024 times.
> To avoid the problem, it's better to make containerId long. We need to define the new
> format of container Id while preserving backward compatibility on this JIRA.
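
The bit layout in the description can be illustrated with a small Java sketch. ContainerIdLayout
is a hypothetical helper, not the YARN ContainerId class, and the 64-bit layout shown (epoch
stored in the bits above the 22-bit sequence number) is an assumption, since the new format is
still to be defined on this JIRA.

```java
// Hypothetical helper illustrating the 10-bit epoch / 22-bit sequence layout.
final class ContainerIdLayout {
    static final int SEQUENCE_BITS = 22;
    static final int EPOCH_BITS = 10;
    static final int SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1; // 0x3FFFFF
    static final int EPOCH_MASK = (1 << EPOCH_BITS) - 1;       // 0x3FF

    // 32-bit layout: upper 10 bits epoch, lower 22 bits sequence number.
    // Epoch values of 1024 and above lose their high bits here.
    static int pack32(int epoch, int sequence) {
        return ((epoch & EPOCH_MASK) << SEQUENCE_BITS) | (sequence & SEQUENCE_MASK);
    }

    static int epochOf32(int id) {
        return id >>> SEQUENCE_BITS; // unsigned shift, so the top bit is not treated as a sign
    }

    static int sequenceOf32(int id) {
        return id & SEQUENCE_MASK;
    }

    // Assumed 64-bit layout: same 22-bit sequence, epoch in the remaining upper bits.
    static long pack64(long epoch, int sequence) {
        return (epoch << SEQUENCE_BITS) | (sequence & SEQUENCE_MASK);
    }

    static long epochOf64(long id) {
        return id >>> SEQUENCE_BITS;
    }
}
```

With the 32-bit layout, pack32(1024, 5) yields the same value as pack32(0, 5), so ids from the
1025th RM incarnation collide with ids from the first. With a long, epoch 1024 is still
recoverable via epochOf64.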

