hadoop-yarn-issues mailing list archives

From "Tsuyoshi OZAWA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
Date Fri, 13 Jun 2014 21:49:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031223#comment-14031223 ]

Tsuyoshi OZAWA commented on YARN-2052:
--------------------------------------

[~jianhe] and [~vinodkv], thank you for the comments and suggestions!

{quote}
This scheme won't work with a single reserved digit for epochs and a large number of restarts
over time.
{quote}

Yes, this is a case where integer overflow can happen. We need to take that case into account.

{quote}
Old code (state-store, history-server etc) will not read it and that's fine. The only problem
is users who are interpreting container_ID strings themselves. That is NOT supported. We should
modify ConverterUtils to support the new-field, and that should do.
{quote}

Using RM Id + hostname as the epoch sounds like a reasonable approach to me. If we append the epoch
to the container id as a suffix, the following code is still valid with the old {{ConverterUtils.toContainerId}}:

{code}
    ContainerId id = TestContainerId.newContainerId(0, 0, 0, 0);
    String cid = ConverterUtils.toString(id);
    ContainerId gen = ConverterUtils.toContainerId(cid + "_uuid_rm1");
    assertEquals(gen, id); // valid to parse even with old code
{code}

Therefore, I think {{container_XXX_000_uuid_rm1}} is the better format. I'll create a patch based
on this idea.
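
To illustrate why a suffixed epoch can stay readable by a parser that only consumes the leading fields, here is a minimal sketch. It is not the actual {{ConverterUtils}} code: the class {{SketchContainerId}}, its field names, and the assumed layout {{container_<clusterTimestamp>_<appId>_<attemptId>_<sequence>}} are stand-ins for illustration only.

```java
import java.util.Objects;

// Simplified stand-in for a container id. NOT the real YARN ContainerId class.
final class SketchContainerId {
    final long clusterTimestamp;
    final int appId;
    final int attemptId;
    final int containerSeq;

    SketchContainerId(long clusterTimestamp, int appId, int attemptId, int containerSeq) {
        this.clusterTimestamp = clusterTimestamp;
        this.appId = appId;
        this.attemptId = attemptId;
        this.containerSeq = containerSeq;
    }

    // Reads the prefix and the four numeric fields in order and ignores any
    // further "_"-separated tokens, so a trailing epoch suffix is skipped.
    static SketchContainerId parse(String s) {
        String[] t = s.split("_");
        if (t.length < 5 || !t[0].equals("container")) {
            throw new IllegalArgumentException("Invalid container id: " + s);
        }
        return new SketchContainerId(
                Long.parseLong(t[1]),
                Integer.parseInt(t[2]),
                Integer.parseInt(t[3]),
                Integer.parseInt(t[4]));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchContainerId)) {
            return false;
        }
        SketchContainerId other = (SketchContainerId) o;
        return clusterTimestamp == other.clusterTimestamp
                && appId == other.appId
                && attemptId == other.attemptId
                && containerSeq == other.containerSeq;
    }

    @Override
    public int hashCode() {
        return Objects.hash(clusterTimestamp, appId, attemptId, containerSeq);
    }

    public static void main(String[] args) {
        SketchContainerId plain = parse("container_0_0000_00_000000");
        SketchContainerId suffixed = parse("container_0_0000_00_000000_uuid_rm1");
        System.out.println(plain.equals(suffixed)); // prints true
    }
}
```

A parser of this shape accepts the suffixed string but also discards the epoch, so two ids that differ only in epoch compare equal until the parser is taught to read the new field.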

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>
> Container ids are made unique by using the app identifier and appending a monotonically
> increasing sequence number to it. Since container creation is a high churn activity the RM
> does not store the sequence number per app. So after restart it does not know what the new
> sequence number should be for new allocations.
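
The problem described above can be reproduced with a small simulation. This is only a sketch under assumptions: the {{Allocator}} class, the id layout, and the epoch value {{uuid_rm2}} are hypothetical and do not reflect the actual RM implementation.

```java
import java.util.HashMap;
import java.util.Map;

final class RestartCollisionSketch {
    // Hypothetical allocator: per-app sequence numbers live only in memory,
    // so a new instance (simulating an RM restart) starts counting from 1 again.
    static final class Allocator {
        private final Map<String, Integer> nextSeq = new HashMap<>();
        private final String epoch; // null means no epoch suffix

        Allocator(String epoch) {
            this.epoch = epoch;
        }

        String allocate(String appAttempt) {
            // First call for a key yields 1, later calls increment the stored value.
            int seq = nextSeq.merge(appAttempt, 1, Integer::sum);
            String id = String.format("container_%s_%06d", appAttempt, seq);
            return epoch == null ? id : id + "_" + epoch;
        }
    }

    public static void main(String[] args) {
        String app = "1_0001_01";

        // Without an epoch, the id issued after the restart repeats an old one.
        String before = new Allocator(null).allocate(app);
        String after = new Allocator(null).allocate(app);
        System.out.println(before.equals(after)); // prints true

        // With a distinct epoch per RM incarnation, the ids differ.
        String e1 = new Allocator("uuid_rm1").allocate(app);
        String e2 = new Allocator("uuid_rm2").allocate(app);
        System.out.println(e1.equals(e2)); // prints false
    }
}
```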



--
This message was sent by Atlassian JIRA
(v6.2#6252)
