hadoop-yarn-issues mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-9126) Container reinit always fails in branch-3.2 and trunk
Date Tue, 18 Dec 2018 01:55:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723580#comment-16723580 ]

Chandni Singh commented on YARN-9126:
-------------------------------------

Two changes caused the issue:
- YARN-7644: the cleanup of the working directory is done asynchronously.
- YARN-8569: this introduced a sysfs directory in the container's working directory, which
needs to be deleted during cleanup of the working directory.
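
To illustrate the failure mode, here is a minimal, self-contained Java sketch. It is not
the NodeManager code and not the attached patch. The class name ReinitCleanupSketch, the
methods localize and deleteRecursively, and the file names are all hypothetical. It only
shows the general pattern: if stale files (including a sysfs subdirectory) are still in
the working directory when resources are copied again, the copy hits "file already
exists"; once the directory tree has been fully removed, the same copy succeeds.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names here are illustrative, not YARN APIs.
public class ReinitCleanupSketch {

    // Recursively delete a directory tree, children before parents.
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse lexicographic order puts nested entries before their parents.
            ordered = paths.sorted(Comparator.reverseOrder())
                           .collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    // Stand-in for resource localization: create a file in the working directory.
    // Returns false if the target already exists.
    public static boolean localize(Path workDir, String name) throws IOException {
        try {
            Files.createDirectories(workDir);
            Files.createFile(workDir.resolve(name));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("container-workdir");

        // First launch: a sysfs subdirectory (file name assumed) plus one resource.
        Files.createDirectories(workDir.resolve("sysfs"));
        Files.createFile(workDir.resolve("sysfs").resolve("example.json"));
        localize(workDir, "resource.txt");

        // Reinit while the old contents are still present: the copy is rejected.
        System.out.println(localize(workDir, "resource.txt")); // prints false

        // Reinit after the whole tree, sysfs included, has been removed.
        deleteRecursively(workDir);
        System.out.println(localize(workDir, "resource.txt")); // prints true

        deleteRecursively(workDir);
    }
}
```

In this sketch the cleanup is called synchronously before the second copy. An asynchronous
cleanup would need to be awaited, or otherwise ordered before relocalization, to give the
same guarantee.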

Attached is patch 001. [~eyang], could you please take a look?

> Container reinit always fails in branch-3.2 and trunk
> -----------------------------------------------------
>
>                 Key: YARN-9126
>                 URL: https://issues.apache.org/jira/browse/YARN-9126
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: docker
>         Attachments: YARN-9126.001.patch
>
>
> When upgrading a container, container reinitialization always fails with code 33. This
> error code means that the file being localized already exists when resource files are
> copied. The container will retry with another container ID, hence the problem is masked.
> The Hadoop 3.1.x relaunch logic seems to have some way to prevent this bug from happening.
> The same logic might be useful in branch-3.2 and trunk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

