hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-15284) Docker launch fails when user private filecache directory is missing
Date Mon, 05 Mar 2018 16:31:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-15284:
    Affects Version/s: 3.1.0
              Summary: Docker launch fails when user private filecache directory is missing
 (was: Could not determine real path of mount)

ContainerLocalizer, which is run for every user-specific localization (i.e., PRIVATE and APPLICATION
visibility), creates both the usercache/_user_/filecache and usercache/_user_/appcache directories
whenever it runs (see ContainerLocalizer#initDirs).
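
As a rough illustration of the directory layout described above, here is a minimal Java sketch.
It is not the actual ContainerLocalizer#initDirs code: the class name UserCacheDirs, the method
signature, and the use of java.nio.file are assumptions made only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop.
final class UserCacheDirs {
    static final String USERCACHE = "usercache";
    static final String FILECACHE = "filecache";
    static final String APPCACHE = "appcache";

    /**
     * Creates usercache/<user>/filecache and usercache/<user>/appcache
     * under every configured local directory.
     */
    static void initDirs(List<Path> localDirs, String user) throws IOException {
        for (Path localDir : localDirs) {
            Path userDir = localDir.resolve(USERCACHE).resolve(user);
            // createDirectories also creates missing parents and does
            // nothing if the directory already exists.
            Files.createDirectories(userDir.resolve(FILECACHE));
            Files.createDirectories(userDir.resolve(APPCACHE));
        }
    }
}
```

If something like this runs for a user, both directories exist afterwards, so a missing filecache
suggests that no such step ran for that user.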

If this directory is missing, then I'm wondering if this is a case where _nothing_ was localized
for this user, not just PRIVATE but also no APPLICATION visibility resources (i.e., only public
resources or no resources at all).  The only reason this would have worked before YARN-7815
is because the container executor creates the container work directory which exists under
the usercache/_user_ directory, and that's what it used to mount before the changes in YARN-7815.
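
To make the failure mode concrete, the sketch below shows why a mount source that was never
created cannot be resolved, even though its parent usercache/_user_ directory exists. The error
in the report comes from the native container-executor; this Java version only mimics that kind
of check with Path#toRealPath. MountPathCheck and realPathOf are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Hadoop.
final class MountPathCheck {
    /**
     * Returns the real (symlink-resolved, absolute) path of a mount source,
     * or an empty Optional if the path does not exist.
     */
    static Optional<Path> realPathOf(Path mount) throws IOException {
        try {
            return Optional.of(mount.toRealPath());
        } catch (NoSuchFileException e) {
            // The directory was never created, so there is nothing to resolve.
            return Optional.empty();
        }
    }
}
```

With only a container work directory created under usercache/_user_/appcache, realPathOf on
usercache/_user_/filecache comes back empty, while the same call on usercache/_user_ succeeds.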

> Docker launch fails when user private filecache directory is missing
> --------------------------------------------------------------------
>                 Key: HADOOP-15284
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15284
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Priority: Major
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_000020]: [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_000020
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
> Error constructing docker command, docker error code=12, error message='Invalid docker
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_000020//container_1520032931921_0001_01_000020.pid in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down now...
> {code}
> The filecache cannot be mounted because it doesn't exist.

This message was sent by Atlassian JIRA

