mesos-issues mailing list archives

From "Wang Qiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-5028) Copy provisioner cannot replace directory with symlink
Date Mon, 20 Nov 2017 09:26:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258999#comment-16258999 ]

Wang Qiang commented on MESOS-5028:
-----------------------------------

I am not sure whether I should post this here, but I still seem to hit a similar issue with Mesos 1.2.1.

I am trying to follow the Mesos GPU tutorial (using UCR), but the container fails to launch with:

```
message: 'Failed to launch container: Collect failed: Failed to remove the entries under the directory labeled as opaque whiteout '/data/mesos/slave/provisioner/containers/d7651be4-5eb9-4973-86e5-e018e207a327/backends/copy/rootfses/d9515065-5806-4a19-82f5-eb80ddb040bf/usr/local/cuda-9.0': No such file or directory'
```

The image I am using is nvidia/cuda.

The Dockerfile for the CUDA image contains the following:

```
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-cudart-$CUDA_PKG_VERSION && \
    ln -s cuda-9.0 /usr/local/cuda && \
    rm -rf /var/lib/apt/lists/*
```
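
To narrow down a failure like this, it may help to check which directories a layer marks as opaque and which symlinks it carries. The following is only a diagnostic sketch: `list_layer_markers` is a hypothetical helper, the demo directory stands in for an extracted layer, and it assumes the layer keeps AUFS-style `.wh..wh..opq` marker files as Docker/OCI layers do.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): list directories marked as
# opaque whiteouts and all symlinks found in an extracted layer directory.
list_layer_markers() {
    layer_rootfs=$1
    # A directory is opaque when it contains a file named `.wh..wh..opq`.
    find "$layer_rootfs" -name '.wh..wh..opq' -exec dirname {} \; |
        sed 's|^|opaque: |'
    # Print each symlink with its stored (possibly relative) target.
    find "$layer_rootfs" -type l | while IFS= read -r link; do
        printf 'symlink: %s -> %s\n' "$link" "$(readlink "$link")"
    done
}

# Demo on a throwaway directory that mimics the CUDA layer layout.
demo=$(mktemp -d)
mkdir -p "$demo/usr/local/cuda-9.0"
: > "$demo/usr/local/cuda-9.0/.wh..wh..opq"
ln -s cuda-9.0 "$demo/usr/local/cuda"
markers=$(list_layer_markers "$demo")
printf '%s\n' "$markers"
```

On a real agent, the argument would be a layer's `rootfs` directory under the Docker store; that path depends on the agent configuration.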



> Copy provisioner cannot replace directory with symlink
> ------------------------------------------------------
>
>                 Key: MESOS-5028
>                 URL: https://issues.apache.org/jira/browse/MESOS-5028
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>            Reporter: Zhitao Li
>            Assignee: Chun-Hung Hsiao
>             Fix For: 1.1.2, 1.2.1, 1.3.0
>
>
> I'm trying to play with the new image provisioner on our custom Docker images, but one of the layers failed to get copied, possibly due to a dangling symlink.
> Error log with GLOG_v=1:
> {quote}
> I0324 05:42:48.926678 15067 copy.cpp:127] Copying layer path '/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs' to rootfs '/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6'
> E0324 05:42:49.028506 15062 slave.cpp:3773] Container '5f05be6c-c970-4539-aa64-fd0eef2ec7ae' for executor 'test' of framework 75932a89-1514-4011-bafe-beb6a208bb2d-0004 failed to start: Collect failed: Collect failed: Failed to copy layer: cp: cannot overwrite directory ‘/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6/etc/apt’ with non-directory
> {quote}
> _/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs/etc/apt_ is a symlink that points to a nonexistent absolute path (I cannot provide the exact path, but it is the result of us trying to mount apt keys into the Docker container at build time).
> I believe what happened is that we executed a script at build time that contains the equivalent of:
> {quote}
> rm -rf /etc/apt/* && ln -sf /build-mount-point/ /etc/apt
> {quote}
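
The failure described in the issue can be reproduced outside Mesos with two throwaway directories standing in for image layers. This is a sketch under assumptions: the `lower`, `upper`, and `rootfs` names are illustrative, `cp -a` is assumed to behave like GNU coreutils (which reports the "cannot overwrite directory ... with non-directory" error quoted above), and the retry step is just one possible workaround, not necessarily how the Mesos fix works.

```shell
#!/bin/sh
# Illustrative layout: `lower` has etc/apt as a directory, `upper`
# replaces it with a dangling symlink, `rootfs` is the copy target.
work=$(mktemp -d)
mkdir -p "$work/lower/etc/apt" "$work/upper/etc" "$work/rootfs"
: > "$work/lower/etc/apt/sources.list"
ln -s /build-mount-point/ "$work/upper/etc/apt"

# Copy the layers in order, the way a copy-based provisioner might.
cp -a "$work/lower/." "$work/rootfs/"
if cp -a "$work/upper/." "$work/rootfs/" 2>"$work/cp.err"; then
    copy_status=ok
else
    copy_status=failed
    cat "$work/cp.err"
fi

if [ "$copy_status" = failed ]; then
    # Possible workaround (an assumption): remove any destination
    # directory that the upper layer replaces with a non-directory,
    # then retry the copy.
    (cd "$work/upper" && find . ! -type d) | while IFS= read -r entry; do
        dst="$work/rootfs/$entry"
        if [ -d "$dst" ] && [ ! -L "$dst" ]; then
            rm -rf "$dst"
        fi
    done
    cp -a "$work/upper/." "$work/rootfs/"
fi

ls -l "$work/rootfs/etc"
```

After the retry, `rootfs/etc/apt` should be the symlink from the upper layer rather than the directory from the lower one.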



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
