hadoop-yarn-issues mailing list archives

From "Zhankun Tang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-3854) Add localization support for docker images
Date Thu, 21 Jul 2016 04:09:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387117#comment-15387117 ]

Zhankun Tang edited comment on YARN-3854 at 7/21/16 4:08 AM:
-------------------------------------------------------------

[~shanekumpf@gmail.com], thanks. I totally agree with your goals for Docker image localization,
and I am fine with either the HDFS distributed cache or an HDFS-backed private repo.
I also mentioned the private repo approach in the doc, noting that the doc and patch are a solution
for users who don't want to maintain a private repo. I believe a well-maintained Docker
private repo will be a good choice for many people, and it doesn't need YARN to do any extra
work.

As for a docker pull that happens while someone is pushing a new version of the image, I think
that is a rolling update problem. The new version should have a new tag, and the administrator
can then manually perform a rolling update of the application.

Let's get back to this patch. In essence, "HDFS + save/load" tries to mimic a private Docker
repo. There are two parts of the whole process to consider, and because of its simplicity this
patch brings extra steps/issues to each of them:
* 1. Docker image generation and upload to storage
** This patch needs an *extra "docker save" step* at image generation compared with Docker.
It also needs the *application to remember the URI*, whereas with Docker one only needs to know
the tag name after the image is uploaded to storage.
* 2. Image distribution/localization
** This patch *distributes a tar file* through the distributed cache, so it is hard to speed
up distribution by downloading only the delta as docker pull does, and it consumes more network
bandwidth.
** A big issue is that this patch carries a security risk, as mentioned by [~sidharta-s]. Thanks,
Sidharta, for pointing this out; I hadn't realized it before. Because of potential tag name
conflicts, different users may replace each other's Docker images. Currently we cannot avoid this,
because YARN has no way to distinguish the tag names of two Docker image tar files. YARN only knows
that this is a Docker image tar file; it cannot know whether loading it will replace someone
else's image. Although "docker push" does not check for tag name conflicts either, the administrator
can avoid such conflicts when pushing so that each image has a unique tag name. In any case, this
patch does open a hole that lets a user attack existing Docker images. One way to solve this is
to add an option to Docker that refuses to force the load if the tag name already exists.
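
One possible way to detect such a conflict before loading is sketched below. It is not something
the patch does, only an illustration. It assumes the tar file contains a top-level manifest.json
with a "RepoTags" list per image, as written by "docker save" in recent Docker versions. The
function names are hypothetical, and the set of tags already present on the node is assumed to be
obtained elsewhere.

```python
import json
import tarfile


def tags_in_saved_image(fileobj):
    """Return the repo:tag names recorded in a "docker save" tarball.

    Assumption: the tarball has a top-level manifest.json whose entries
    carry a "RepoTags" list (it may be null for untagged images).
    """
    with tarfile.open(fileobj=fileobj) as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
    tags = []
    for entry in manifest:
        tags.extend(entry.get("RepoTags") or [])
    return tags


def conflicting_tags(fileobj, existing_tags):
    """Hypothetical pre-load check: tags in the tarball that already exist
    on the node and would therefore be replaced by a load."""
    return sorted(set(tags_in_saved_image(fileobj)) & set(existing_tags))
```

A caller could skip or reject the load whenever conflicting_tags returns a non-empty list. This
only reports the conflict; it does not decide whose image should win.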

To sum up, this patch eliminates the need to set up a private repo, but it brings extra work
for the admin/application and carries a potential risk due to the attack surface of docker load.
I'll raise this issue with Docker. Thanks again, folks. I also think we should be clearer about
the motivation of this JIRA, [~sidharta-s]. Thoughts?







> Add localization support for docker images
> ------------------------------------------
>
>                 Key: YARN-3854
>                 URL: https://issues.apache.org/jira/browse/YARN-3854
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Zhankun Tang
>         Attachments: YARN-3854-branch-2.8.001.patch, YARN-3854_Localization_support_for_Docker_image_v1.pdf, YARN-3854_Localization_support_for_Docker_image_v2.pdf
>
>
> We need the ability to localize images from HDFS and load them for use when launching
> docker containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

