From yarn-issues-return-144529-archive-asf-public=cust-asf.ponee.io@hadoop.apache.org  Wed May  9 01:38:05 2018
Return-Path: <yarn-issues-return-144529-archive-asf-public=cust-asf.ponee.io@hadoop.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by mx-eu-01.ponee.io (Postfix) with SMTP id CABD1180674
	for <archive-asf-public@cust-asf.ponee.io>; Wed,  9 May 2018 01:38:04 +0200 (CEST)
Received: (qmail 51721 invoked by uid 500); 8 May 2018 23:38:03 -0000
Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:yarn-issues-help@hadoop.apache.org>
List-Unsubscribe: <mailto:yarn-issues-unsubscribe@hadoop.apache.org>
List-Post: <mailto:yarn-issues@hadoop.apache.org>
List-Id: <yarn-issues.hadoop.apache.org>
Delivered-To: mailing list yarn-issues@hadoop.apache.org
Received: (qmail 51710 invoked by uid 99); 8 May 2018 23:38:03 -0000
Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 May 2018 23:38:03 +0000
Received: from localhost (localhost [127.0.0.1])
	by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 63746C5DBE
	for <yarn-issues@hadoop.apache.org>; Tue,  8 May 2018 23:38:03 +0000 (UTC)
X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org
X-Spam-Flag: NO
X-Spam-Score: -110.301
X-Spam-Level:
X-Spam-Status: No, score=-110.301 tagged_above=-999 required=6.31
	tests=[ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_MED=-2.3,
	SPF_PASS=-0.001, USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100]
	autolearn=disabled
Received: from mx1-lw-us.apache.org ([10.40.0.8])
	by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024)
	with ESMTP id N_e4Q9PzVlbb for <yarn-issues@hadoop.apache.org>;
	Tue,  8 May 2018 23:38:02 +0000 (UTC)
Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139])
	by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 0514B5FBBA
	for <yarn-issues@hadoop.apache.org>; Tue,  8 May 2018 23:38:02 +0000 (UTC)
Received: from jira-lw-us.apache.org (unknown [207.244.88.139])
	by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 081C6E12FE
	for <yarn-issues@hadoop.apache.org>; Tue,  8 May 2018 23:38:00 +0000 (UTC)
Received: from jira-lw-us.apache.org (localhost [127.0.0.1])
	by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 4D251212A1
	for <yarn-issues@hadoop.apache.org>; Tue,  8 May 2018 23:38:00 +0000 (UTC)
Date: Tue, 8 May 2018 23:38:00 +0000 (UTC)
From: "Eric Yang (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13124802.1513203439000.26506.1525822680313@Atlassian.JIRA>
In-Reply-To: <JIRA.13124802.1513203439000@Atlassian.JIRA>
References: <JIRA.13124802.1513203439000@Atlassian.JIRA> <JIRA.13124802.1513203439395@jira-lw-us.apache.org>
Subject: [jira] [Commented] (YARN-7654) Support ENTRY_POINT for docker
 container
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468111#comment-16468111 ] 

Eric Yang commented on YARN-7654:
---------------------------------

[~jlowe] {quote}I'll try to find time to take a closer look at this patch tomorrow, but I'm wondering if we really need to separate the detached vs. foreground launching for override vs. entry-point containers. The main problem with running containers in the foreground is that we have no idea how long it takes to actually start a container. As I mentioned above, any required localization for the image is likely to cause the container launch to fail due to docker inspect retries hitting the retry limit and failing, leaving the container uncontrolled or at best finally killed sometime later if Shane's lifecycle changes cause the container to get recognized long afterwards and killed.{quote}

The detach option only obtains a container ID; the container process continues to update its information in the background.  We call docker inspect by name reference instead of by container ID.  From docker inspect's point of view, detach does not produce a more accurate result than running in the foreground, because operations issued through the docker CLI are asynchronous calls to the docker daemon's REST API.  JSON output from docker inspect may contain partial information.  Since we know exactly which information to parse, retrying provides a better success rate.  For ENTRY_POINT, docker runs in the foreground to capture stdout and stderr of the ENTRY_POINT process without relying on mounting the host log directory into the docker container.  This prevents the host log path from showing up inside the container, which may look odd to users.
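
To illustrate the retry idea, here is a minimal Python sketch, not the code in the patch.  It assumes the field we need is State.Pid from the docker inspect JSON; the helper names run_docker_inspect and wait_for_pid, and the attempts and delay defaults, are made up for this example.  Partial or unparsable output is treated as "not ready yet" and retried.

```python
import json
import subprocess
import time


def run_docker_inspect(name):
    """Run `docker inspect <name>` and return its raw stdout, or "" on failure."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True,
        text=True,
    )
    return result.stdout if result.returncode == 0 else ""


def wait_for_pid(name, inspect=run_docker_inspect, attempts=10, delay=0.5):
    """Poll inspect output until State.Pid is a positive integer.

    Returns the pid, or None when every attempt yields missing or partial data.
    The attempts and delay values are illustrative, not tuned.
    """
    for attempt in range(attempts):
        raw = inspect(name)
        try:
            entries = json.loads(raw)
            pid = entries[0]["State"]["Pid"]
        except (ValueError, LookupError, TypeError):
            # Empty, truncated, or incomplete JSON: treat as not ready yet.
            pid = None
        if isinstance(pid, int) and pid > 0:
            return pid
        if attempt < attempts - 1:
            time.sleep(delay)
    return None
```

The product of attempts and delay acts as the overall timeout after which the container would be marked as failed.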

{quote}I think a cleaner approach would be to always run containers as detached, so when the docker run command returns we will know the docker inspect command will work. If I understand correctly, the main obstacle to this approach is finding out what to do with the container's standard out and standard error streams which aren't directly visible when the container runs detached. However I think we can use the docker logs command after the container is launched to reacquire the container's stdout and stderr streams and tie them to the intended files. At least my local experiments show docker logs is able to obtain the separate stdout and stderr streams for containers whether they were started detached or not. Thoughts?{quote}

If we want to run in the background, then we again have problems capturing logs, based on issues found in prior meetings.

# The docker logs command shows logs from the beginning of the launch up to the point where it was invoked.  Without frequent calls to docker logs, we don't get the complete log.  Calling docker logs with fork and exec is more expensive than reading a local log file.  If we use the --tail option, it is still one extra fork, plus managing the child process's liveness and resource usage.  This complicates how resource usage should be computed.
# docker logs does not seem to separate stdout from stderr.  [This issue|https://github.com/moby/moby/issues/7440] is unresolved in docker.  This differs from YARN log file management.  It would be nice to follow the YARN approach to make the output less confusing in many situations.

After many experiments, I settled on foreground and dup for simplicity.  Retrying docker inspect while the container runs in the foreground is a valid concern.  However, there is a way to find a reasonable timeout value to decide whether a docker container should be marked as failed.
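
As a rough illustration of "foreground and dup", here is a small Python sketch, not the container-executor code.  The function name run_in_foreground and the file paths are hypothetical.  The child inherits the two file descriptors as its stdout and stderr, so the streams stay separate and the parent copies nothing.

```python
import subprocess


def run_in_foreground(command, stdout_path, stderr_path):
    """Run command to completion with stdout and stderr sent to separate files.

    Returns the exit code of the command.
    """
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        # Popen installs these descriptors as the child's fd 1 and fd 2
        # (via dup2 on POSIX), so the child writes to the files directly.
        process = subprocess.Popen(command, stdout=out, stderr=err)
        return process.wait()
```

For a container, the command could be something like ["docker", "run", "--name", name, image] without the detach flag, so the docker CLI stays attached and relays the ENTRY_POINT process output to the two files.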


> Support ENTRY_POINT for docker container
> ----------------------------------------
>
>                 Key: YARN-7654
>                 URL: https://issues.apache.org/jira/browse/YARN-7654
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>              Labels: Docker
>         Attachments: YARN-7654.001.patch, YARN-7654.002.patch, YARN-7654.003.patch, YARN-7654.004.patch, YARN-7654.005.patch, YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, YARN-7654.009.patch, YARN-7654.010.patch, YARN-7654.011.patch, YARN-7654.012.patch, YARN-7654.013.patch, YARN-7654.014.patch, YARN-7654.015.patch, YARN-7654.016.patch, YARN-7654.017.patch, YARN-7654.018.patch, YARN-7654.019.patch, YARN-7654.020.patch
>
>
> A Docker image may have ENTRY_POINT predefined, but this is not supported in the current implementation.  It would be nice if we could detect the existence of {{launch_command}} and, based on this variable, launch the docker container in different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org