hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7654) Support ENTRY_POINT for docker container
Date Wed, 09 May 2018 22:32:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469613#comment-16469613
] 

Jason Lowe commented on YARN-7654:
----------------------------------

In the interest in trying to get this into 3.1, I'm OK with going with the foreground approach
for entry point for now as long as the pre-existing containers launch as before (which appears
to be the case with the patch).  Changing the entry-point launch logic is something we should
be able to update in the future since it's details are hidden from the user.

Comments on the latest patch:

In DockerProviderService#buildContainerLaunchContext it's calling processArtifact then super.buildContainerLaunchContext,
but the parent's buildContainerLaunchContext calls processArtifact as well.  Is the double-call
intentional?

It looks like the parent is called but in the end the result is smashed by the new setCommand
method, and much of the parent method's code was replicated in this method.  Rather than requiring
a setCommand interface be added to AbstractLauncher to clobber work previously done, would
it make more sense to refactor AbstractProviderService#buildContainerLaunchContext so the
pieces needed by DockerProviderService can be reused without requiring the launcher command
to be clobbered afterwards?

globalTokens and tokensForSubstitution are unconditionally computed but only needed if the
launchCommand is empty.  The computation should be moved inside the conditional.

typo: "Base on discussion" s/b "Based on the discussion"

ENV_DOCKER_COTAINER_ENV_FILE is not used and should be removed.

DockerLinuxContainerRuntime imports ENV_DOCKER_CONTAINER_RUN_OVERRIDE_DISABLE from itself.
dockerOverride is true when the override disable flag is true which is confusing.  That makes
me think dockerOverride actually means "use entry point" but the name implies it is true when
we are overriding docker and _not_ using the entry point.  A name like useEntryPoint would
make the code much easier to follow if that's indeed what the boolean means.  One example
of confusing code that looks like the logic is backwards:
{code}
    if (dockerOverride) {
       LOG.info("command override disabled");
       runCommand.setOverrideDisabled(true);
{code}

Nit: The patch is doing the double-env lookup pattern again.  Would be helpful to have a utility
method like getEnvBoolean that can be used to lookup environment variables whose values are
treated like booleans.

Is it necessary to do the following in isolation or can it be folded into the primary conditional
that does entry point logic vs. override logic a bit below?
{code}
    if (!dockerOverride) {
      runCommand.setContainerWorkDir(containerWorkDir.toString());
    }
{code}

Should we order the environment by dependencies as we do for normal container launches?  I
see DockerRunCommand is using a TreeMap, but would a LinkedHashMap make more sense?  Then
we're ordering things based on the order they were added.

DockerClient is creating the environment file in /tmp which has the same leaking problem we
had with the docker .cmd files.

writeEnvFile should use try-with-resources to make sure the writer is always closed even when
something throws.

writeCommandToTempFile should use try-with-resources so the printWriter is always closed rather
than placing explicit close calls on certain errors.

The instance checking and downcasting in writeCommandToTempFile looks pretty ugly.  It would
be cleaner to encapsulate this in the DockerCommand abstraction.  One example way to do this
is to move the logic of writing a docker command file into the DockerCommand abstract class.
 DockerRunCommand can then override that method to call the parent method and then separately
write the env file.  Worst case we can add a getEnv method to DockerCommand that returns the
collection of environment variables to write out for a command.  DockerCommand would return
null or an empty collection while DockerRunCommand can return its environment.

There's no need to have a containsEnv method and a getEnv method, especially when getEnv is
cheap.

Nit: It would be nice if DockerRunCommand had a method to add a Collection of environment
variables rather than forcing the caller to iterate themselves.

Nit: The following should just be {{String value = Boolean.toString(toggle)}}
{code}
    String value = "false";
    if (toggle) {
      value = "true";
    }
{code}

init_log_path should free tmp_buffer before setting it to NULL.

use_entry_point returns false when the entry point should be used.

The code is now writing "Launching docker container..." etc. even when not using the entry
point.  Are these smashed by the container_launch.sh script when not using the entry point?
 If not it could be an issue since it's changing what the user's code is writing to those
files today.

It would be be nice to factor out the entry-point-specific code path to do the docker inspect
with retries logic into a separate function for readability.  It looks like it would come
out pretty cleanly.



> Support ENTRY_POINT for docker container
> ----------------------------------------
>
>                 Key: YARN-7654
>                 URL: https://issues.apache.org/jira/browse/YARN-7654
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>              Labels: Docker
>         Attachments: YARN-7654.001.patch, YARN-7654.002.patch, YARN-7654.003.patch, YARN-7654.004.patch,
YARN-7654.005.patch, YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, YARN-7654.009.patch,
YARN-7654.010.patch, YARN-7654.011.patch, YARN-7654.012.patch, YARN-7654.013.patch, YARN-7654.014.patch,
YARN-7654.015.patch, YARN-7654.016.patch, YARN-7654.017.patch, YARN-7654.018.patch, YARN-7654.019.patch,
YARN-7654.020.patch, YARN-7654.021.patch
>
>
> Docker image may have ENTRY_POINT predefined, but this is not supported in the current
implementation.  It would be nice if we can detect existence of {{launch_command}} and base
on this variable launch docker container in different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message