hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-8207) Docker container launch use popen have risk of shell expansion
Date Fri, 04 May 2018 14:25:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463048#comment-16463048
] 

Jason Lowe edited comment on YARN-8207 at 5/4/18 2:24 PM:
----------------------------------------------------------

Thanks for updating the patch!  It's looking much better.  I didn't finish getting through
the patch, but here's what I have so far.

MAX_RETRIES is unused and should be removed.

chosen_container_log_dir and init_log_dir are not used and should be removed.  In doing so
we'll need to go back to freeing container_log_dir directly.

Nit: It would improve readability to use a typedef for the struct so we don't have to keep
putting "struct" everywhere it's used, e.g.:
{code}
typedef struct args {
  [...]
} args_t;
{code}
or
{code}
typedef struct args args_t;
{code}

Nit: "out" is an odd name for a field in the args struct for the array of arguments.  Maybe
just "args"?  Similarly maybe "index" should be something like "num_args" since that's what
it is representing in the structure.

run_docker frees the vector of arguments but not the arguments themselves.

The following comment was updated but the permissions still appear to be 0700 in practice
(and should be all that is required)?
{noformat}
-  // Copy script file with permissions 700
+  // Copy script file with permissions 751
{noformat}

If the {{fork}} fails in launch_docker_container_as_user it would be good to print strerror(errno)
to the error file so there's a clue as to the nature of the error.

Is there a reason not to use stpcpy in flatten?  It would simplify it quite a bit and eliminate
the pointer arithmetic.

util.c has only whitespace changes.

docker-util.c added limits.h, but I don't think it was necessary.

add_to_args should be checking >= DOCKER_ARGS_MAX otherwise it will allow one more arg
than the buffer can hold.

add_to_args silently ignores (and leaks) the cloned argument if args->out is NULL.  args->out
should not be null in practice.  If a null check is deemed useful then it should be at the
beginning before work is done and return an error to indicate the arg was not added.   In
any case it should only increment the arg count when an argument was added.

free_args should set the argument count back to zero in case someone wants to reuse the structure
to build another argument list.

free_args should null out each argument pointer after it frees it.  The {{args->out[i]
= NULL}} statement should be inside the {{for}} loop or it just nulls out the element after
the last which isn't very useful.

This previous comment still applies.  Even though we are adjusting the index the arg is not
freed and nulled so it will end up being sent as an argument if the args buffer is subsequently
used (unless another argument is appended and smashes it).
bq. add_param_to_command_if_allowed is trying to reset the index on errors, but it fails to
re-NULL out the written index values (if there were any). Either we should assume all errors
are fatal and therefore the buffer doesn't need to be reset or the reset logic needs to be
fixed.

Why do many of the get docker command functions smash the args count to zero?  IMHO the caller
should be responsible for initializing the args structure.  At best the get docker command
functions should be calling free_args rather than smashing the arg count, otherwise they risk
leaking any arguments that were filled in.

get_docker_volume_command cleans up the arg structure on error but many other get_docker_\*_command
functions simply return with the args partially initialized on error.  This should be consistent.

get_docker_load_command should unconditionally call {{free(docker)}} then check the return
code for error since both code paths always call {{free(docker)}}.  Similar comment for get_docker_rm_command,
get_docker_stop_command, get_docker_kill_command, get_docker_run_command, 

get_docker_kill_command allocates a 40-byte buffer to signal_buffer then immediately leaks
it.

Why does add_mounts cast string literals to (char*)?  It compiles for me with ro_suffix remaining
const char*.  If for some reason they need to be char* to call make_string then it would be
simpler to cast it at the call point rather than at each initialization.

Nit: The end of get_mount_source should be simplified to just {{return strndup(mount, len);}}

reset_args should call free_args or old arguments will be leaked.

reset_args does not clear the memory that was allocated, so when we try to use args->out
as an array terminated by a NULL pointer when we call exec it may not actually be properly
terminated.  It should call calloc instead of malloc.

reset_args needs to allocate DOCKER_ARG_MAX + 1 pointers in order to hold DOCKER_ARG_MAX arguments
and still leave room for the NULL pointer terminator.

make_string does not check for vsnprintf returning an error.



was (Author: jlowe):
Thanks for updating the patch!  It's looking much better.  I didn't finish getting through
the patch, but here's what I have so far.

MAX_RETRIES is unused and should be removed.

chosen_container_log_dir and init_log_dir are not used and should be removed.  In doing so
we'll need to go back to freeing container_log_dir directly.

Nit: It would improve readability to use a typedef for the struct so we don't have to keep
putting "struct" everywhere it's used, e.g.:
{code}
typedef struct args {
  [...]
} args_t;
{code}
or
{code}
typedef struct args args_t;
{code}

Nit: "out" is an odd name for a field in the args struct for the array of arguments.  Maybe
just "args"?  Similarly maybe "index" should be something like "num_args" since that's what
it is representing in the structure.

run_docker frees the vector of arguments but not the arguments themselves.

The following comment was updated but the permissions still appear to be 0700 in practice
(and should be all that is required)?
{noformat}
-  // Copy script file with permissions 700
+  // Copy script file with permissions 751
{noformat}

If the {{fork}} fails in launch_docker_container_as_user it would be good to print strerror(errno)
to the error file so there's a clue as to the nature of the error.

Is there a reason not to use stpcpy in flatten?  It would simplify it quite a bit and eliminate
the pointer arithmetic.

util.c has only whitespace changes.

docker-util.c added limits.h, but I don't think it was necessary.

add_to_args should be checking >= DOCKER_ARGS_MAX otherwise it will allow one more arg
than the buffer can hold.

add_to_args silently ignores (and leaks) the cloned argument if args->out is NULL.  args->out
should not be null in practice.  If a null check is deemed useful then it should be at the
beginning before work is done and return an error to indicate the arg was not added.   In
any case it should only increment the arg count when an argument was added.

free_args should set the argument count back to zero in case someone wants to reuse the structure
to build another argument list.

free_args should null out each argument pointer after it frees it.  The {{args->out[i]
= NULL}} statement should be inside the {{for}} loop or it just nulls out the element after
the last which isn't very useful.

This previous comment still applies.  Even though we are adjusting the index the arg is not
freed and nulled so it will end up being sent as an argument if the args buffer is subsequently
used (unless another argument is appended and smashes it).
bq. add_param_to_command_if_allowed is trying to reset the index on errors, but it fails to
re-NULL out the written index values (if there were any). Either we should assume all errors
are fatal and therefore the buffer doesn't need to be reset or the reset logic needs to be
fixed.

Why do many of the get_docker_*_command functions smash the args count to zero?  IMHO the
caller should be responsible for initializing the args structure.  At best the get_docker_*_command
functions should be calling free_args rather than smashing the arg count, otherwise they risk
leaking any arguments that were filled in.

get_docker_volume_command cleans up the arg structure on error but many other get_docker_*_command
functions simply return with the args partially initialized on error.  This should be consistent.

get_docker_load_command should unconditionally call {{free(docker)}} then check the return
code for error since both code paths always call {{free(docker)}}.  Similar comment for get_docker_rm_command,
get_docker_stop_command, get_docker_kill_command, get_docker_run_command, 

get_docker_kill_command allocates a 40-byte buffer to signal_buffer then immediately leaks
it.

Why does add_mounts cast string literals to (char*)?  It compiles for me with ro_suffix remaining
const char*.  If for some reason they need to be char* to call make_string then it would be
simpler to cast it at the call point rather than at each initialization.

Nit: The end of get_mount_source should be simplified to just {{return strndup(mount, len);}}

reset_args should call free_args or old arguments will be leaked.

reset_args does not clear the memory that was allocated, so when we try to use args->out
as an array terminated by a NULL pointer when we call exec it may not actually be properly
terminated.  It should call calloc instead of malloc.

reset_args needs to allocate DOCKER_ARG_MAX + 1 pointers in order to hold DOCKER_ARG_MAX arguments
and still leave room for the NULL pointer terminator.

make_string does not check for vsnprintf returning an error.


> Docker container launch use popen have risk of shell expansion
> --------------------------------------------------------------
>
>                 Key: YARN-8207
>                 URL: https://issues.apache.org/jira/browse/YARN-8207
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>    Affects Versions: 3.0.0, 3.1.0, 3.0.1, 3.0.2
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8207.001.patch, YARN-8207.002.patch, YARN-8207.003.patch, YARN-8207.004.patch,
YARN-8207.005.patch
>
>
> Container-executor code utilize a string buffer to construct docker run command, and
pass the string buffer to popen for execution.  Popen spawn a shell to run the command.  Some
arguments for docker run are still vulnerable to shell expansion.  The possible solution is
to convert from char * buffer to string array for execv to avoid shell expansion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message