hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8207) Docker container launch use popen have risk of shell expansion
Date Tue, 08 May 2018 14:42:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467488#comment-16467488

Jason Lowe commented on YARN-8207:

{quote}Struct args is still evolving. I think it would be safer to keep the data structure
private for opaque data structure and deep copy to caller.
Having an opaque data structure that the caller must deep copy to use correctly doesn't make
sense. Deep copies by clients is the antithesis of opacity, since the client must intimately
understand the structure to properly perform the deep copy. Even if the structure isn't changing,
clients are likely to get this copy wrong (as has already occurred), especially if each use
needs it done separately. At a bare minimum there should be a utility method, e.g: {{extract_execv_args(args*
args)}}, that encapsulates the extraction of execv arguments so each use of args for execv
doesn't require a separate, tricky deep copy loop with a trailing NULL appended. That utility
method could also avoid the full deep copy and simply transfer ownership of the individual
strings to the shallow-copied vector that is returned. But in the end I think avoiding copies
completely by heap allocating the args is a much more efficient and elegant solution.

Please create an init function, e.g.: init_args(args* args), or a macro to encapsulate initialization
of the structure. If the structure is likely to change then it's important to put this in
one place rather than chase down every separate initialization code instance.

In {{flatten}} this code:
  to = '\0';
is setting the pointer to NULL rather than terminating the string with a NUL character. It
needs to be:
  *to = '\0';
{quote}At this time, add_to_args returns no opts to avoid having to check for null on make_string.
I'm not sure what is meant by "returns no opts to avoid having to check for null on make_string."
If add_to_args is passed a NULL string then it will segfault because the first statement in
add_to_args tries to dereference it:
static int add_to_args(args *args, const char *string) {
  if (!string[0]) {
{quote}I think the proposal of making the reverse change will add more null pointer check,
which makes the code harder to read again.
What I'm proposing would make the code easier to read without triggering the segfault. Most
uses of add_to_args look something like this:
  char* x = make_string(fmt, ...);
  ret = add_to_args(args, x);
  if (ret != 0) { ...
As add_to_args works today, the lack of a NULL check on the make_string result will cause
the program to crash. I'm proposing to add this check at the beginning of add_to_args:
static int add_to_args(args *args, const char *string) {
  if (string == NULL) {
    return -1;
Then the existing make_string followed by add_to_args code pattern works because add_to_args
will properly handle a NULL argument and return an error. That precludes the need to add NULL
checks everywhere make_string is called when immediately followed by add_to_args.  That improves
readability while avoiding a segfault.

The length != 0 check in reset_args is unnecessary.

This code:
              size_t offset = tmp_ptr - values[i] + 1;
              dst = (char *) alloc_and_clear_memory(offset, sizeof(char));
              strncpy(dst, values[i], offset);
              dst[tmp_ptr - values[i]] = '\0';
              pattern = (char *) alloc_and_clear_memory((size_t)(strlen(permitted_values[j])
- 5), sizeof(char));
              strcpy(pattern, permitted_values[j] + 6);
would be safer and easier to understand written with strdup/strndup, e.g.:
              dst = strndup(values[i], tmp_ptr - values[i]);
              pattern = strdup(permitted_values[j] + 6);
make_string is still not checking for vsnprintf failure. If the first vsnprintf fails and
returns -1, the code will allocate a 0-byte buffer. The second vsnprintf may then actually
succeed, returning the length required to store all of the string without updating the zero-length
buffer. Then make_string will "succeed" by returning an unterminated buffer.

It would be good to fix the tab character issue that's been flagged in the past few patches
(in create_container_directories).

> Docker container launch use popen have risk of shell expansion
> --------------------------------------------------------------
>                 Key: YARN-8207
>                 URL: https://issues.apache.org/jira/browse/YARN-8207
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>    Affects Versions: 3.0.0, 3.1.0, 3.0.1, 3.0.2
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>              Labels: Docker
>         Attachments: YARN-8207.001.patch, YARN-8207.002.patch, YARN-8207.003.patch, YARN-8207.004.patch,
YARN-8207.005.patch, YARN-8207.006.patch, YARN-8207.007.patch, YARN-8207.008.patch
> Container-executor code utilize a string buffer to construct docker run command, and
pass the string buffer to popen for execution.  Popen spawn a shell to run the command.  Some
arguments for docker run are still vulnerable to shell expansion.  The possible solution is
to convert from char * buffer to string array for execv to avoid shell expansion.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

View raw message