impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zoltan Ivanfi (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4006: impala-config.sh contains dangerous rm -rf statements
Date Mon, 22 Aug 2016 19:16:09 GMT
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-4006: impala-config.sh contains dangerous rm -rf statements
......................................................................


Patch Set 1:

(20 comments)

> (12 comments)
 > 
 > Thanks for cleaning things up. Not sure if there is a easier way to
 > fix this other than adding quote everywhere ?

I was deeply motivated to fix this after one of the scripts deleted my whole git checkout
including uncommited changes... Fortunately only a few. :)

Unfortunately there is no better fix to these problems than adding quotes. Please also see
http://www.dwheeler.com/essays/filenames-in-shell.html#doublequote

http://gerrit.cloudera.org:8080/#/c/4078/1/bin/clean.sh
File bin/clean.sh:

> It doesn't look like you quoted any instances of CLUSTER_DIR/IMPALA_HOME et
Done


Line 48: rm -rf "${IMPALA_LOGS_DIR}"/*
> Is this much safer if $IMPALA_LOGS_DIR is empty?
Thanks for spotting this. This is actually even easier to trigger than the one that hit me.


PS1, Line 49: $IMPALA_ALL_LOGS_DIRS
> Won't this line suffer from the same problem ?
Yes and no. On one hand, this is supposed to contain multiple log, so having multiple space-separated
values here is part of the normal workflow. On the other, that also means that any extra space
inside the directory names becomes undistinguishable from the separators. The problem is that
there is simply no safe way of storing multiple filenames in a single variable.

There are many other places in these scripts as well that do not handle special chars correctly
in filenames and some of them are very hard to fix. Refactoring may not be worth the effort
at this point, I just wanted to fix the most dangerous unquoted variable usages and while
doing so I also handled some other ones that I noticed and that were easy to fix.


PS1, Line 53: pushd $IMPALA_HOME/be
> Same problem ?
Done


PS1, Line 59: pushd $IMPALA_HOME/shell
> Same problem ?
Done


PS1, Line 71: rm -f $IMPALA_HOME/llvm-ir/impala*.ll
            : rm -f $IMPALA_HOME/be/generated-sources/impala-ir/*
> Are these lines also unsafe ?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 39:     export IMPALA_HOME=$(dirname "$(cd $(dirname ${(%):-%x}) && pwd)")
> Why redirect for bash but not zsh?
Done


Line 120:   if [[ -z "$KUDU_BUILD_DIR" ]]; then
> Why quote KUDU_BUILD_DIR here but not above?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/impala-python
File bin/impala-python:

Line 2: source "$(dirname "$0")/impala-python-common.sh"
> The nested quoting here seems strange - does this work as intended?
Yes, it does, because the quotes inside $() are not matched agains the quotes outside:

$ bash -x /home/ifi/git/Impala\ ASF/bin/impala-python 
++ dirname '/home/ifi/git/Impala ASF/bin/impala-python'
+ source '/home/ifi/git/Impala ASF/bin/impala-python-common.sh'


http://gerrit.cloudera.org:8080/#/c/4078/1/bin/run-all-tests.sh
File bin/run-all-tests.sh:

PS1, Line 101: LOG_DIR="${IMPALA_EE_TEST_LOGS_DIR}"
> Is this variable still used ?
Done


http://gerrit.cloudera.org:8080/#/c/4078/1/buildall.sh
File buildall.sh:

Line 248:     "$IMPALA_HOME/bin/clean.sh"
> Feel free to ignore but it should be sufficient to just have the quote arou
It's a matter of style, I usually include everything in the quotes unless there are wildcards
involved. Let me know if you prefer the other way.


Line 279: if [ -e $HADOOP_LZO/build/native/Linux-*-*/lib/libgplcompression.so ]
> Did you deliberately omit quoting HADOOP_LZO here?
Done


Line 281:   cp $HADOOP_LZO/build/native/Linux-*-*/lib/libgplcompression.* $HADOOP_HOME/lib/native
> What about HADOOP_HOME?
Done


PS1, Line 312: METASTORE_SNAPSHOT_FILE
> Quote this path too?
Done


PS1, Line 321: $IMPALA_LZO
> Same problem ?
Done


PS1, Line 323: $IMPALA_LZO
> Same problem ?
Done


PS1, Line 323: IMPALA_LZO
> Quote IMPALA_LZO?
Done


Line 400:   if [[ $SNAPSHOT_FILE  && $METASTORE_SNAPSHOT_FILE ]]; then
> Quote the SNAPSHOT_FILE paths?
Done


PS1, Line 407:  ${CREATE_LOAD_DATA_ARGS}
> I guess this will just expand the paths anyway so there may not be any poin
We can't quote it here, because it contains arguments, separated by spaces. This is another
example of a bash variable usage that can not be made safe. The script is doomed to fail at
this point with spec chars in the filenames and refactoring is both non-trivial and error-prone.


Line 429:   echo "   . ${IMPALA_HOME}/bin/impala-config.sh"
> quote IMPALA_HOME?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message