hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-12830) Bash environment for quick command operations
Date Mon, 22 Feb 2016 01:45:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156363#comment-15156363 ]

Allen Wittenauer edited comment on HADOOP-12830 at 2/22/16 1:45 AM:
--------------------------------------------------------------------

* Take a look at https://wiki.apache.org/hadoop/UnixShellScriptProgrammingGuide for some hints
on how the .sh file should be written.  (e.g., HSH_ should be HADOOP_, use the various shell
functions instead of duplicating code for stop/start/etc, don't declare vars in the middle of
the code, declare every var that gets used, ... ).

* As you acknowledge, this won't work on anything but Linux.  So this should fail gracefully
rather than spew errors all over the screen.
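
A minimal sketch of what failing gracefully could look like. The function name hadoop_shell_check_os and the wording of the message are made up for illustration; neither comes from the patch or from the existing shell function library. The optional argument exists only so the check can be exercised without changing the host OS.

```bash
# Hypothetical guard: returns 0 on Linux; on anything else it prints a
# single clear message to stderr and returns 1.
hadoop_shell_check_os() {
  # Optional first argument overrides the detected OS (handy for testing).
  local os="${1:-$(uname -s)}"

  if [[ "${os}" != "Linux" ]]; then
    echo "ERROR: hadoop shell is only supported on Linux (detected: ${os})." 1>&2
    return 1
  fi
  return 0
}
```

The script would call something like "hadoop_shell_check_os || exit 1" before it touches anything Linux-specific, so other platforms see one line instead of a cascade of errors.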

* This looks like it has a pretty massive security hole.  Anyone writing to the fifo (e.g.,
anyone with root) will be able to execute commands as the person who opened it.  To me, this
is pretty much an instant -1.

* Use "${BASH_SOURCE-$0}" coupled with a bash regex here to cut the extra fork and to work
when executed directly with bash -x:

{code}
+# if this file is executed, start the shell
+if [[ $(basename $0) == "hadoop-shell.sh" ]]; then
{code}
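
One way the check above might be rewritten, as a rough sketch rather than the required form. The helper name hadoop_shell_is_script is hypothetical. The regex is kept in a variable so that bash treats it as a pattern and not as a quoted literal.

```bash
# Hypothetical helper: succeeds when the given path names hadoop-shell.sh.
# The bash regex replaces $(basename ...), so no subprocess is forked.
hadoop_shell_is_script() {
  local this="$1"
  # Match "hadoop-shell.sh" at the start of the path or right after a "/".
  local re='(^|/)hadoop-shell\.sh$'

  [[ "${this}" =~ ${re} ]]
}

# Intended use inside the script itself; ${BASH_SOURCE-$0} falls back to
# $0 when BASH_SOURCE is unset:
#
#   if hadoop_shell_is_script "${BASH_SOURCE-$0}"; then
#     ... start the shell ...
#   fi
```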

* Instead of using "which", use "command -v" here:
{code}
+  if [[ -z $(which hadoop) ]]; then
{code}
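
A sketch of the same test without "which". The helper name hadoop_shell_have_command is an assumption for illustration only.

```bash
# Hypothetical helper: succeeds if the named command can be resolved.
# "command -v" is a shell builtin, so nothing is forked and the result
# does not depend on how a particular "which" binary behaves.
hadoop_shell_have_command() {
  command -v "$1" >/dev/null 2>&1
}
```

The patch line would then read along the lines of "if ! hadoop_shell_have_command hadoop; then". Note that "command -v" also reports shell functions, builtins and aliases, not just files on PATH.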

* I don't think there is any guarantee that HADOOP_PREFIX has been defined at this point or
that it even points to the correct hadoop command. (There are a lot of reasons why, too many
to go into here.)

{code}
 +    export PATH=${HADOOP_PREFIX}/bin:${PATH}
{code}
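
As a partial illustration only: a defensive version of that line might at least refuse to touch PATH when HADOOP_PREFIX is unset or has no hadoop command under it. The function name is hypothetical, and this does nothing about the case where HADOOP_PREFIX is set but refers to a different install than the one the user expects.

```bash
# Hypothetical helper: prepend ${HADOOP_PREFIX}/bin to PATH only when
# HADOOP_PREFIX is set and contains an executable hadoop command.
hadoop_shell_add_prefix_to_path() {
  if [[ -z "${HADOOP_PREFIX:-}" ]]; then
    echo "ERROR: HADOOP_PREFIX is not set." 1>&2
    return 1
  fi

  if [[ ! -x "${HADOOP_PREFIX}/bin/hadoop" ]]; then
    echo "ERROR: ${HADOOP_PREFIX}/bin/hadoop is not executable." 1>&2
    return 1
  fi

  export PATH="${HADOOP_PREFIX}/bin:${PATH}"
}
```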




> Bash environment for quick command operations
> ---------------------------------------------
>
>                 Key: HADOOP-12830
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12830
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: bin
>            Reporter: Kazuho Fujii
>            Assignee: Kazuho Fujii
>         Attachments: HADOOP-12830.001.patch
>
>
> Hadoop file system shell commands are slow. This issue is about building a shell environment
> for quick command operations.
> An interactive shell was previously attempted in HADOOP-6541, but it seems to be poor because
> users are used to powerful shells like bash. This issue is not about creating a new shell, but
> just opening a new bash process, so users can run commands as before.
> {code}
> fjk@x240:~/hadoop-2.7.2$ ./bin/hadoop shell
> fjk@x240 hadoop> hadoop fs -ls /
> Found 2 items
> -rw-r--r--   3 fjk supergroup          0 2016-02-21 00:26 /file1
> -rw-r--r--   3 fjk supergroup          0 2016-02-21 00:26 /file2
> {code}
> The shell has a mini daemon process that lives until the shell is closed. The hadoop fs
> command delegates the operation to the daemon, and the two communicate through named pipes.
> The daemon performs the operation and returns the result to the command.
> In this shell, hadoop fs commands become quick. In a local environment, the "hadoop fs -ls"
> command is about 100 times faster than the normal command.
> {code}
> fjk@x240 hadoop> time hadoop fs -ls hdfs://localhost:8020/ > /dev/null
> real	0m0.021s
> user	0m0.003s
> sys	0m0.011s
> {code}
> With bash's completion feature, commands and file names are completed automatically.
> {code}
> fjk@x240 hadoop> hadoop fs -ch<TAB><TAB>
> -checksum  -chgrp     -chmod     -chown
> fjk@x240 hadoop> hadoop fs -ls /file<TAB><TAB>
> /file1  /file2  /file3
> {code}
> Additionally, we can provide equivalents of bash built-in commands, e.g., cd and umask. They
> work in this shell because the daemon remembers the state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
