hadoop-common-issues mailing list archives

From "shanyu zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10245) Hadoop command line always appends "-Xmx" option twice
Date Tue, 21 Jan 2014 22:02:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877905#comment-13877905 ]

shanyu zhao commented on HADOOP-10245:

[~ywskycn] Thank you for your comment! If we remove -Xmx512m from HADOOP_CLIENT_OPTS in hadoop-env.cmd,
there will be exactly one -Xmx, which is the $JAVA_HEAP_MAX in bin/hadoop.

HADOOP-9870 may have solved the problem for you, but I think the fix in HADOOP-9870 might
be too complicated and hard to maintain. For example, what if a user puts "-Xmx" in HADOOP_OPTS
instead of HADOOP_CLIENT_OPTS? I think we should avoid using HADOOP_CLIENT_OPTS or HADOOP_OPTS
to specify memory, because defining HADOOP_HEAPSIZE but not using it for memory specification
is confusing. If you want to change the heap size, just change HADOOP_HEAPSIZE;
I think this is simple and clear. Thoughts?
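
To illustrate the proposal, here is a minimal sketch in which HADOOP_HEAPSIZE is the only source of the heap flag and any "-Xmx" found in HADOOP_OPTS or HADOOP_CLIENT_OPTS is merely reported. The helpers heap_flag and has_xmx are hypothetical names used for illustration, not functions from the Hadoop scripts, and the option values are made up; the 1000 MB default is the one mentioned in the issue description.

```shell
#!/bin/sh
# Sketch only: heap_flag and has_xmx are hypothetical helpers, not part of Hadoop.

# Derive the single heap flag from a heap size in MB, defaulting to 1000.
heap_flag() {
  echo "-Xmx${1:-1000}m"
}

# Succeed if the given options string carries its own -Xmx flag.
has_xmx() {
  case " $1" in
    *" -Xmx"*) return 0 ;;
    *) return 1 ;;
  esac
}

HADOOP_HEAPSIZE=2000
HADOOP_OPTS="-Dhadoop.log.file=hadoop.log"   # illustrative value
HADOOP_CLIENT_OPTS=""                        # no -Xmx512m here

# Report any memory setting that bypasses HADOOP_HEAPSIZE.
for opts in "$HADOOP_OPTS" "$HADOOP_CLIENT_OPTS"; do
  if has_xmx "$opts"; then
    echo "warning: -Xmx set outside HADOOP_HEAPSIZE: $opts" >&2
  fi
done

# Only print the command line; nothing is launched.
echo "java $(heap_flag "$HADOOP_HEAPSIZE") $HADOOP_OPTS $HADOOP_CLIENT_OPTS -classpath ..."
```

With this arrangement the printed command line contains a single "-Xmx" flag no matter which of the two option variables a user edits, as long as neither carries its own.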

> Hadoop command line always appends "-Xmx" option twice
> ------------------------------------------------------
>                 Key: HADOOP-10245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10245
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: bin
>    Affects Versions: 2.2.0
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>         Attachments: HADOOP-10245.patch
> The Hadoop command line scripts (hadoop.sh or hadoop.cmd) call java with the "-Xmx" option twice. The impact is that any user-defined HADOOP_HEAPSIZE env variable has no effect, because it is overridden by the second "-Xmx" option.
> For example, here is the java command generated for "hadoop fs -ls /". Notice that there are two "-Xmx" options, "-Xmx1000m" and "-Xmx512m", in the command line:
> java -Xmx1000m  -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log -Dhadoop.root.logger=INFO,console,DRFA -Xmx512m  -Dhadoop.security.logger=INFO,RFAS -classpath XXX org.apache.hadoop.fs.FsShell -ls /
> Here is the root cause:
> The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls hadoop-env.sh.
> In hadoop.sh, the command line is generated by the following pseudo code:
> java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ...
> In hadoop-config.sh, $JAVA_HEAP_MAX is initialized as "-Xmx1000m" if the user didn't set the $HADOOP_HEAPSIZE env variable.
> In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set as this:
> To fix this problem, we should remove the "-Xmx512m" from HADOOP_CLIENT_OPTS. If we really want to change the memory settings, we need to use the $HADOOP_HEAPSIZE env variable.
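
The root cause described in the quoted report can be reproduced with a small sketch. The functions build_java_cmd and effective_xmx are hypothetical stand-ins for the script logic, not code from the Hadoop scripts. The sketch assumes, as the report states, that the later "-Xmx" flag is the one that takes effect.

```shell
#!/bin/sh
# Hypothetical stand-ins for the script logic described in the report.
# Nothing here launches java; the functions only build and inspect strings.

# Mimics hadoop-config.sh plus hadoop.sh: $1 plays the user's heap size in MB
# (may be empty) and $2 plays the value hadoop-env.sh gives HADOOP_CLIENT_OPTS.
build_java_cmd() {
  heap_max="-Xmx1000m"
  if [ -n "$1" ]; then
    heap_max="-Xmx${1}m"
  fi
  echo "java $heap_max $2 -classpath XXX org.apache.hadoop.fs.FsShell -ls /"
}

# Prints the last -Xmx flag in a command string, i.e. the one that the report
# says takes effect.
effective_xmx() {
  last=""
  for word in $1; do
    case "$word" in
      -Xmx*) last="$word" ;;
    esac
  done
  echo "$last"
}

buggy=$(build_java_cmd 2000 "-Xmx512m")   # client opts still carry -Xmx512m
fixed=$(build_java_cmd 2000 "")           # -Xmx512m removed from client opts

echo "before: $(effective_xmx "$buggy")"  # prints "before: -Xmx512m"
echo "after:  $(effective_xmx "$fixed")"  # prints "after:  -Xmx2000m"
```

In the "before" case the user's 2000 MB setting is silently replaced by 512 MB; once the hard-coded flag is removed, the user's setting is the only one left.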

This message was sent by Atlassian JIRA