hadoop-mapreduce-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
Date Mon, 11 Jun 2012 19:41:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293011#comment-13293011 ]

Ivan Mitic commented on MAPREDUCE-4322:

bq. TaskLog.java - Any special reasons to perform the command line length check multiple times
instead of once at the end of buildCommandLine()?
taskjvm.cmd contains multiple lines that we want to execute, and I am checking the length
of each line separately. An example taskjvm.cmd looks like this:
set SHELL="cmd"...
C:\...\jre\bin\java ...
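
To illustrate the per-line check described above, here is a minimal sketch. It is not the
code from the patch: the class and method names are hypothetical, and the 8191 limit is an
assumption based on the documented cmd.exe command-line maximum, so it is passed in as a
parameter rather than hard-wired.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not the actual TaskLog code from the patch. */
final class CommandLineLengthCheck {
    // Assumed limit: cmd.exe documents a maximum command-line length of 8191 characters.
    static final int MAX_CMD_LINE_LENGTH = 8191;

    /** Returns the index of the first line longer than limit, or -1 if every line fits. */
    static int firstTooLongLine(List<String> lines, int limit) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).length() > limit) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Build an artificially long classpath to stand in for a real one.
        char[] filler = new char[9000];
        Arrays.fill(filler, 'x');
        List<String> script = Arrays.asList(
            "set SHELL=\"cmd\"",
            "java -classpath " + new String(filler) + " Child");

        int bad = firstTooLongLine(script, MAX_CMD_LINE_LENGTH);
        if (bad >= 0) {
            System.out.println("line " + bad + " exceeds " + MAX_CMD_LINE_LENGTH + " characters");
        } else {
            System.out.println("all lines fit");
        }
    }
}
```

Checking each line rather than the total script length matters because cmd applies the limit
per command line, so a short script can still fail on one long line.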

bq. The advantage with the -classpath argument was isolation of the classpath to the specific
spawned JVM. But by changing the classpath env var we risk changing it for every spawned process
too. Maybe that's not much of a problem.
I thought of this as well. As we are starting a separate bash/cmd for every task, this will
only apply to that task.
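
As a rough illustration of why the change stays local, the sketch below uses Java's
ProcessBuilder, whose environment map is a private copy for the process it will start. This
is only an analogy for a {{set CLASSPATH=...}} line inside a per-task cmd script, not the
TaskTracker code; the class name, method name, and paths are hypothetical.

```java
/** Hypothetical demo; not Hadoop code. */
final class ChildEnvIsolation {
    /** Builds a launcher whose child process would see the given CLASSPATH. */
    static ProcessBuilder withClasspath(String classpath, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        // environment() is a copy of the parent's environment that belongs to this builder,
        // so this put() does not touch the current JVM's own environment.
        pb.environment().put("CLASSPATH", classpath);
        return pb;
    }

    public static void main(String[] args) {
        // The process is never started here; we only inspect the two environments.
        ProcessBuilder pb = withClasspath(
            "C:\\hypothetical\\a.jar;C:\\hypothetical\\b.jar",
            "cmd", "/c", "taskjvm.cmd");
        System.out.println("child CLASSPATH:  " + pb.environment().get("CLASSPATH"));
        System.out.println("parent CLASSPATH: " + System.getenv("CLASSPATH"));
    }
}
```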

bq. What if CLASSPATH is already set on the machine? Will this append to it or override it?
From the code it looks like generating the classpath list will pick up the parent classpath.
So if CLASSPATH env var is already set then it will be part of classpath list via the parent
jvm (TaskTracker jvm). So even if the taskjvm.cmd sets the CLASSPATH it will be a superset
of any existing CLASSPATH env var. Can you please verify this by having a pre-existing CLASSPATH?
Thanks, I just checked, and we do not include the system-level CLASSPATH. However, the setting
itself seems to be exclusive: if you pass the classpath via {{-classpath}}, the CLASSPATH environment
variable is ignored. I just tested this with a sample app that prints {{System.getProperty("java.class.path")}}.
It generally makes sense to be specific in this case and not include the system setting,
as this can cause problems with resolution. Also, there are ways Hadoop users can
specify custom classpaths if needed. Agree?
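
For anyone who wants to repeat the check, a sample app of the kind mentioned above could look
like the following. This is a sketch, not the exact program used for the test; the class and
method names are hypothetical.

```java
/** Hypothetical sample app that reports the classpath the JVM is actually using. */
final class PrintClasspath {
    /** Returns the value of the java.class.path system property. */
    static String effectiveClasspath() {
        return System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        System.out.println(effectiveClasspath());
    }
}
```

Running it once with only the CLASSPATH environment variable set, and once with both CLASSPATH
and {{-classpath}}, should show whether the environment value still appears when the option
is given.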
> Fix command-line length abort issues on Windows
> -----------------------------------------------
>                 Key: MAPREDUCE-4322
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>         Environment: Windows, downstream applications with long aggregate classpaths
>            Reporter: John Gordon
>            Assignee: Ivan Mitic
>         Attachments: MAPREDUCE-4322-branch-1-win.patch
>   Original Estimate: 12h
>  Remaining Estimate: 12h
> When a task is started on the tasktracker, it creates a small batch file to invoke java
and runs that batch.  Within the batch file, the invocation of Java currently has -classpath
${CLASSPATH} inline to the command.  That line often exceeds 8000 characters.  This is OK
for most Linux distributions because the line limit env variable is often set much higher
than this.  However, on Windows this causes cmd to abort execution.  This surfaces in Hadoop
as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the -classpath option
into a config file to take the longest variable part of the line and put it somewhere that
scales better.


