hadoop-mapreduce-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
Date Thu, 07 Jun 2012 00:07:23 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Mitic updated MAPREDUCE-4322:
----------------------------------

    Attachment: MAPREDUCE-4322-branch-1-win.patch

Attaching the patch.

The fix is to move the classpath into an environment variable instead of passing it on the
command line via "java -classpath".
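
To illustrate the idea, here is a minimal sketch of launching a child JVM with the classpath
in the CLASSPATH environment variable. It is not the code from the attached patch: the class
name ClasspathViaEnv, the method buildChildCommand, and the use of ProcessBuilder are all
assumptions made for illustration (the tasktracker itself writes a batch file). The sketch
relies on the java launcher reading CLASSPATH when no -classpath option is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not taken from the attached patch.
final class ClasspathViaEnv {

    // Builds the child JVM launch so the classpath travels in the CLASSPATH
    // environment variable and never appears on the command line.
    static ProcessBuilder buildChildCommand(String javaBin, String classpath,
                                            String mainClass, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add(mainClass);
        command.addAll(args);

        ProcessBuilder pb = new ProcessBuilder(command);
        // The java launcher falls back to CLASSPATH when -classpath/-cp is absent.
        pb.environment().put("CLASSPATH", classpath);
        return pb;
    }

    public static void main(String[] argv) {
        ProcessBuilder pb = buildChildCommand("java", "lib/a.jar;lib/b.jar",
                "org.example.Child", Arrays.asList("arg1", "arg2"));
        System.out.println("command:   " + String.join(" ", pb.command()));
        System.out.println("CLASSPATH: " + pb.environment().get("CLASSPATH"));
    }
}
```

The command line now carries only the launcher, the main class, and the task arguments, so
its length no longer grows with the number of jars on the classpath.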

In some test cases we've seen the command-line length go slightly above 8192 characters, which
is the Windows command-line limit. About 4k of that is the classpath, and the rest is other
command-line arguments. By separating out the classpath we now have plenty of room for the other args.


The patch also introduces a check on the command length before the command is executed, and
surfaces a clear error message if the length exceeds the limit. Otherwise, we would only see that
the child task exited with a non-zero code, and we would have no context on the reason for the failure.
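
A length check of this kind could look roughly like the sketch below. The names
CommandLengthCheck and checkCommandLength are hypothetical and not from the patch. The limit
constant is also an assumption: Microsoft's documentation gives 8191 characters for cmd.exe,
which is the value used here. Joining the arguments with single spaces only approximates the
final line, since any quoting or escaping would add characters.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not taken from the attached patch.
final class CommandLengthCheck {

    // Assumed cmd.exe limit; the value is an assumption of this sketch.
    static final int WINDOWS_MAX_COMMAND_LENGTH = 8191;

    // Fails early with a descriptive message instead of letting cmd abort
    // and leaving only a non-zero exit code behind.
    static void checkCommandLength(List<String> command) throws IOException {
        int length = String.join(" ", command).length();
        if (length > WINDOWS_MAX_COMMAND_LENGTH) {
            throw new IOException("Command line is " + length
                    + " characters long, which exceeds the limit of "
                    + WINDOWS_MAX_COMMAND_LENGTH + " characters");
        }
    }

    public static void main(String[] argv) {
        try {
            checkCommandLength(Arrays.asList("java", "org.example.Child", "arg1"));
            System.out.println("command length OK");
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Calling the check right before the process is spawned lets the error message reach the task
logs along with the actual length that was rejected.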
                
> Fix command-line length abort issues on Windows
> -----------------------------------------------
>
>                 Key: MAPREDUCE-4322
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>         Environment: Windows, downstream applications with long aggregate classpaths
>            Reporter: John Gordon
>            Assignee: Ivan Mitic
>         Attachments: MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to invoke Java
and runs that batch file.  Within the batch file, the invocation of Java currently has -classpath
${CLASSPATH} inline in the command.  That line often exceeds 8000 characters.  This is OK
on most Linux distributions because the line length limit is often set much higher
than this.  On Windows, however, this causes cmd to abort execution.  This surfaces in Hadoop
as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the -classpath option
into a config file to take the longest variable part of the line and put it somewhere that
scales better.


        
