hadoop-mapreduce-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
Date Tue, 26 Jun 2012 21:42:44 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401709#comment-13401709
] 

Bikas Saha commented on MAPREDUCE-4322:
---------------------------------------

1)TaskLog.java
{code}
if (s.length() > MAX_CMD_LINE_LENGTH) {
   throw new IOException("Command line length exceeds the OS limit " +
                         MAX_CMD_LINE_LENGTH);
}
{code}
Can you add the offending command to the exception message? That would make it easier to
debug which command is too long, and it also helps with the next comment.
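
A minimal sketch of what such a message could look like, written as a standalone helper rather than as the actual TaskLog code. The class name CommandLengthCheck, the method checkLength, and the limit of 8000 are assumptions for illustration only; the real constant is defined in TaskLog.

```java
import java.io.IOException;

// Hypothetical stand-in for the length check in TaskLog.
final class CommandLengthCheck {
  // Assumed value for illustration; the real limit lives in TaskLog.
  static final int MAX_CMD_LINE_LENGTH = 8000;

  // Throws if the line is too long, and reports both the actual length
  // and the command itself so the failing line can be identified.
  static void checkLength(String s) throws IOException {
    if (s.length() > MAX_CMD_LINE_LENGTH) {
      throw new IOException("Command line length " + s.length()
          + " exceeds the OS limit " + MAX_CMD_LINE_LENGTH
          + ". Command: " + s);
    }
  }
}
```

With the command in the message, a test can tell which of several commands triggered the failure by inspecting the exception text.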

2)TestTaskLog.java
In the test, the two calls to captureOutAndError look identical at first glance. Verifying
that setup failed in the first case and cmd failed in the second case would make the
difference between them clear.

3)TestTaskLog.java
It would be good to use TaskLog.MAX_CMD_LINE_LENGTH in the test, so that the test still
covers the limit if we change it.

4)TestTaskLog.java
Why not call buildCommandLine(), the function we are actually testing, directly instead of
going through captureOutAndError()? buildCommandLine() should be visible to the test because
the test would be in the same package.
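
Comments 2 to 4 could be combined roughly as sketched below. This is a self-contained illustration, not the real TaskLog or TestTaskLog: CommandLineBuilderSketch, its simplified two-argument buildCommandLine, the helpers repeat and failsWith, and the limit of 8000 are all hypothetical. The point is that the test input is derived from the constant and that the exception says whether setup or cmd was too long.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for TaskLog.buildCommandLine().
final class CommandLineBuilderSketch {
  // Assumed value for illustration; the real limit lives in TaskLog.
  static final int MAX_CMD_LINE_LENGTH = 8000;

  // Each setup entry becomes its own line; cmd is joined into one line.
  static String buildCommandLine(List<String> setup, List<String> cmd)
      throws IOException {
    StringBuilder merged = new StringBuilder();
    for (String line : setup) {
      check("setup", line);
      merged.append(line).append("\n");
    }
    String joined = String.join(" ", cmd);
    check("cmd", joined);
    return merged.append(joined).toString();
  }

  // The message starts with the kind of line that failed.
  private static void check(String kind, String line) throws IOException {
    if (line.length() > MAX_CMD_LINE_LENGTH) {
      throw new IOException(kind + " command line length " + line.length()
          + " exceeds the OS limit " + MAX_CMD_LINE_LENGTH);
    }
  }

  // Builds a string of n copies of c, so test input tracks the constant.
  static String repeat(char c, int n) {
    char[] chars = new char[n];
    Arrays.fill(chars, c);
    return new String(chars);
  }

  // True only if building fails and the failure is attributed to 'kind'.
  static boolean failsWith(String kind, List<String> setup, List<String> cmd) {
    try {
      buildCommandLine(setup, cmd);
      return false;
    } catch (IOException e) {
      return e.getMessage().startsWith(kind);
    }
  }
}
```

A test would then build an input of MAX_CMD_LINE_LENGTH + 1 characters with repeat, pass it once as setup and once as cmd, and check failsWith("setup", ...) in the first case and failsWith("cmd", ...) in the second.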

5)Would it be possible to refactor TaskLog.buildCommandLine() to reduce the number of Shell.WINDOWS
branches? The code is getting hard to understand and error-prone. For example, the following code
adds a new command line (exec setsid) to the script, but its length gets counted together with
the length of the actual cmd in the final MAX_CMD_LINE_LENGTH check. That only happens on Linux,
where it does not matter, but it makes the code harder to read and technically incorrect.
{code}
    if (tailLength > 0) {
      mergedCmd.append("(");
    } else if (ProcessTree.isSetsidAvailable && useSetSid 
        && !Shell.WINDOWS) {
      mergedCmd.append("exec setsid "); // <=== this is a new command line
    } else {
      if (!Shell.WINDOWS)
        mergedCmd.append("exec ");
    }
    // ...
    // add real cmd line
    // ...
    if (mergedCmd.length() - prevLength > MAX_CMD_LINE_LENGTH) {
      throw new IOException("Command line length exceeds the OS limit "
                            + MAX_CMD_LINE_LENGTH);
    }
{code}
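
One possible shape for such a refactoring, as a sketch only: the platform decision is isolated in a single helper, and the length check looks at the real command before any prefix is added. CommandPrefixSketch, prefix, buildLine, the boolean parameters, and the limit of 8000 are hypothetical and do not reflect the actual TaskLog signatures; the tail handling that closes the opening parenthesis is omitted.

```java
import java.io.IOException;

// Hypothetical refactoring sketch; not the actual TaskLog code.
final class CommandPrefixSketch {
  // Assumed value for illustration; the real limit lives in TaskLog.
  static final int MAX_CMD_LINE_LENGTH = 8000;

  // Single place that decides what precedes the real command.
  // Mirrors the branches in the snippet above.
  static String prefix(boolean windows, boolean tailing, boolean setsidUsable) {
    if (tailing) {
      return "(";
    }
    if (windows) {
      return "";
    }
    return setsidUsable ? "exec setsid " : "exec ";
  }

  // Checks the length of the real command only, then adds the prefix.
  static String buildLine(String cmd, boolean windows, boolean tailing,
                          boolean setsidUsable) throws IOException {
    if (cmd.length() > MAX_CMD_LINE_LENGTH) {
      throw new IOException("Command line length " + cmd.length()
          + " exceeds the OS limit " + MAX_CMD_LINE_LENGTH);
    }
    return prefix(windows, tailing, setsidUsable) + cmd;
  }
}
```

Because the prefix is added after the check, a command that is exactly at the limit no longer fails just because "exec setsid " was prepended.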

                
> Fix command-line length abort issues on Windows
> -----------------------------------------------
>
>                 Key: MAPREDUCE-4322
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>         Environment: Windows, downstream applications with long aggregate classpaths
>            Reporter: John Gordon
>            Assignee: Ivan Mitic
>         Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to invoke java
and runs that batch.  Within the batch file, the invocation of Java currently has -classpath
${CLASSPATH} inline to the command.  That line often exceeds 8000 characters.  This is ok
for most linux distributions because the line limit env variable is often set much higher
than this.  However, on Windows this causes cmd to abort execution.  This surfaces in Hadoop
as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the -classpath option
into a config file to take the longest variable part of the line and put it somewhere that
scales better.


        
