hadoop-common-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5198) NPE in Shell.runCommand()
Date Fri, 13 Feb 2009 09:13:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5198:
--------------------------------------------

    Attachment: patch-5198.txt

Attaching a patch that adds a null check for the pid before passing it to kill-process.

The NPE occurs when the jvmIdToRunner map still contains the runner but the pid file has
already been cleaned up. The scenario: successful tasks (reducers or cleanup attempts) clean
up their files as early as possible, and the jvmIdToRunner entry is deleted in updateOnJvmExit.
If a LaunchTaskAction arrives before the updateOnJvmExit call, reapJvm for the new task still
finds the jvmRunner and tries to kill it, which throws an NPE because the pid is null.
So one solution is for reapJvm to skip killing the process when the pid is null, because in
every such case the task has already reported done and the JVM is on its way to exit.
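
A minimal sketch of the kind of guard described above, not the contents of patch-5198.txt.
The class and method names (ReapJvmSketch, killIfPidKnown, destroyProcessTree) are
hypothetical stand-ins, and the "kill" only records the pid instead of signalling a process:

```java
import java.util.ArrayList;
import java.util.List;

final class ReapJvmSketch {
    // Hypothetical stand-in for the real process-tree kill: records pids only.
    static final List<String> killed = new ArrayList<>();

    static void destroyProcessTree(String pid) {
        killed.add(pid);
    }

    // Returns true if a kill was attempted, false if it was skipped.
    static boolean killIfPidKnown(String pid) {
        if (pid == null) {
            // Assumption from the comment above: a null pid means the pid file
            // is already cleaned up, the task has reported done, and the JVM
            // is exiting on its own, so there is nothing to kill.
            return false;
        }
        destroyProcessTree(pid);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(killIfPidKnown(null));    // prints "false"
        System.out.println(killIfPidKnown("12345")); // prints "true"
    }
}
```

With a guard like this the null pid never reaches the shell command, so the reap path
for the new task simply moves on.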

Thoughts?

> NPE in Shell.runCommand()
> -------------------------
>
>                 Key: HADOOP-5198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5198
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred, util
>    Affects Versions: 0.21.0
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-5198.txt
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> I have seen a task fail with the following exception:
> java.lang.NullPointerException
> 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:441)
> 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
> 	at org.apache.hadoop.util.Shell.run(Shell.java:134)
> 	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:286)
> 	at org.apache.hadoop.util.ProcessTree.isAlive(ProcessTree.java:244)
> 	at org.apache.hadoop.util.ProcessTree.sigKillInCurrentThread(ProcessTree.java:67)
> 	at org.apache.hadoop.util.ProcessTree.sigKill(ProcessTree.java:115)
> 	at org.apache.hadoop.util.ProcessTree.destroyProcessGroup(ProcessTree.java:164)
> 	at org.apache.hadoop.util.ProcessTree.destroy(ProcessTree.java:180)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:377)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:249)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:113)
> 	at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:76)
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:411)
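
For context on the top frame of the trace: java.lang.ProcessBuilder.start() is documented
to throw NullPointerException when an element of the command list is null, which is what
a null pid argument would produce. A small standalone sketch follows; the class name and
the "kill -0" command line are illustrative assumptions, not copied from ProcessTree:

```java
import java.io.IOException;

final class NullArgDemo {
    // Returns true if starting a command with this pid argument throws an NPE.
    static boolean startThrowsNpe(String pid) {
        try {
            // Illustrative command only; a null element makes start() fail
            // before any process is launched.
            new ProcessBuilder("kill", "-0", pid).start();
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            // The command could not be started for some other reason.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(startThrowsNpe(null)); // prints "true"
    }
}
```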

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

