hadoop-mapreduce-issues mailing list archives

From "Hitesh Shah (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-3240) NM should send a SIGKILL for completed containers also
Date Sun, 23 Oct 2011 02:19:32 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated MAPREDUCE-3240:
-----------------------------------

    Attachment: MR-3240.wip.patch

The patch does the following:

- Sends a SIGTERM followed by a SIGKILL when cleaning up a container.
  - Adds new config settings for the delay between the SIGTERM and the SIGKILL.

- Introduces activeContainers within the ContainerExecutor. The launcher uses it to track whether
a container should be launched. If cleanup is called before the process starts, this
flag ensures that the process is never started. This addresses the race-kill issue in MR-3084.

- Getting the pid after the shell executor has completed is unreliable, so task.sh now writes
the pid into a local file, which the containerlauncher can read and use to kill the
process.
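
The first two points above can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patch: the class name ContainerCleanupSketch, the Signaler interface, and the sigkillDelayMs parameter are all hypothetical, and the real config key for the delay is not shown here. The sketch also assumes SIGKILL is sent unconditionally once the delay has elapsed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the Hadoop code base.
final class ContainerCleanupSketch {

    /** Assumed abstraction over "send signal X to pid Y" (e.g. a wrapper around kill). */
    @FunctionalInterface
    interface Signaler {
        void signal(String pid, String signalName);
    }

    // Containers that are still allowed to launch or keep running.
    private final Set<String> activeContainers = ConcurrentHashMap.newKeySet();
    private final Signaler signaler;
    private final long sigkillDelayMs; // assumed to come from a config setting

    ContainerCleanupSketch(Signaler signaler, long sigkillDelayMs) {
        this.signaler = signaler;
        this.sigkillDelayMs = sigkillDelayMs;
    }

    /** Marks a container as active before the launcher tries to start it. */
    void register(String containerId) {
        activeContainers.add(containerId);
    }

    /** The launcher checks this just before starting the process. */
    boolean shouldLaunch(String containerId) {
        return activeContainers.contains(containerId);
    }

    /**
     * Clears the active flag first, so a launch that has not happened yet is
     * skipped, then sends SIGTERM, waits, and sends SIGKILL.
     * A null pid means the process was never started, so nothing is signalled.
     */
    void cleanup(String containerId, String pid) {
        activeContainers.remove(containerId);
        if (pid == null) {
            return;
        }
        signaler.signal(pid, "TERM");
        try {
            Thread.sleep(sigkillDelayMs);
        } catch (InterruptedException e) {
            // Preserve the interrupt and still fall through to SIGKILL.
            Thread.currentThread().interrupt();
        }
        signaler.signal(pid, "KILL");
    }
}
```

Note that this sketch does not make the "check flag, then exec" step atomic; a real implementation would need to close that window as well.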
 


                
> NM should send a SIGKILL for completed containers also
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-3240
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3240
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Hitesh Shah
>         Attachments: MR-3240.wip.patch
>
>
> This is to address containers that exit properly after spawning sub-processes themselves.
> We don't want to leave these sub-process trees behind, or they can pillage the NM's resources.
> Today, we already have code to send SIGKILL to the whole process tree (because of the single
> sessionId resulting from setsid) when the container is alive. We need to obtain the PID of
> each container when it starts and use that PID to send the signal in the completed-containers
> case also.
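
As a rough illustration of signalling a whole process tree through its session leader: a process started with setsid leads its own process group, so a negative pid passed to kill addresses every process in that group. The sketch below assumes the pid is read from a pid file such as the one task.sh writes, and that the kill is issued through the bash builtin; the class and method names are hypothetical and this is not the patch's implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of the Hadoop code base.
final class ProcessGroupKillSketch {
    private ProcessGroupKillSketch() {}

    /** Validates pid-file contents; rejects anything that is not a pid greater than 1. */
    static String parsePid(String fileContents) {
        String pid = fileContents.trim();
        // Bounded digit count keeps Long.parseLong from overflowing.
        if (!pid.matches("[1-9][0-9]{0,9}") || Long.parseLong(pid) <= 1) {
            // "-0" and "-1" have special meanings for kill, so refuse them.
            throw new IllegalArgumentException("not a usable pid: " + pid);
        }
        return pid;
    }

    /** Reads the pid written by the launch script (assumed: one pid, plain text). */
    static String readPidFile(Path pidFile) {
        try {
            return parsePid(new String(Files.readAllBytes(pidFile), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Builds a command that signals the whole process group led by pid.
     * The leading '-' before the pid selects the group; "--" stops option parsing.
     */
    static String[] killSessionCommand(String signalName, String pid) {
        if (!signalName.matches("[A-Z]+")) {
            throw new IllegalArgumentException("unexpected signal name: " + signalName);
        }
        return new String[] {"bash", "-c", "kill -" + signalName + " -- -" + parsePid(pid)};
    }
}
```

The resulting array could be handed to a ProcessBuilder. Because the container's own process may already have exited, the command targets the group rather than the single pid.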

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
