hadoop-mapreduce-issues mailing list archives

From "Devaraj K (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-3034) NM should act on a REBOOT command from RM
Date Thu, 09 Feb 2012 11:28:59 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-3034:
---------------------------------

    Target Version/s: 0.23.0, 0.24.0  (was: 0.24.0, 0.23.0)
              Status: Patch Available  (was: Open)

Thanks a lot Arun for looking into the patch.

bq. 1. set/get on isRebooted needs to be synchronized

I updated the patch with this change.
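
For readers following along, a minimal sketch of what synchronizing the set/get on isRebooted could look like. The class and method names here are hypothetical and are not taken from the attached patch; only the field name isRebooted comes from the review comment.

```java
// Hypothetical sketch, not the code from the MAPREDUCE-3034 patch.
// It only illustrates guarding a boolean flag with synchronized accessors.
class RebootState {
    // Written by the thread that handles the RM response,
    // read by whichever thread decides whether to reinitialize.
    private boolean isRebooted = false;

    // Both accessors lock on the same monitor (this), so a value written
    // by one thread is visible to a later read on another thread.
    synchronized void setRebooted(boolean rebooted) {
        this.isRebooted = rebooted;
    }

    synchronized boolean isRebooted() {
        return isRebooted;
    }
}
```

Declaring the field volatile would likely give the same visibility guarantee for a single flag; synchronized accessors are shown here only because that is what the review comment asked for.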

bq. 2. On reboot, should we kill existing containers, if any?

I tested the patch both with and without containers running in the NM. If any containers
are running, they are all stopped as part of the NM service stop.

Thank you so much Eric for verifying and describing all the cases.
                
> NM should act on a REBOOT command from RM
> -----------------------------------------
>
>                 Key: MAPREDUCE-3034
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3034
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Devaraj K
>            Priority: Critical
>         Attachments: MAPREDUCE-3034-1.patch, MAPREDUCE-3034-2.patch, MAPREDUCE-3034-3.patch,
>                      MAPREDUCE-3034-4.patch, MAPREDUCE-3034.patch, MR-3034.txt
>
>
> RM sends a reboot command to NM in some cases, like when it gets lost and rejoins.
> In such a case, NM should act on the command and reboot/reinitialize itself.
> This is akin to TT reinitializing on order from JT. We will need to shut down all the services
> properly and reinitialize; this should automatically take care of killing containers,
> cleaning up local temporary files, etc.
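
The flow described in the issue (stop all services, which kills running containers, then reinitialize) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: SketchService, ContainerManagerSketch, NodeManagerSketch, and the Action enum are all hypothetical names and do not correspond to the real YARN classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a service with a start/stop lifecycle.
interface SketchService {
    void start();
    void stop();
}

// Hypothetical container manager: stopping it kills whatever is running.
class ContainerManagerSketch implements SketchService {
    private final List<String> running = new ArrayList<>();
    private final List<String> killed = new ArrayList<>();

    public synchronized void start() { running.clear(); }

    public synchronized void launch(String containerId) { running.add(containerId); }

    // Service stop doubles as container cleanup.
    public synchronized void stop() {
        killed.addAll(running);
        running.clear();
    }

    public synchronized int runningCount() { return running.size(); }

    public synchronized List<String> killedContainers() { return new ArrayList<>(killed); }
}

// Hypothetical node manager that reacts to a REBOOT action from the RM.
class NodeManagerSketch {
    enum Action { NORMAL, REBOOT }

    private final List<SketchService> services = new ArrayList<>();
    private int startCount = 0;

    void addService(SketchService service) { services.add(service); }

    void start() {
        for (SketchService s : services) {
            s.start();
        }
        startCount++;
    }

    // Stop in reverse order of start, so dependents go down first.
    void stop() {
        for (int i = services.size() - 1; i >= 0; i--) {
            services.get(i).stop();
        }
    }

    // On REBOOT: full stop (containers are killed there), then reinitialize.
    void onHeartbeatResponse(Action action) {
        if (action == Action.REBOOT) {
            stop();
            start();
        }
    }

    int startCount() { return startCount; }
}
```

The point of routing the reboot through the ordinary stop path is that container cleanup does not need a separate code path; whatever the stop already does for shutdown also applies on reboot.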

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
