hadoop-mapreduce-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5734) Reducer preemption does not happen if a node is blacklisted; in turn, the job hangs.
Date Fri, 24 Jan 2014 08:21:39 GMT
Rohith created MAPREDUCE-5734:
---------------------------------

             Summary: Reducer preemption does not happen if a node is blacklisted; in turn, the job hangs.
                 Key: MAPREDUCE-5734
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5734
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.2.0
         Environment: SuSE 11 SP2 + Hadoop-2.3 
            Reporter: Rohith


There are 4 NodeManagers with 8GB each, so the total cluster capacity is 32GB. Cluster slow
start is set to 1.

A job is running, and its reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4)
became unstable (3 maps got killed), and the MRAppMaster blacklisted it. All reducer tasks
are now running in the cluster.

The MRAppMaster does not preempt the reducers because the headroom used in the reducer
preemption calculation includes the memory of blacklisted nodes, so the MRAppMaster sees
free memory that it can never actually use. This makes the job hang forever (the
ResourceManager does not assign any new containers on blacklisted nodes, but the available
resources it returns still count the cluster's free memory).
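
The arithmetic behind the hang can be sketched as below. This is only an illustrative
model, not the MRAppMaster code: the names Node, headroom_gb and should_preempt_reducers
are hypothetical, and the 1GB map request, the 3 pending maps, and the placement of all
3GB of free memory on NM-4 are assumptions made for the example (the report only gives
the 32GB capacity and the 29GB occupied by reducers).

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of a NodeManager; not a Hadoop class."""
    name: str
    capacity_gb: int
    used_gb: int
    blacklisted: bool = False

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


def headroom_gb(nodes, exclude_blacklisted):
    """Sum free memory, optionally skipping nodes the AM has blacklisted."""
    return sum(
        n.free_gb
        for n in nodes
        if not (exclude_blacklisted and n.blacklisted)
    )


def should_preempt_reducers(nodes, map_request_gb, pending_maps, exclude_blacklisted):
    """Preempt only if maps are pending and the headroom cannot fit one map."""
    if pending_maps == 0:
        return False
    return headroom_gb(nodes, exclude_blacklisted) < map_request_gb


# 4 NodeManagers x 8GB = 32GB; reducers occupy 29GB in total.
# Assumption: the remaining 3GB is all on the blacklisted NM-4.
nodes = [
    Node("NM-1", 8, 8),
    Node("NM-2", 8, 8),
    Node("NM-3", 8, 8),
    Node("NM-4", 8, 5, blacklisted=True),
]

# Reported behaviour: blacklisted memory is counted, so no preemption happens.
print(should_preempt_reducers(nodes, 1, 3, exclude_blacklisted=False))  # False

# If blacklisted memory were excluded, the headroom drops to 0 and reducers
# would be preempted to make room for the pending maps.
print(should_preempt_reducers(nodes, 1, 3, exclude_blacklisted=True))   # True
```

Under these assumptions the headroom is 3GB when NM-4 is counted and 0GB when it is
excluded, which is the difference between the job hanging and the reducers being preempted.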





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
