hadoop-common-dev mailing list archives

From Iván de Prado (JIRA) <j...@apache.org>
Subject [jira] [Resolved] (HADOOP-3420) Recover the deprecated mapred.tasktracker.tasks.maximum
Date Thu, 13 Dec 2012 16:38:13 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Iván de Prado resolved HADOOP-3420.
-----------------------------------

    Resolution: Won't Fix

Seems too old and not very relevant now.
                
> Recover the deprecated mapred.tasktracker.tasks.maximum
> -------------------------------------------------------
>
>                 Key: HADOOP-3420
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3420
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.16.0, 0.16.1, 0.16.2, 0.16.3, 0.16.4
>            Reporter: Iván de Prado
>
> https://issues.apache.org/jira/browse/HADOOP-1274 replaced the configuration attribute
> mapred.tasktracker.tasks.maximum with mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum, because it sometimes makes sense to have more
> mappers than reducers assigned to each node.
>
> But deprecating mapred.tasktracker.tasks.maximum can be an issue in some situations.
> For example, when more than one job is running, the reduce tasks and map tasks together
> consume too many resources. To avoid these cases, an upper limit on the total number of
> tasks is needed. So I propose keeping the configuration parameter
> mapred.tasktracker.tasks.maximum as a limit on the total number of tasks. It is compatible
> with mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum.
>
> As an example:
> I have a 4-node cluster with 8 cores and 4 GB of memory per node. I want to limit the
> number of tasks per node to 8, since 8 tasks per node would use almost 100% of the CPU
> and 4 GB of memory. I have set:
>   mapred.tasktracker.map.tasks.maximum -> 8
>   mapred.tasktracker.reduce.tasks.maximum -> 8
> 1) When running only one job at a time, it works smoothly: 8 tasks on average per node,
> no swapping on the nodes, almost 4 GB of memory usage, and 100% CPU usage.
> 2) When running more than one job at the same time, it works really badly: 16 tasks on
> average per node, 8 GB of memory usage (4 GB swapped), and a lot of system CPU usage.
>
> So I think it makes sense to restore the old attribute mapred.tasktracker.tasks.maximum,
> making it compatible with the new ones.
> Task trackers would then not be allowed to:
>  - run more than mapred.tasktracker.tasks.maximum tasks per node,
>  - run more than mapred.tasktracker.map.tasks.maximum mappers per node,
>  - run more than mapred.tasktracker.reduce.tasks.maximum reducers per node.
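
The three rules proposed above can be sketched as a small admission check. This is only an illustration of the proposal, not Hadoop's TaskTracker code: the class TaskSlotLimits and its methods are hypothetical names, and the issue was resolved as Won't Fix, so no such check should be assumed to exist in Hadoop.

```java
// Hypothetical sketch of the proposed limits; not part of any Hadoop API.
final class TaskSlotLimits {
    private final int maxTotal;   // proposed mapred.tasktracker.tasks.maximum
    private final int maxMaps;    // mapred.tasktracker.map.tasks.maximum
    private final int maxReduces; // mapred.tasktracker.reduce.tasks.maximum

    TaskSlotLimits(int maxTotal, int maxMaps, int maxReduces) {
        this.maxTotal = maxTotal;
        this.maxMaps = maxMaps;
        this.maxReduces = maxReduces;
    }

    // A new map task is allowed only if both the map limit and the total limit have room.
    boolean canLaunchMap(int runningMaps, int runningReduces) {
        return runningMaps < maxMaps && runningMaps + runningReduces < maxTotal;
    }

    // A new reduce task is allowed only if both the reduce limit and the total limit have room.
    boolean canLaunchReduce(int runningMaps, int runningReduces) {
        return runningReduces < maxReduces && runningMaps + runningReduces < maxTotal;
    }

    // Largest number of tasks that can run on one node at the same time.
    int worstCaseConcurrentTasks() {
        return (int) Math.min((long) maxTotal, (long) maxMaps + (long) maxReduces);
    }
}
```

With the settings from the example (8 map slots, 8 reduce slots) and a total limit of 8, worstCaseConcurrentTasks() is 8 rather than the 16 that the two separate limits allow on their own, which is the case described as causing swapping.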

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira