hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
Date Fri, 05 Mar 2010 02:07:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841653#action_12841653 ]

Scott Chen commented on MAPREDUCE-1221:
---------------------------------------

@Arun: Sorry for the very late reply. Dhruba and I have been trying to call you but it seems
you are busy as well.

I think I see your point. The problem is that a bad job never fails: its tasks keep getting
killed and rescheduled again and again, which keeps hurting the cluster. So we should add a
per-task RSS limit in this patch so that we can fail the bad job, just like what we currently
do in trunk for virtual memory. But here we offer RSS memory limiting as an option
(a trade-off between memory utilization and stability).

I will make the change and resubmit the patch soon. Thanks again for the help.
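
Roughly, the kind of per-task check I have in mind looks like the sketch below. The class and
field names are just illustrative, not the actual TaskTracker/TaskMemoryManagerThread code:

// Illustrative sketch only: checks a running task's resident (RSS) memory against an
// optional per-task limit, so the task can be failed (counted against the job's attempts)
// rather than just killed because the whole node is short on memory.
public class RssLimitSketch {

  /** Hypothetical view of a running task's memory usage. */
  static class TaskMemoryInfo {
    final String taskId;
    final long rssBytes;        // physical memory used by the task's process tree
    final long rssLimitBytes;   // per-task RSS limit; <= 0 means "no limit configured"

    TaskMemoryInfo(String taskId, long rssBytes, long rssLimitBytes) {
      this.taskId = taskId;
      this.rssBytes = rssBytes;
      this.rssLimitBytes = rssLimitBytes;
    }
  }

  /**
   * Returns true if the task should be failed because it exceeded its own RSS limit,
   * as opposed to being killed only to relieve memory pressure on the node.
   */
  static boolean exceedsTaskRssLimit(TaskMemoryInfo info) {
    return info.rssLimitBytes > 0 && info.rssBytes > info.rssLimitBytes;
  }

  public static void main(String[] args) {
    TaskMemoryInfo t =
        new TaskMemoryInfo("attempt_201003050001_0001_m_000000_0",
                           3L * 1024 * 1024 * 1024,   // 3 GB resident
                           2L * 1024 * 1024 * 1024);  // 2 GB per-task limit
    if (exceedsTaskRssLimit(t)) {
      System.out.println("FAIL " + t.taskId + ": over per-task RSS limit");
    }
  }
}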


> Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1221
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, MAPREDUCE-1221-v3.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds
> a set of configured thresholds. I would like to extend this feature to enable killing tasks
> if the physical memory used by that task exceeds a certain threshold.
> On a certain operating system (guess?), if user-space processes start using lots of memory,
> the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs
> from triggering this condition. From my understanding, the killing based on virtual-memory
> limits (HADOOP-5883) was designed to address this problem. This works well when most
> map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the max
> virtual memory for each task. On the other hand, if each task forks off mappers/reducers
> written in other languages (python/php, etc.), the total virtual memory usage of the process
> subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits.
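
To make the physical-memory accounting described above concrete, here is a minimal, Linux-only
sketch that sums the resident set size (RSS) of a process subtree by scanning /proc. It is
standalone illustrative code under those assumptions, not the patch or the Hadoop implementation:

// Hypothetical sketch: sum the RSS of a process and all of its descendants by scanning /proc.
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class SubtreeRssSketch {

  /** Parent pid: field 4 of /proc/<pid>/stat, i.e. the second field after "(comm)". */
  static int parentPidOf(int pid) throws IOException {
    String stat = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/stat")));
    String afterComm = stat.substring(stat.lastIndexOf(')') + 2); // skip "pid (comm) "
    return Integer.parseInt(afterComm.split(" ")[1]);             // fields: state, ppid, ...
  }

  /** RSS of a single process in bytes (resident pages are field 2 of /proc/<pid>/statm). */
  static long rssBytesOf(int pid, long pageSize) throws IOException {
    String statm = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/statm")));
    return Long.parseLong(statm.trim().split("\\s+")[1]) * pageSize;
  }

  /** Sum RSS over rootPid and every descendant currently visible in /proc. */
  static long subtreeRssBytes(int rootPid, long pageSize) throws IOException {
    // Build a parent -> children map from one scan of /proc.
    Map<Integer, List<Integer>> children = new HashMap<>();
    try (DirectoryStream<Path> procDirs =
             Files.newDirectoryStream(Paths.get("/proc"), "[0-9]*")) {
      for (Path dir : procDirs) {
        try {
          int pid = Integer.parseInt(dir.getFileName().toString());
          children.computeIfAbsent(parentPidOf(pid), k -> new ArrayList<>()).add(pid);
        } catch (IOException | NumberFormatException raced) {
          // not a pid directory, or the process exited while we were scanning; ignore
        }
      }
    }
    // Walk the subtree and accumulate RSS.
    long total = 0;
    Deque<Integer> toVisit = new ArrayDeque<>(Collections.singleton(rootPid));
    while (!toVisit.isEmpty()) {
      int pid = toVisit.pop();
      try {
        total += rssBytesOf(pid, pageSize);
      } catch (IOException raced) {
        // process exited between the scan and this read; ignore
      }
      toVisit.addAll(children.getOrDefault(pid, Collections.emptyList()));
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    int rootPid = Integer.parseInt(args[0]); // e.g. the task JVM's pid
    long pageSize = 4096;                    // assumption: typical Linux page size
    System.out.println("subtree RSS bytes: " + subtreeRssBytes(rootPid, pageSize));
  }
}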

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

