hadoop-mapreduce-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1345) JobTracker is slowed down because it forks subprocesses to do a df command
Date Wed, 30 Dec 2009 01:30:29 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795215#action_12795215 ]

dhruba borthakur commented on MAPREDUCE-1345:
---------------------------------------------

This problem becomes acute when the JT is configured with more than 24GB of heap space and
a new job arrives once every 5 seconds or so.

On most unix-y systems, one can scan /proc/diskstats to determine the amount of disk space
used for each of the local dirs.
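
For illustration, here is a minimal sketch of reading per-directory disk usage from inside
the JVM without forking a subprocess. It is an assumption-laden example, not the fix adopted
for this issue: it does not parse /proc, it relies on the java.io.File space methods
available since Java 6, and the class and method names (LocalDirSpace, usedBytes,
usableBytes) are hypothetical.

```java
import java.io.File;

// Hypothetical helper: queries partition space in-process, so no shell or df is forked.
public class LocalDirSpace {

    /** Bytes in use on the partition that holds dir (0 if the path names no partition). */
    public static long usedBytes(File dir) {
        // getTotalSpace() and getFreeSpace() both return 0L when the path does not exist.
        return dir.getTotalSpace() - dir.getFreeSpace();
    }

    /** Bytes this JVM could actually write on the partition that holds dir. */
    public static long usableBytes(File dir) {
        return dir.getUsableSpace();
    }

    public static void main(String[] args) {
        // Local dirs are passed as arguments; default to the current directory.
        String[] localDirs = args.length > 0 ? args : new String[] {"."};
        for (String path : localDirs) {
            File dir = new File(path);
            System.out.println(path + " used=" + usedBytes(dir)
                    + " usable=" + usableBytes(dir));
        }
    }
}
```

Note that these calls report usage for the whole partition that contains the directory,
which is also what df reports, rather than the size of the directory tree itself.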

> JobTracker is slowed down because it forks subprocesses to do a df command
> --------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1345
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1345
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: dhruba borthakur
>            Assignee: Scott Chen
>
> The JobTracker periodically does a df on the local directories. It forks a shell
to run a df command. The creation of the separate process is very slow because the process
address space is copied by the OS on every subprocess creation. This becomes worse when the
JT is configured to use a large heap space.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

