hadoop-mapreduce-issues mailing list archives

From "Jonathon Hare (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2973) Logic for determining whether to create a new JVM can interfere with Capacity-Scheduler and JVM reuse
Date Fri, 09 Sep 2011 17:41:08 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101359#comment-13101359 ]

Jonathon Hare commented on MAPREDUCE-2973:

We're actually using Cloudera's distribution (CDH3u1); I'm not sure how that maps to the
official Apache versions.

> Logic for determining whether to create a new JVM can interfere with Capacity-Scheduler and JVM reuse
> -----------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-2973
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2973
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.2
>         Environment: N/A
>            Reporter: Jonathon Hare
>            Priority: Minor
>         Attachments: hadoop-patch.txt
>   Original Estimate: 0h
>  Remaining Estimate: 0h
> We use the capacity scheduler to enable jobs with large memory requirements to be run on our cluster. The individual tasks have a large initial overhead when they load cached data. By using the JVM reuse option ({{mapred.job.reuse.jvm.num.tasks}}) and caching data in a static variable, we can reduce this overhead.
> The current {{JvmManager}} implementation prefers creating new JVMs over reusing existing ones whenever the number of already created JVMs is less than the maximum. In the extreme case where the capacity scheduler limits the number of tasks on a node to 1, but the number of [map|reduce] tasks per node is set to, say, 16, then 16 JVMs will be created before any of them is reused. Obviously, if the amount of cached data held in memory by each JVM is large, the node can rapidly run out of memory! What should really happen in this case is that the first created JVM is reused, and no others are spawned.
> To work around this problem on our cluster, we have modified the logic in the {{reapJVM()}} method in {{JvmManager}} to prefer reusing an existing JVM (idle and belonging to the same job) over starting a new JVM or killing an existing idle JVM.
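
The modified preference order described above can be sketched as follows. This is a simplified, hypothetical model of the decision, not the actual Hadoop {{JvmManager}} code or the attached patch; the class, method, and field names here are illustrative only.

```java
import java.util.List;

// Illustrative sketch (NOT the real Hadoop API): models the preference
// order described in the work-around above. For a task from a given job,
// an idle JVM belonging to the same job is reused first; a new JVM is
// spawned only if no such JVM exists and the per-node maximum has not
// been reached; failing that, an idle JVM from another job is killed.
public class JvmReusePolicy {
    enum Decision { REUSE_IDLE_SAME_JOB, SPAWN_NEW, KILL_IDLE_OTHER_JOB, WAIT }

    static class Jvm {
        final String jobId;   // job this JVM was started for
        final boolean idle;   // true if not currently running a task
        Jvm(String jobId, boolean idle) { this.jobId = jobId; this.idle = idle; }
    }

    // Decide what to do for a task of `jobId`, given the JVMs on the node
    // and the per-node JVM limit.
    static Decision decide(String jobId, List<Jvm> jvms, int maxJvms) {
        // 1. Prefer reusing an idle JVM that already belongs to this job.
        for (Jvm jvm : jvms) {
            if (jvm.idle && jvm.jobId.equals(jobId)) {
                return Decision.REUSE_IDLE_SAME_JOB;
            }
        }
        // 2. Otherwise spawn a new JVM if we are under the per-node limit.
        if (jvms.size() < maxJvms) {
            return Decision.SPAWN_NEW;
        }
        // 3. At the limit: kill an idle JVM from another job to make room.
        for (Jvm jvm : jvms) {
            if (jvm.idle && !jvm.jobId.equals(jobId)) {
                return Decision.KILL_IDLE_OTHER_JOB;
            }
        }
        // 4. No idle JVMs at all: wait for one to become free.
        return Decision.WAIT;
    }
}
```

Under the unmodified logic, step 2 would come before step 1, which is exactly what causes up to 16 JVMs to be spawned before any reuse occurs in the scenario described.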

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

