hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters
Date Tue, 09 Feb 2010 20:17:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831644#action_12831644 ]

Todd Lipcon commented on MAPREDUCE-1266:
----------------------------------------

bq. if you are using jvm reuse, then that 1s disappears, right? 

Not really, since JVM reuse doesn't share a JVM between map tasks and reduce tasks.

The time sequence of a small job looks like:

Client:
  Submit job
JT:
  Create tasks ("initialize job") on JT
  wait for a TT to heartbeat
TT:
  start JVM
child:
  process map task
TT:
  send accelerated heartbeat once map task is complete (I forget whether this is in 0.20 or came later)
  receive reduce task, start reduce JVM (regardless of JVM reuse)
child:
  process reduce task
TT:
  send completion heartbeat

I guess there are also some setup/cleanup tasks going on in there as well. Since we're talking
about a hypothetical job with one map and one reduce, a shorter heartbeat interval just cuts
down the time between initializing the job and getting the first JVM started on a TT.
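
To put rough numbers on the "wait for a TT to heartbeat" step, here is a minimal sketch. It assumes a single TT that heartbeats at a fixed interval and a job that becomes ready at a uniformly random point within that interval. The class and method names are hypothetical and not part of Hadoop.

```java
// Illustrative model only: none of these names exist in Hadoop.
final class HeartbeatWaitSketch {

    /** Worst-case wait (ms) for the next heartbeat: one full interval. */
    static long worstCaseWaitMillis(long heartbeatIntervalMillis) {
        return heartbeatIntervalMillis;
    }

    /**
     * Average wait (ms), assuming the job becomes ready at a uniformly
     * random point within the heartbeat interval.
     */
    static double averageWaitMillis(long heartbeatIntervalMillis) {
        return heartbeatIntervalMillis / 2.0;
    }

    public static void main(String[] args) {
        // Compare the current 3.0 s minimum with the proposed 0.5 s minimum.
        for (long interval : new long[] {3000L, 500L}) {
            System.out.printf("interval=%dms avg=%.0fms worst=%dms%n",
                    interval,
                    averageWaitMillis(interval),
                    worstCaseWaitMillis(interval));
        }
    }
}
```

Under these assumptions the average wait before the first task is handed out drops from about 1.5 s to about 0.25 s when the minimum interval goes from 3.0 s to 0.5 s.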

In a multi-mapper or multi-reducer job, the cost shows up in how long it takes for all of the
tasks to get scheduled, since some schedulers assign only one task per heartbeat.
The fair scheduler after MAPREDUCE-706 can assign multiple tasks per heartbeat, which should
help substantially.
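
A back-of-the-envelope sketch of that scheduling cost, assuming every TT heartbeats once per interval and always has free slots (slot limits are ignored here). The names below are hypothetical and not Hadoop APIs.

```java
// Illustrative model only: none of these names exist in Hadoop.
final class SchedulingRoundsSketch {

    /** Heartbeat rounds needed before every task has been handed to some TT. */
    static long heartbeatRounds(long tasks, int trackers, int tasksPerHeartbeat) {
        if (tasks < 0 || trackers <= 0 || tasksPerHeartbeat <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        long perRound = (long) trackers * tasksPerHeartbeat;
        return (tasks + perRound - 1) / perRound; // ceiling division
    }

    /** Rough upper bound (ms) on the time until the last task is assigned. */
    static long schedulingDelayMillis(long tasks, int trackers,
                                      int tasksPerHeartbeat,
                                      long heartbeatIntervalMillis) {
        return heartbeatRounds(tasks, trackers, tasksPerHeartbeat)
                * heartbeatIntervalMillis;
    }

    public static void main(String[] args) {
        // 8 tasks on a single-TT cluster with a 3 s heartbeat.
        System.out.println(schedulingDelayMillis(8, 1, 1, 3000L)); // one task per heartbeat
        System.out.println(schedulingDelayMillis(8, 1, 4, 3000L)); // four tasks per heartbeat
    }
}
```

In this model, 8 tasks on one TT with a 3 s heartbeat need 8 rounds (up to 24 s) at one task per heartbeat, but only 2 rounds (up to 6 s) at four tasks per heartbeat.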

> Allow heartbeat interval smaller than 3 seconds for tiny clusters
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1266
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, task, tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> For small clusters, the heartbeat interval has a large effect on job latency. This is
> especially true on pseudo-distributed or other "tiny" (<5 nodes) clusters. It's not a big
> deal for production, but new users would have a happier first experience if Hadoop seemed
> snappier.
> I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 0.5 seconds
> (but have it governed by an undocumented config parameter in case people don't like this change).
> The cluster size-based ramp up of interval will maintain the current scalable behavior for
> large clusters with no negative effect.
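
The interaction between a configurable minimum and a cluster-size-based ramp can be sketched as below. This is not the JobTracker's actual formula: the linear ramp, the `MILLIS_PER_TRACKER` constant, and all names are assumptions chosen only to show how lowering the floor affects tiny clusters while leaving large ones unchanged.

```java
// Illustrative model only: the ramp and constants are assumptions, not Hadoop's code.
final class HeartbeatIntervalSketch {

    /** Hypothetical ramp: each additional TT adds 20 ms to the interval. */
    static final int MILLIS_PER_TRACKER = 20;

    /** Interval (ms) = the cluster-size ramp, but never below the configured minimum. */
    static int nextHeartbeatIntervalMillis(int clusterSize, int minIntervalMillis) {
        int scaled = clusterSize * MILLIS_PER_TRACKER;
        return Math.max(scaled, minIntervalMillis);
    }

    public static void main(String[] args) {
        int[] clusterSizes = {1, 4, 1000};
        for (int size : clusterSizes) {
            System.out.printf("trackers=%d min3000=%dms min500=%dms%n",
                    size,
                    nextHeartbeatIntervalMillis(size, 3000),
                    nextHeartbeatIntervalMillis(size, 500));
        }
    }
}
```

With these assumed constants, a 1-node or 4-node cluster drops from a 3000 ms to a 500 ms interval when the minimum is lowered, while a 1000-node cluster stays at 20000 ms either way because the ramp dominates the minimum.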

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

