hadoop-common-user mailing list archives

From "Joydeep Sen Sarma" <jssa...@facebook.com>
Subject RE: speculative task execution and writing side-effect files
Date Wed, 23 Jan 2008 06:07:15 GMT
while there is a willing audience - will take a moment to crib about speculative execution
(i did try to put these in a jira as well):

- currently speculative execution is focused on reducing task latency - and does not care
for cluster efficiency. in a busy cluster, current speculative execution causes a dramatic drop
in efficiency as tasks are launched needlessly. To wit:

- we find reduces being speculatively executed almost all the time (current settings are too
aggressive)
- speculative execution does not consider the state of the cluster (busy/idle) while spawning
extra tasks
- redundant tasks are not killed aggressively enough (why keep duplicate tasks running when
both are progressing at reasonable speed?)

i am also not terribly sure about the progress counter on which speculative execution is based.
with compressed map outputs - the reduce progress counter goes above 100% and then back to
0 (this is not fixed in 0.14.4 at least) - and i don't understand what impact this has on
the (progress - averageProgress) criterion for launching speculative tasks.
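
to make the concern concrete, here is a minimal sketch (not the actual JobTracker code). the
function name is_lagging, the 0.2 gap and the sample progress values are all assumptions for
illustration; it only shows how a counter that overshoots 100% or drops back to 0 could distort
a test based on the gap between a task's progress and the average:

```python
def is_lagging(progress, all_progress, gap=0.2):
    """Return True if a task trails the average progress by at least `gap`.

    `gap` is an assumed threshold for illustration, not a confirmed Hadoop default.
    """
    average = sum(all_progress) / len(all_progress)
    return (average - progress) >= gap


# Well-behaved counters: every task is near 45%, so nobody looks slow.
healthy = [0.40, 0.45, 0.50, 0.45]
print([is_lagging(p, healthy) for p in healthy])      # -> [False, False, False, False]

# One reduce reports 165%: the average is inflated and the normal tasks look slow.
overshoot = [0.40, 0.45, 0.50, 1.65]
print([is_lagging(p, overshoot) for p in overshoot])  # -> [True, True, True, False]

# The same reduce then drops back to 0%: now it is the one that looks stalled.
wrapped = [0.40, 0.45, 0.50, 0.0]
print([is_lagging(p, wrapped) for p in wrapped])      # -> [False, False, False, True]
```

under these assumed numbers, a single misbehaving counter is enough to flag either every
healthy task or the misreporting task itself.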

the two biggest problems we have had with job latency (and i am sure different people have
different experiences) are:
a) tasks get stuck in the 'Initializing' state with 0% progress
b) reduce copy speeds are inexplicably slow at times
in both these cases, restarting tasks helps - but i would much rather code in special hooks
for detecting these conditions rather than turn on speculative execution in general. not elegant,
not googlish, but practical.
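
a rough sketch of what such a hook might look like, assuming the scheduler can see each task's
state, progress, time in that state and copy rate. the TaskStatus fields, the state names and
all thresholds below are hypothetical, picked only to illustrate the two conditions above:

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    # All fields are hypothetical; they stand in for whatever the tracker reports.
    task_id: str
    state: str                       # e.g. "Initializing", "Copying", "Running"
    progress: float                  # 0.0 .. 1.0
    seconds_in_state: float
    copy_bytes_per_sec: float = 0.0  # only meaningful during the reduce copy phase


def needs_restart(task, init_timeout_s=300.0, copy_grace_s=120.0, min_copy_bps=100_000.0):
    """Flag only the two specific failure modes, instead of speculating on every slow task."""
    # a) stuck in 'Initializing' with 0% progress for too long
    if (task.state == "Initializing" and task.progress == 0.0
            and task.seconds_in_state > init_timeout_s):
        return True
    # b) reduce copy running well below an (assumed) minimum rate after a grace period
    if (task.state == "Copying" and task.seconds_in_state > copy_grace_s
            and task.copy_bytes_per_sec < min_copy_bps):
        return True
    return False


tasks = [
    TaskStatus("m_000001", "Initializing", 0.0, 600.0),
    TaskStatus("r_000002", "Copying", 0.1, 900.0, copy_bytes_per_sec=5_000.0),
    TaskStatus("r_000003", "Copying", 0.2, 900.0, copy_bytes_per_sec=8_000_000.0),
]
print([t.task_id for t in tasks if needs_restart(t)])  # -> ['m_000001', 'r_000002']
```

a task that is merely slower than average never trips this check, so no extra work is launched
for it on a busy cluster.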

ironically - when people care about job latency (daytime) - the cluster is really busy (and
hence speculative execution generally hurts) and when people don't care about job latency
(nighttime - batch jobs) - the cluster is relatively idle (and we could afford speculative
execution - but it would serve no purpose).

perhaps i am totally off - would like to learn about other people's experience.


-----Original Message-----
From: Devaraj Das [mailto:ddas@yahoo-inc.com]
Sent: Tue 1/22/2008 8:22 PM
To: core-user@hadoop.apache.org
Subject: RE: speculative task execution and writing side-effect files
 
> 1. In what situation would speculative task execution  kick 
> in if it's enabled

It would be based on tasks' progress. A speculative instance of a running
task is launched if the task is question is lagging behind the others in
terms of progress it has made. It also depends on whether there are
available slots in the cluster to execute speculative tasks (in addition to
the regular tasks).

> 2. how much performance gain we can 
> generally expect from enabling of this feature. 

This depends on the cluster. Speculative execution comes in handy when, for
some reason (maybe transient or permanent), some nodes are slower than the
others in executing tasks. Without speculative execution jobs using those
nodes might have a long tail. With speculative execution, there is a good
chance that speculative tasks would be launched on some healthy nodes and
they run to completion faster.

> 3. If I want to write out side-effect files named with unique 
> names per task-attempt in the directory other than 
> ${mapred.output.dir}/_${taskid},  would framework discard 
> files attempted by unsuccessful task attempts?
> 4. If I write files into subdirectories of 
> ${mapred.output.dir}/_${taskid} (e.g. 
> ${mapred.output.dir}/_${taskid}/${sub_dir}),  would framework 
> take care of promoting ${sub_dir} to ${mapred.output.dir}?

Yes to both.
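
As an illustration of the behaviour asked about in question 4, here is a small local-filesystem
simulation. It is not Hadoop code: the helper names (attempt_dir, write_side_file, commit,
discard) and the attempt ids are hypothetical, and it only covers files written under the
per-attempt directory ${mapred.output.dir}/_${taskid}.

```python
import os
import shutil
import tempfile


def attempt_dir(output_dir, task_id):
    """Per-attempt scratch directory, mirroring ${mapred.output.dir}/_${taskid}."""
    return os.path.join(output_dir, "_" + task_id)


def write_side_file(output_dir, task_id, rel_path, data):
    """Write a side-effect file somewhere under the attempt's scratch directory."""
    path = os.path.join(attempt_dir(output_dir, task_id), rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(data)


def commit(output_dir, task_id):
    """Promote everything the successful attempt wrote, keeping subdirectories."""
    src = attempt_dir(output_dir, task_id)
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        dest_root = output_dir if rel == "." else os.path.join(output_dir, rel)
        os.makedirs(dest_root, exist_ok=True)
        for name in files:
            shutil.move(os.path.join(root, name), os.path.join(dest_root, name))
    shutil.rmtree(src)


def discard(output_dir, task_id):
    """Drop everything written by a failed or killed (e.g. speculative) attempt."""
    shutil.rmtree(attempt_dir(output_dir, task_id), ignore_errors=True)


out = tempfile.mkdtemp()
side = os.path.join("sub_dir", "side.txt")
write_side_file(out, "attempt_0", side, "winner")  # attempt that finishes first
write_side_file(out, "attempt_1", side, "loser")   # redundant speculative attempt
commit(out, "attempt_0")
discard(out, "attempt_1")
print(sorted(os.listdir(out)))  # -> ['sub_dir']
```

Only the committed attempt's sub_dir ends up in the output directory; the other attempt's files
disappear with its scratch directory.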

Devaraj

> -----Original Message-----
> From: Eric Zhang [mailto:ezhang@yahoo-inc.com] 
> Sent: Wednesday, January 23, 2008 7:21 AM
> To: core-user@hadoop.apache.org
> Subject: speculative task execution and writing side-effect files
> 
> I tried to find more details on speculative task execution on the hadoop 
> site and mailing archive, but it didn't seem to get explained 
> a lot.
> I'd appreciate it if anybody can help me with the following related questions:
> 1. In what situation would speculative task execution  kick 
> in if it's enabled
> 2. how much performance gain we can 
> generally expect from enabling of this feature. 
> 3. If I want to write out side-effect files named with unique 
> names per task-attempt in the directory other than 
> ${mapred.output.dir}/_${taskid},  would framework discard 
> files attempted by unsuccessful task attempts?
> 4. If I write files into subdirectories of 
> ${mapred.output.dir}/_${taskid} (e.g. 
> ${mapred.output.dir}/_${taskid}/${sub_dir}),  would framework 
> take care of promoting ${sub_dir} to ${mapred.output.dir}?
> 
> Thanks a lot,
> 
> Eric
> 


