hadoop-common-user mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: speculative task execution and writing side-effect files
Date Wed, 23 Jan 2008 04:22:39 GMT
> 1. In what situation would speculative task execution kick
> in if it's enabled?

It is based on the tasks' progress. A speculative instance of a running
task is launched if the task in question is lagging behind the others in
terms of the progress it has made. It also depends on whether there are
available slots in the cluster to execute speculative tasks (in addition to
the regular tasks).
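
The selection rule described above can be sketched roughly as follows. This
is only an illustration, not Hadoop's actual scheduler code: the function
name, the TaskProgress type, and the 0.2 lag threshold are all assumptions
made for the example.

```python
from dataclasses import dataclass


@dataclass
class TaskProgress:
    task_id: str
    progress: float  # fraction complete, from 0.0 to 1.0


def pick_speculative_candidates(tasks, free_slots, lag_threshold=0.2):
    """Return ids of running tasks that lag the average progress.

    lag_threshold is a hypothetical value chosen for illustration.
    At most free_slots candidates are returned, slowest first.
    """
    if not tasks or free_slots <= 0:
        return []
    average = sum(t.progress for t in tasks) / len(tasks)
    lagging = [
        t for t in tasks
        if t.progress < 1.0 and average - t.progress > lag_threshold
    ]
    # Speculate on the slowest tasks first, limited by spare capacity.
    lagging.sort(key=lambda t: t.progress)
    return [t.task_id for t in lagging[:free_slots]]
```

With no free slots the function returns an empty list, which mirrors the
point that speculation only happens when the cluster has spare capacity.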

> 2. How much performance gain can we
> generally expect from enabling this feature?

This depends on the cluster. Speculative execution comes in handy when, for
some reason (transient or permanent), some nodes are slower than the
others at executing tasks. Without speculative execution, jobs using those
nodes might have a long tail. With speculative execution, there is a good
chance that speculative tasks will be launched on healthy nodes and
run to completion faster.
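
A toy calculation of the long-tail effect, under simplifying assumptions:
a job finishes when its slowest task finishes, and a speculative copy that
starts after some delay on a healthy node takes a fixed time. The function
names and all durations are made up for illustration and say nothing about
real cluster numbers.

```python
def job_time(task_durations):
    """A job finishes when its slowest task finishes."""
    return max(task_durations)


def job_time_with_speculation(task_durations, healthy_duration, launch_delay):
    """Each task finishes when either its original attempt or a speculative
    copy (started after launch_delay, taking healthy_duration on a healthy
    node) completes, whichever comes first.
    """
    finish_times = [
        min(duration, launch_delay + healthy_duration)
        for duration in task_durations
    ]
    return max(finish_times)


# Three tasks on healthy nodes and one on a slow node (arbitrary time units).
durations = [10, 11, 10, 40]
without_speculation = job_time(durations)
with_speculation = job_time_with_speculation(
    durations, healthy_duration=10, launch_delay=12
)
```

In this model the gain comes entirely from cutting the straggler short. If
no node is slow, both functions return the same value, and the speculative
attempts only consume extra slots.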

> 3. If I want to write out side-effect files named with unique
> names per task-attempt in a directory other than
> ${mapred.output.dir}/_${taskid}, would the framework discard
> files attempted by unsuccessful task attempts?
> 4. If I write files into subdirectories of
> ${mapred.output.dir}/_${taskid} (e.g.
> ${mapred.output.dir}/_${taskid}/${sub_dir}), would the framework
> take care of promoting ${sub_dir} to ${mapred.output.dir}?

Yes to both.
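
The promotion behaviour asked about in question 4 can be imitated on a local
filesystem as below. This is a sketch of the idea only, not the framework's
implementation: the helper names and the attempt ids are hypothetical, and
it covers only files written under the per-attempt directory
${mapred.output.dir}/_${taskid}.

```python
import shutil
import tempfile
from pathlib import Path


def attempt_dir(output_dir, attempt_id):
    """Per-attempt scratch directory, mirroring ${mapred.output.dir}/_${taskid}."""
    return output_dir / f"_{attempt_id}"


def write_side_effect_file(output_dir, attempt_id, relative_path, text):
    """Write a file (possibly inside a subdirectory) under the attempt directory."""
    target = attempt_dir(output_dir, attempt_id) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text)


def promote(output_dir, attempt_id):
    """Move everything a successful attempt wrote, subdirectories included,
    up into the job output directory, then remove the scratch directory."""
    src = attempt_dir(output_dir, attempt_id)
    for item in sorted(src.rglob("*")):
        if item.is_file():
            dest = output_dir / item.relative_to(src)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(item), str(dest))
    shutil.rmtree(src)


def discard(output_dir, attempt_id):
    """Throw away everything written by an unsuccessful or killed attempt."""
    shutil.rmtree(attempt_dir(output_dir, attempt_id), ignore_errors=True)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output"
        out.mkdir()
        # Two attempts of the same task (hypothetical ids) write the same file.
        write_side_effect_file(out, "attempt_0", "sub_dir/side.txt", "slow attempt")
        write_side_effect_file(out, "attempt_1", "sub_dir/side.txt", "speculative attempt")
        promote(out, "attempt_1")   # the attempt that finished first
        discard(out, "attempt_0")   # the attempt that lost the race
        listing = sorted(p.relative_to(out).as_posix() for p in out.rglob("*"))
        content = (out / "sub_dir" / "side.txt").read_text()
        return listing, content
```

Because each attempt writes only inside its own directory, two attempts of
the same task never clobber each other's files, and only the winner's
output ends up in the job output directory.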

Devaraj

> -----Original Message-----
> From: Eric Zhang [mailto:ezhang@yahoo-inc.com] 
> Sent: Wednesday, January 23, 2008 7:21 AM
> To: core-user@hadoop.apache.org
> Subject: speculative task execution and writing side-effect files
> 
> I tried to find more details on speculative task execution on hadoop 
> site and mailing archive, but it didn't seem to get explained 
> a lot.   
> I'd appreciate if anybody can help me on following related questions:
> 1. In what situation would speculative task execution kick
> in if it's enabled?
> 2. How much performance gain can we
> generally expect from enabling this feature?
> 3. If I want to write out side-effect files named with unique
> names per task-attempt in a directory other than
> ${mapred.output.dir}/_${taskid}, would the framework discard
> files attempted by unsuccessful task attempts?
> 4. If I write files into subdirectories of
> ${mapred.output.dir}/_${taskid} (e.g.
> ${mapred.output.dir}/_${taskid}/${sub_dir}), would the framework
> take care of promoting ${sub_dir} to ${mapred.output.dir}?
> 
> Thanks a lot,
> 
> Eric
> 

