hadoop-common-user mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: speculative task execution and writing side-effect files
Date Wed, 23 Jan 2008 19:24:59 GMT
Some of the utilization issues you raised should be better addressed once
we implement the global scheduler ideas discussed in HADOOP-2491,
HADOOP-2510, and HADOOP-2573.

Please file JIRAs for the other issues as you see fit.

Devaraj

> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com] 
> Sent: Wednesday, January 23, 2008 11:37 AM
> To: core-user@hadoop.apache.org; core-user@hadoop.apache.org
> Subject: RE: speculative task execution and writing side-effect files
> 
> while there is a willing audience - will take a moment to 
> crib about speculative execution (i did try to put these in a 
> jira as well):
> 
> - currently speculative execution is focused on reducing task 
> latency - and does not account for cluster efficiency. in a busy 
> cluster, it causes a dramatic drop in efficiency as tasks are 
> launched needlessly. To wit:
> 
> - we find reduces being speculatively executed almost all the 
> time (current settings are too aggressive)
> - speculative execution does not consider the state of the 
> cluster (busy/idle) while spawning extra tasks
> - redundant tasks are not killed aggressively enough (why 
> keep duplicate tasks running when both are progressing at 
> reasonable speed?)
> 
> i am also not terribly sure about the progress counter on 
> which speculative execution is based. with compressed map 
> outputs - the reduce progress counter goes above 100% and 
> then back to 0 (this is not fixed in 0.14.4 at least) - and i 
> don't understand what impact this has on the (progress - 
> averageProgress) criterion for launching speculative tasks.
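The gap-based test alluded to here can be sketched as follows. This is an illustrative model, not actual Hadoop source: the 0.2 gap, the 60-second minimum runtime, and the free-slot check are assumptions for the sake of the sketch.

```python
# Illustrative sketch of a gap-based speculation criterion: a running
# task becomes a candidate for a speculative duplicate when its
# progress lags the cluster-wide average by more than a fixed gap,
# provided it has run long enough and a slot is free.  The threshold
# values below are assumed, not taken from Hadoop source.

SPECULATIVE_GAP = 0.2   # how far behind the average a task must fall
SPECULATIVE_LAG = 60.0  # seconds a task must have run to qualify

def should_speculate(progress, average_progress, runtime_secs, free_slots):
    """Return True if a speculative duplicate attempt may be launched."""
    return (free_slots > 0
            and runtime_secs >= SPECULATIVE_LAG
            and average_progress - progress > SPECULATIVE_GAP)
```

Under a model like this, the counter bug described above is not harmless: a reduce whose progress climbs past 100% and wraps back to 0 suddenly looks far behind average, so a perfectly healthy task can trip the gap test and get duplicated.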
> 
> the two biggest problems we have had with job latency (and i 
> am sure different people have different experiences) - is 
> that tasks get stuck in:
> a) 'Initializing' state with 0% progress 
> b) reduce copy speeds are inexplicably slow at times 
> in both these cases, restarting tasks helps - but i would much 
> rather code in special hooks for detecting these conditions 
> than turn on speculative execution in general. not elegant, 
> not googlish, but practical.
> 
> ironically - when people care about job latency (daytime) - 
> the cluster is really busy (and hence speculative execution 
> generally hurts) and when people don't care about job latency 
> (nighttime - batch jobs) - the cluster is relatively idle 
> (and we could afford speculative execution - but it would 
> serve no purpose).
> 
> perhaps i am totally off - would like to learn about other 
> people's experience.
> 
> 
> -----Original Message-----
> From: Devaraj Das [mailto:ddas@yahoo-inc.com]
> Sent: Tue 1/22/2008 8:22 PM
> To: core-user@hadoop.apache.org
> Subject: RE: speculative task execution and writing side-effect files
>  
> > 1. In what situation would speculative task execution kick in 
> > if it's enabled?
> 
> It would be based on tasks' progress. A speculative instance 
> of a running task is launched if the task in question is 
> lagging behind the others in terms of the progress it has made. 
> It also depends on whether there are slots available in the 
> cluster to execute speculative tasks (in addition to the 
> regular tasks).
> 
> > 2. How much performance gain can we generally expect from 
> > enabling this feature?
> 
> This depends on the cluster. Speculative execution comes in 
> handy when, for some reason (maybe transient or permanent), 
> some nodes are slower than others at executing tasks. 
> Without speculative execution, jobs using those nodes might 
> have a long tail. With speculative execution, there is a good 
> chance that speculative tasks would be launched on some 
> healthy nodes and run to completion faster.
> 
> > 3. If I want to write out side-effect files with unique names 
> > per task-attempt in a directory other than 
> > ${mapred.output.dir}/_${taskid}, would the framework discard files 
> > attempted by unsuccessful task attempts?
> > 4. If I write files into subdirectories of 
> > ${mapred.output.dir}/_${taskid} (e.g.
> > ${mapred.output.dir}/_${taskid}/${sub_dir}), would the framework take 
> > care of promoting ${sub_dir} to ${mapred.output.dir}?
> 
> Yes to both.
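The "yes" to question 4 can be pictured with a small sketch, assuming promotion happens when a task attempt commits: everything under the attempt directory, subdirectories included, moves up into the output directory on success, while a failed attempt's directory is simply discarded. The function and directory names here are illustrative, not taken from Hadoop source.

```python
import os
import shutil

def commit_task(output_dir, task_id, succeeded):
    """Sketch of side-effect-file promotion: on success, move everything
    under ${mapred.output.dir}/_${taskid} (including subdirectories such
    as ${sub_dir}) up into ${mapred.output.dir}; on failure, discard the
    attempt directory so its files never appear in the job output."""
    attempt_dir = os.path.join(output_dir, "_" + task_id)
    if not os.path.isdir(attempt_dir):
        return
    if succeeded:
        for name in os.listdir(attempt_dir):
            shutil.move(os.path.join(attempt_dir, name),
                        os.path.join(output_dir, name))
    shutil.rmtree(attempt_dir, ignore_errors=True)
```

This is also why speculative duplicates are safe with side-effect files written under the attempt directory: only the attempt that commits gets promoted, and the loser's directory is thrown away.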
> 
> Devaraj
> 
> > -----Original Message-----
> > From: Eric Zhang [mailto:ezhang@yahoo-inc.com]
> > Sent: Wednesday, January 23, 2008 7:21 AM
> > To: core-user@hadoop.apache.org
> > Subject: speculative task execution and writing side-effect files
> > 
> > I tried to find more details on speculative task execution on the 
> > Hadoop site and in the mailing archive, but it doesn't seem to be 
> > explained in much detail. I'd appreciate it if anybody could help 
> > me with the following related questions:
> > 1. In what situation would speculative task execution kick in if 
> > it's enabled?
> > 2. How much performance gain can we generally expect from enabling 
> > this feature?
> > 3. If I want to write out side-effect files with unique names per 
> > task-attempt in a directory other than 
> > ${mapred.output.dir}/_${taskid}, would the framework discard files 
> > attempted by unsuccessful task attempts?
> > 4. If I write files into subdirectories of 
> > ${mapred.output.dir}/_${taskid} (e.g.
> > ${mapred.output.dir}/_${taskid}/${sub_dir}), would the framework take 
> > care of promoting ${sub_dir} to ${mapred.output.dir}?
> > 
> > Thanks a lot,
> > 
> > Eric
> > 
> 
> 
> 

