hadoop-mapreduce-user mailing list archives

From Hemanth Yamijala <yhema...@gmail.com>
Subject Re: Speculative Execution and Streaming
Date Fri, 28 May 2010 09:57:58 GMT
Greg,

> Does anybody know whether or not speculative execution works with Hadoop
> streaming?
>
> If so, I have a script that does not appear to ever launch redundant mappers
> for the slow performers. This may be due to the fact that each mapper
> quickly reports (inaccurately) that it is 100% complete. I am using the
> NLineInputFormat and each mapper gets 17 lines of input. Each line requires
> a lot of computation. It appears that all 17 lines immediately get counted
> as being processed early on. Is there any way to report or force accurate
> completion stats? Could this explain why speculative execution never gets
> triggered?
>

I am wondering if you are hitting
https://issues.apache.org/jira/browse/MAPREDUCE-1073.

In M/R pipes jobs, the map task progress moves to 100% as soon as the
input is read, because the processing happens asynchronously. As
Sreekanth notes, this would result in speculation not working as
expected, since a task that already reports 100% progress does not
look slow to the framework.
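
On the question of reporting from the script itself: Hadoop streaming
lets a task write lines of the form reporter:status:<message> and
reporter:counter:<group>,<counter>,<amount> to stderr. Below is a minimal
sketch of a mapper that does this after each input line. The process()
function, the "Mapper" group and the "LinesDone" counter are made-up
names for illustration. As far as I know this only updates the task's
status string and counters; it does not change the progress percentage
computed from input consumption, so it may help you see how far each
mapper really is without making speculation kick in.

```python
#!/usr/bin/env python
import sys


def format_status(message):
    # Streaming picks up "reporter:status:<message>" lines on stderr.
    return "reporter:status:%s\n" % message


def format_counter(group, counter, amount):
    # Streaming picks up "reporter:counter:<group>,<counter>,<amount>" on stderr.
    return "reporter:counter:%s,%s,%d\n" % (group, counter, amount)


def process(line):
    # Hypothetical stand-in for the expensive per-line computation.
    return line.strip().upper()


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    done = 0
    for line in lines:
        out.write(process(line) + "\n")
        done += 1
        # Report after each line so the task UI reflects real work done.
        err.write(format_counter("Mapper", "LinesDone", 1))
        err.write(format_status("processed %d lines" % done))
        err.flush()
    return done


if __name__ == "__main__":
    run_mapper(sys.stdin)
```

The per-task counter and status then show up in the JobTracker web UI
while the mapper is still computing.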

Thanks
Hemanth
