hadoop-mapreduce-user mailing list archives

From Pedro Costa <psdc1...@gmail.com>
Subject Re: What's speculative tasks
Date Mon, 26 Jul 2010 08:33:51 GMT
For the second question:

I'm running the wordcount example with bin/hadoop jar
hadoop-0.20.2-examples.jar wordcount gutenberg gutenberg-output, but
the gutenberg directory has no files. In my case, the program
blocks and nothing happens. Is this expected?

On Mon, Jul 26, 2010 at 5:21 AM, Hemanth Yamijala <yhemanth@gmail.com> wrote:
> Pedro,
> On Sun, Jul 25, 2010 at 7:33 PM, Pedro Costa <psdc1978@gmail.com> wrote:
> > Hi,
> >
> > - Hadoop MR uses the term speculative tasks. What are speculative
> > tasks?
> >
> When the MR framework detects that some tasks are running slower than
> others in the job, it has an option to launch duplicates of those
> tasks on different nodes from the original ones, with the hope that
> they will complete sooner than the original slow tasks. The
> motivation for this feature is that it has been found that every job
> has 'stragglers': a small percentage of tasks that are significantly
> slower than the rest and that slow down the overall execution
> time of the job. Typically these stragglers are caused by bad
> hardware.
> > - During the execution of an MR test, when there are no splits to assign
> > to reduce tasks, will those reduce tasks run? For example, if I configure
> > 6 reduce tasks and there are no splits while the example runs,
> > will the reduce tasks run? If so, where is it verified that a reduce
> > task has a split assigned?
> Splits are related to map tasks, not reduce tasks. Reduce tasks get
> their input from the output of map tasks, which is generated and stored
> as intermediate data on the compute nodes. Can you clarify what
> you are looking for, given this context?
> Thanks
> Hemanth

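To make the speculative task explanation quoted above a bit more concrete, here is a rough toy model in plain Java. It is not Hadoop code and does not reflect how the framework actually decides when to speculate. The class name, the method names, and the launchAt and backupSeconds parameters are all hypothetical. It only illustrates the idea that a duplicate attempt on another node can finish before a straggler and so shorten the job.

```java
import java.util.Arrays;

// Toy model of speculative execution. Not Hadoop code; all names are hypothetical.
class SpeculationSketch {

    // Without speculation the job finishes when its slowest task finishes.
    static int jobTimeWithoutSpeculation(int[] taskSeconds) {
        return Arrays.stream(taskSeconds).max().orElse(0);
    }

    // With speculation: at time launchAt, every task that is still running gets
    // a duplicate attempt on another node that needs backupSeconds to complete.
    // Whichever attempt of a task finishes first determines that task's finish time.
    static int jobTimeWithSpeculation(int[] taskSeconds, int launchAt, int backupSeconds) {
        int jobTime = 0;
        for (int original : taskSeconds) {
            int finish = original;
            if (original > launchAt) {
                // The task is a straggler, so the duplicate races the original attempt.
                finish = Math.min(original, launchAt + backupSeconds);
            }
            jobTime = Math.max(jobTime, finish);
        }
        return jobTime;
    }
}
```

For example, with task times of 10, 11, 12 and 60 seconds, a duplicate launched at second 15 that needs 12 seconds brings the job time in this model down from 60 to 27 seconds. If the duplicate is slower than the original attempt, the original still wins and nothing is lost except the extra work.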

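On the splits question, the following sketch shows the data flow Hemanth describes: splits feed map tasks, and reduce tasks only ever see partitioned map output. It is a simplified in-memory illustration, not Hadoop code. SplitFlowSketch and reduceInputs are hypothetical names, and each split is modelled as a plain string of words. The partition formula is written in the style of Hadoop's default HashPartitioner. The sketch says nothing about whether the framework actually launches reduce tasks when there is no input; it only shows that, in this model, no splits means every reduce partition is empty.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of how data flows from splits to reduce tasks. Not Hadoop code.
class SplitFlowSketch {

    // Choose a reduce partition for a key (hash modulo the number of reduce tasks).
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // One "map task" per split: each word in the split is emitted as a key and
    // routed to a reduce partition. Returns the keys each reduce task would receive.
    static List<List<String>> reduceInputs(List<String> splits, int numReduceTasks) {
        List<List<String>> inputs = new ArrayList<>();
        for (int r = 0; r < numReduceTasks; r++) {
            inputs.add(new ArrayList<>());
        }
        for (String split : splits) {
            for (String word : split.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // an empty split produces no map output
                }
                inputs.get(partitionFor(word, numReduceTasks)).add(word);
            }
        }
        return inputs;
    }
}
```

Calling reduceInputs with an empty list of splits and 6 reduce tasks returns 6 empty lists: the reduce side is sized by the configured number of reduce tasks, not by the number of splits.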