hadoop-common-user mailing list archives

From cliff palmer <palmercl...@gmail.com>
Subject Re: Shuffle tasks getting killed
Date Thu, 23 Sep 2010 11:44:30 GMT
Aniket, I wonder whether these tasks were speculative execution attempts.
Have you been able to determine whether the job runs successfully?
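
One way to check is to look at the speculative execution flags in the job configuration. When speculation is on, the losing duplicate attempt is killed once the other attempt finishes, and it shows up as "Killed" in the UI without any exception. The sketch below is only illustrative: it assumes the Hadoop 0.20.x property names (mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution, both enabled by default), and the SAMPLE configuration and the speculative_settings helper are hypothetical, not taken from the cluster in question.

```python
import xml.etree.ElementTree as ET

# Property names assumed from Hadoop 0.20.x; both default to true when unset.
SPECULATIVE_KEYS = (
    "mapred.map.tasks.speculative.execution",
    "mapred.reduce.tasks.speculative.execution",
)


def speculative_settings(config_xml):
    """Return {property name: bool} for the speculative execution flags."""
    root = ET.fromstring(config_xml)
    found = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip().lower()
        if name in SPECULATIVE_KEYS:
            found[name] = (value == "true")
    # Anything not set explicitly falls back to the assumed default (enabled).
    return {key: found.get(key, True) for key in SPECULATIVE_KEYS}


# Hypothetical mapred-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, enabled in speculative_settings(SAMPLE).items():
        print(key, "=", enabled)
```

If the reduce flag turns out to be enabled, setting it to false for one run and comparing the number of killed reduce attempts would be a cheap way to confirm or rule out speculation as the cause.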

On Thu, Sep 23, 2010 at 12:52 AM, aniket ray <aniket.ray@gmail.com> wrote:

> Hi,
> I continuously run a series of batch jobs using Hadoop MapReduce. I also
> have a managing daemon that moves data around on HDFS, making way for
> more jobs to be run.
> I use the capacity scheduler to schedule many jobs in parallel.
> I see an issue on the Hadoop web monitoring UI at port 50030 which I
> believe may be causing a performance bottleneck, and I wanted to get more
> information.
> Approximately 10% of the reduce tasks show up as "Killed" in the UI. The
> logs say that the killed tasks are in the shuffle phase when they are
> killed, but the logs don't show any exception.
> My understanding is that these killed tasks would be started again and that
> this slows down the whole Hadoop job.
> I was wondering what the possible causes may be and how to debug this.
> I have tried both Hadoop 0.20.2 and the latest version of Hadoop from
> Yahoo's GitHub.
> I've monitored the nodes and there is a lot of free disk space and memory
> on all nodes (more than 1 TB free disk and 5 GB free memory at all times
> on all nodes).
> Since there are no exceptions or any other visible issues, I am finding it
> hard to figure out what the problem might be. Could anybody help?
> Thanks,
> -aniket
