hadoop-common-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject Re: Too many fetch errors
Date Tue, 07 Apr 2009 18:21:19 GMT
Xiaolin,

Are you certain that the two nodes can fetch mapper outputs from one
another? If it's taking that long to complete, it might be the case that
what makes it "complete" is just that eventually it abandons one of your two
nodes and runs everything on a single node where it succeeds -- defeating
the point, of course.

Might there be a firewall between the two nodes that blocks the port used by
the reducer to fetch the mapper outputs? (I think this is on 50060 by
default.)
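For reference, the port in question is the TaskTracker's embedded HTTP server, which serves map outputs to reducers. In Hadoop of that era it is controlled by the `mapred.task.tracker.http.address` property; the fragment below is only a sketch making the default explicit, not a setting you normally need to change.

```xml
<!-- mapred-site.xml: address of the TaskTracker HTTP server that
     reducers fetch map output from. 0.0.0.0:50060 is the default;
     shown here only to make the port explicit. -->
<property>
  <name>mapred.task.tracker.http.address</name>
  <value>0.0.0.0:50060</value>
</property>
```

If a firewall sits between the two nodes, TCP 50060 (or whatever this property is set to) must be reachable in both directions; running `telnet slave 50060` from the other node is a quick sanity check (the hostname `slave` here is just a placeholder for your actual node name).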

- Aaron

On Tue, Apr 7, 2009 at 8:08 AM, xiaolin guo <xiaolin@hulu.com> wrote:

> This simple map-reduce application takes nearly 1 hour to finish running
> on the two-node cluster, due to lots of Failed/Killed task attempts, while
> on the single-node cluster it only takes 1 minute ... I am quite confused
> about why there are so many Failed/Killed attempts ..
>
> On Tue, Apr 7, 2009 at 10:40 PM, xiaolin guo <xiaolin@hulu.com> wrote:
>
> > I am trying to set up a small Hadoop cluster. Everything was OK before I
> > moved from a single-node cluster to a two-node cluster. I followed the
> > article
> > http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
> > to configure the master and slaves. However, when I tried to run the
> > example wordcount map-reduce application, the reduce task got stuck at
> > 19% for a long time. Then I got a notice: "INFO mapred.JobClient: TaskId :
> > attempt_200904072219_0001_m_000002_0, Status : FAILED too many fetch
> > errors" and an error message: Error reading task outputslave.
> >
> > All map tasks on both nodes had finished, which I verified on the task
> > tracker pages.
> >
> > Both nodes work well in single-node mode, and the Hadoop file system
> > seems to be healthy in multi-node mode.
> >
> > Can anyone help me with this issue? I have been stuck on it for a long
> > time ...
> >
> > Thanks very much!
> >
>
