Date: Tue, 21 Jul 2009 13:05:56 +0530
Subject: Re: Too many fetch failures
From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
To: <common-user@hadoop.apache.org>

This error occurs when several reducers are unable to fetch a given map
output (attempt_200907202331_0001_m_000001_0 in your example).
I am guessing that there is a configuration issue in your setup: the
reducers are not able to contact the TaskTracker (TT) that served the map,
or cannot transfer map outputs from it. The TT log on the node where the
map ran could shed some light on the error. Could you verify that every
node in your cluster is able to connect to the others? You could also
manually log in to the reducer node and try pulling the map output
yourself to see what error you get.
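
A rough sketch of that connectivity check, in case it helps. It assumes the
TaskTracker HTTP server listens on port 50060 (the usual default for
mapred.task.tracker.http.address; check your hadoop-site.xml), and the
hostnames are placeholders. The map output URL layout in map_output_url is
my assumption about how reducers address the TaskTracker, not something
confirmed for your version, so treat it as a starting point only.

```python
import socket
from urllib.parse import urlencode

# Assumption: 50060 is the TaskTracker HTTP port (usual default for
# mapred.task.tracker.http.address). Adjust to match your configuration.
TASKTRACKER_HTTP_PORT = 50060


def can_connect(host, port=TASKTRACKER_HTTP_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostname lookup failures.
        return False


def check_nodes(hosts, port=TASKTRACKER_HTTP_PORT, timeout=5.0):
    """Map each hostname to whether its TaskTracker port is reachable."""
    return {host: can_connect(host, port, timeout) for host in hosts}


def map_output_url(host, job_id, map_attempt_id, reduce_partition,
                   port=TASKTRACKER_HTTP_PORT):
    """Build a URL for pulling one map output by hand.

    Assumption: the TaskTracker serves map outputs at /mapOutput with
    job, map, and reduce query parameters. Verify against your TT log.
    """
    query = urlencode({"job": job_id,
                       "map": map_attempt_id,
                       "reduce": reduce_partition})
    return "http://%s:%d/mapOutput?%s" % (host, port, query)


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the entries in conf/slaves.
    nodes = ["node1", "node2"]
    for host, ok in check_nodes(nodes).items():
        print(host, "reachable" if ok else "NOT reachable")
    print(map_output_url("node1", "job_200907202331_0001",
                         "attempt_200907202331_0001_m_000001_0", 0))
```

Run it from each reducer node in turn. A node that shows up as not reachable
from some hosts but reachable from others usually points to a hostname
resolution (/etc/hosts) or firewall problem rather than to Hadoop itself.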

Cheers
Jothi

On 7/21/09 12:33 PM, "George Pang" <p0941p@gmail.com> wrote:

> Hi users,
> 
> I got this "Too many fetch failures" in the following error message:
> 
> 09/07/20 23:33:39 INFO mapred.JobClient:  map 100% reduce 16%
> 09/07/20 23:46:22 INFO mapred.JobClient: Task Id :
> attempt_200907202331_0001_m_000001_0, Status : FAILED
> Too many fetch-failures
> 09/07/20 23:46:37 INFO mapred.JobClient: Job complete: job_200907202331_0001
> 
> Don't know why it always stops at reduce 16% then assumes.  It takes a long
> time even to run a small task.
> 
> I saw people asking the same question in previous mail list, but I don't get
> the help I need.
> 
> Hadoop version:  0.18.3
> Ubuntu version:  8.04
> 
> Thank you in advance!
> 
> George