hadoop-common-user mailing list archives

From Joe Stein <charmal...@allthingshadoop.com>
Subject Re: Too many fetch-failures
Date Mon, 27 Sep 2010 19:01:22 GMT
I have seen this before when the hosts are not set up so that every datanode can contact every other
datanode by its registered hostname. Typically, when adding a new node (or during initial cluster
setup), make sure the internal DNS is updated, or that the hosts file on every datanode is updated
with the hostname and IP of each datanode.
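As a quick sanity check, you could run something like the following small script on each node to confirm that every datanode hostname actually resolves there. This is only a sketch; the hostname list is hypothetical and would come from your own slaves file or /etc/hosts entries.

    import socket

    # Hypothetical list of datanode hostnames -- replace with the names
    # registered in your slaves file / DNS / /etc/hosts.
    datanodes = ["datanode01", "datanode02", "datanode03"]

    for host in datanodes:
        try:
            # Forward lookup by hostname, the same way the nodes address each other.
            ip = socket.gethostbyname(host)
            print("%s resolves to %s" % (host, ip))
        except socket.gaierror:
            print("%s does NOT resolve -- fix DNS or /etc/hosts" % host)

Any hostname that fails to resolve on any node is a candidate for the fetch failures described below.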

Since one or more datanodes cannot contact each other (maybe your situation has another cause
than the one I gave above), the fetch fails whenever a mapper needs data that is not local, or
during the reduce phase when a node cannot contact the datanodes it is trying to pull data
from.

Joe Stein, 973-944-0094
Twitter: @allthingshadoop

On Sep 27, 2010, at 2:50 PM, Pramy Bhats <pramybhats@googlemail.com> wrote:

> Hello,
> I am trying to run a bigram count on a 12-node cluster setup. For an input
> file of 135 splits (around 7.5 GB), the job fails on some of the runs.
> The error I get on the jobtracker is that, out of 135 mappers, 1 mapper
> fails because of
> "Too many fetch-failures
> Too many fetch-failures
> Too many fetch-failures
> Too many fetch-failures "
> As a result of this mapper failure, the whole job fails -- the reducers
> that were making progress also stall.
> Could anyone please help with solving this error?
> Thanks in advance.
