hadoop-common-user mailing list archives

From Sharad Agarwal <sharad.apa...@gmail.com>
Subject Re: conf.setMaxMapAttempts, SkipBadRecords, etc.
Date Sat, 25 Dec 2010 12:55:54 GMT
From the description, it looks like you are unable to set the max map
attempts to 1. This is completely different from the skip bad records feature.
The skip bad records feature lets you rerun the same task while SKIPPING the
records at which the last attempt failed.

If you are fine with none of a failing mapper's input records being
processed, then you don't need the skip records feature. You just need to
investigate why setMaxMapAttempts doesn't work for you.
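
To make the distinction concrete, here is a toy model in plain Java. It is
only a sketch: it does not use Hadoop and does not reproduce how the framework
implements either setting (the real bookkeeping is more involved). The names
`runTask` and `map` and the sample records are hypothetical, made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SkipVsSingleAttempt {

    /** Hypothetical map function that fails deterministically on some inputs. */
    static String map(String record) {
        if (record.startsWith("BAD")) {
            throw new IllegalStateException("cannot process " + record);
        }
        return record.toUpperCase();
    }

    /**
     * Runs one toy "map task" over its records.
     *
     * @param maxAttempts how many times the task may be started
     * @param skipping    if true, a record that crashed an attempt is
     *                    skipped on every later attempt
     * @return output of the first attempt that completes, or an empty
     *         list if every attempt fails
     */
    static List<String> runTask(List<String> records, int maxAttempts, boolean skipping) {
        Set<Integer> skipped = new HashSet<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<String> output = new ArrayList<>();
            int i = 0;
            try {
                for (; i < records.size(); i++) {
                    if (skipped.contains(i)) {
                        continue; // known-bad record, skip it this time
                    }
                    output.add(map(records.get(i)));
                }
                return output; // attempt completed
            } catch (IllegalStateException e) {
                // The attempt failed at index i; its partial output is discarded.
                if (skipping) {
                    skipped.add(i);
                }
            }
        }
        return new ArrayList<>(); // task failed on every attempt
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "BAD-1", "b", "c");
        // One attempt, no skipping: the task fails and none of its records produce output.
        System.out.println(runTask(records, 1, false)); // prints []
        // Two attempts with skipping: the second attempt skips the bad record.
        System.out.println(runTask(records, 2, true));  // prints [A, B, C]
    }
}
```

In this model, a single attempt without skipping loses the whole task's
output, while skipping needs at least one extra attempt but loses only the
failing record.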

On Fri, Dec 24, 2010 at 3:02 AM, Keith Wiley <kwiley@keithwiley.com> wrote:

> Let's say I want to ditch an input record the very first time it fails
> (because I know it is a deterministic data-dependent failure) instead of
> retrying it the default four times.  I have already experimented with
> conf.setMaxMapAttempts() with no success.  For example, consider the
> following:
> int maxBefore = conf.getMaxMapAttempts();  // returns 4, the default
> conf.setMaxMapAttempts(1);
> int maxAfter = conf.getMaxMapAttempts();   // returns 1
> Before calling conf.setMaxMapAttempts(1), getMaxMapAttempts() returns the
> default, 4, and after calling conf.setMaxMapAttempts(1), it returns 1.
>  However, despite that encouraging feedback, it doesn't work.  The Hadoop
> job still restarts each failed map task four times.  Furthermore, I have
> confirmed that the job.xml file on the job tracker has the following:
> mapred.map.max.attempts = 4
> ...which proves it really didn't change mapred.map.max.attempts!  I also
> added the following to my mapred-site.xml file:
> <property>
>    <name>mapred.map.max.attempts</name>
>    <value>1</value>
>    <final>true</final>
>    <description>Max map attempts.
>    </description>
> </property>
> When I do that, the initial call conf.getMaxMapAttempts() returns 1, not 4,
> just as expected...but nonetheless, the job.xml file on the job tracker
> reports that the value has reverted to 4 once again.  I have sought a
> solution to this problem for a long time and have decided that no one knows
> how to fix it (if you have any ideas PLEASE let me know), so I'm moving on
> to a different approach.  I am now trying the following:
> SkipBadRecords.setMapperMaxSkipRecords(conf, 1);
> SkipBadRecords.setAttemptsToStartSkipping(conf, 1);
> First, can anyone confirm that this is the correct set of calls to make
> SkipBadRecords skip a record after its first failure?
> Second, this doesn't work either!  My map tasks still restart four times.
> I'm really desperate on this and so far my research has turned up nothing.
>  I would greatly appreciate any help on this matter.
> Thank you.
> ________________________________________________________________________________
> Keith Wiley               kwiley@keithwiley.com
> www.keithwiley.com
> "Yet mark his perfect self-contentment, and hence learn his lesson, that to be
> self-contented is to be vile and ignorant, and that to aspire is better than to
> be blindly and impotently happy."
>  -- Edwin A. Abbott, Flatland
> ________________________________________________________________________________
