hadoop-common-user mailing list archives

From Keith Wiley <kwi...@keithwiley.com>
Subject Re: Task fails: starts over with first input key?
Date Tue, 14 Dec 2010 18:03:58 GMT

On Dec 13, 2010, at 17:58 , li ping wrote:

> I think "*org.apache.hadoop.mapred.SkipBadRecords*" is what you are
> looking for.

Yes, I considered that at one point.  I don't like how it insists on iteratively retrying
the records.  I wish it would simply skip the failed records and move on: run through the
list of input records in order, skip the bad ones, send the good ones to the reducer, and
make no further attempts at processing.
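
To make the wished-for behavior concrete, here is a minimal sketch in plain Java. It is not Hadoop code and does not use SkipBadRecords; the class name SkipOnFailure, the Result holder, and the processAll method are all hypothetical names invented for illustration. It makes a single pass over the input, catches a failure on each record individually, skips that record, and never retries it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
public class SkipOnFailure {

    /** Outcome of one pass: outputs of good records plus indices of skipped ones. */
    public static final class Result<O> {
        public final List<O> outputs = new ArrayList<>();
        public final List<Integer> skipped = new ArrayList<>();
    }

    /**
     * Applies mapFn to every record exactly once, in order.
     * A record whose processing throws is recorded as skipped and never retried.
     */
    public static <I, O> Result<O> processAll(List<I> records, Function<I, O> mapFn) {
        Result<O> result = new Result<>();
        for (int i = 0; i < records.size(); i++) {
            try {
                result.outputs.add(mapFn.apply(records.get(i)));
            } catch (RuntimeException e) {
                // Bad record: remember its position and move on.
                result.skipped.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1", "2", "oops", "4");
        Result<Integer> r = processAll(input, s -> Integer.parseInt(s));
        System.out.println("outputs=" + r.outputs + " skipped=" + r.skipped);
    }
}
```

Note that this only works when the failure surfaces as an exception the per-record code can catch. If a bad record brings down the whole task process (for example, a crash in native code), there is nothing left running to catch it, so a sketch like this cannot help in that case.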

I'll read up on it again.  Perhaps I missed something.


Keith Wiley               kwiley@keithwiley.com               www.keithwiley.com

"What I primarily learned in grad school is how much I *don't* know.
Consequently, I left grad school with a higher ignorance to knowledge ratio than
when I entered."
  -- Keith Wiley
