hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Business logic in cleanup?
Date Fri, 18 Nov 2011 05:35:11 GMT

On Fri, Nov 18, 2011 at 10:44 AM, Something Something
<mailinglists19@gmail.com> wrote:
> Thanks for the reply.  Here's another concern we have.  Let's say Mapper has
> finished processing 1000 lines from the input file & then the machine goes
> down.  I believe Hadoop is smart enough to re-distribute the input split
> that was assigned to this Mapper, correct?  After re-assigning will it
> reprocess the 1000 lines that were processed successfully before & start
> from line 1001  OR  would it reprocess ALL lines?

Attempts of any task start afresh. That's the default nature of Hadoop.

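So the new attempt would begin from the start of the input split again
and hence reprocess ALL lines, including the 1000 that were already done.
Understand that cleanup is just an API call here, one that's invoked once
after the input reader has run out of records. It is not a separate "stage".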
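
To make the retry behaviour concrete, here is a minimal toy model in plain
Java. It is only a sketch and does not use any Hadoop classes: the names
ToyTask, runAttempt and runWithRetry are hypothetical, and the "machine
failure" is simulated with an exception. It assumes an attempt keeps all of
its state in memory, so a fresh attempt has nothing to resume from.

```java
import java.util.ArrayList;
import java.util.List;

class AttemptRetrySketch {

    /** Hypothetical stand-in for a map task; NOT a Hadoop class. */
    static class ToyTask {
        final List<String> processed = new ArrayList<>();
        boolean cleanupCalled = false;

        void map(String line) { processed.add(line); }

        // Called once, only after every record of the split has been read.
        void cleanup() { cleanupCalled = true; }
    }

    /**
     * Runs ONE attempt over the whole split. If failAfter is non-negative
     * and smaller than the split size, the attempt dies after that many
     * records and cleanup is never reached.
     */
    static ToyTask runAttempt(List<String> split, int failAfter) {
        ToyTask task = new ToyTask(); // fresh state, nothing carried over
        int count = 0;
        for (String line : split) {
            if (count == failAfter) {
                throw new IllegalStateException("simulated machine failure");
            }
            task.map(line);
            count++;
        }
        task.cleanup();
        return task;
    }

    /** Retries once after a failed first attempt, starting from record 0. */
    static ToyTask runWithRetry(List<String> split, int failAfterOnFirstAttempt) {
        try {
            return runAttempt(split, failAfterOnFirstAttempt);
        } catch (IllegalStateException e) {
            // The second attempt gets the same split and no saved progress.
            return runAttempt(split, -1);
        }
    }
}
```

In this model the failed attempt never reaches cleanup, and the retry
processes every line of the split again. So any business logic placed in
cleanup sees only the work of the attempt that actually ran to completion.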

Harsh J
