From: Harsh J
Date: Fri, 18 Nov 2011 11:05:11 +0530
Subject: Re: Business logic in cleanup?
To: mapreduce-user@hadoop.apache.org

Hello,

On Fri, Nov 18, 2011 at 10:44 AM, Something Something wrote:
> Thanks for the reply. Here's another concern we have. Let's say Mapper has
> finished processing 1000 lines from the input file & then the machine goes
> down. I believe Hadoop is smart enough to re-distribute the input split
> that was assigned to this Mapper, correct? After re-assigning will it
> reprocess the 1000 lines that were processed successfully before & start
> from line 1001 OR would it reprocess ALL lines?

Every attempt of a task starts afresh. That's the default nature of Hadoop.

So the new attempt would begin from the start of the input split again,
and hence reprocess ALL lines. Understand that cleanup is just a fancy
API call here, one that's called after the input reader completes - not
a "stage".

-- 
Harsh J
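
The restart behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Hadoop code: TaskAttemptSketch, SketchMapper, SimulatedCrash, runAttempt and runTask are hypothetical names made up for the example, and the "crash" is just an exception thrown after a chosen number of lines. Each attempt gets a fresh mapper object, calls setup, feeds it every record of the split, and only then calls cleanup.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskAttemptSketch {

    /** Imitates the machine going down part-way through a split. */
    static class SimulatedCrash extends RuntimeException {
        final int linesProcessed;
        SimulatedCrash(int linesProcessed) {
            super("crashed after " + linesProcessed + " lines");
            this.linesProcessed = linesProcessed;
        }
    }

    /** Hypothetical stand-in for a mapper; NOT Hadoop's Mapper class. */
    static class SketchMapper {
        int linesSeen = 0;             // per-attempt state, lost when the attempt dies
        boolean cleanupCalled = false;

        void setup() { linesSeen = 0; }
        void map(String line) { linesSeen++; }
        void cleanup() { cleanupCalled = true; }
    }

    /** Outcome of a task across all of its attempts. */
    static class Result {
        final int attempts;
        final int totalMapCalls;
        final int linesInFinalAttempt;
        final boolean cleanupRan;
        Result(int attempts, int totalMapCalls, int linesInFinalAttempt, boolean cleanupRan) {
            this.attempts = attempts;
            this.totalMapCalls = totalMapCalls;
            this.linesInFinalAttempt = linesInFinalAttempt;
            this.cleanupRan = cleanupRan;
        }
    }

    /** One attempt: setup, map every record, then cleanup. crashAfter < 0 means never crash. */
    static SketchMapper runAttempt(List<String> split, int crashAfter) {
        SketchMapper mapper = new SketchMapper();   // fresh object, no memory of earlier attempts
        mapper.setup();
        for (String line : split) {
            if (mapper.linesSeen == crashAfter) {
                throw new SimulatedCrash(mapper.linesSeen);  // cleanup is never reached
            }
            mapper.map(line);
        }
        mapper.cleanup();                           // only after the input is exhausted
        return mapper;
    }

    /** Re-runs the whole split until an attempt succeeds; only the first attempt crashes. */
    static Result runTask(List<String> split, int crashAfterOnFirstAttempt) {
        int attempts = 0;
        int totalMapCalls = 0;
        int crashAfter = crashAfterOnFirstAttempt;
        while (true) {
            attempts++;
            try {
                SketchMapper mapper = runAttempt(split, crashAfter);
                totalMapCalls += mapper.linesSeen;
                return new Result(attempts, totalMapCalls, mapper.linesSeen, mapper.cleanupCalled);
            } catch (SimulatedCrash crash) {
                totalMapCalls += crash.linesProcessed;
                crashAfter = -1;                    // assume the retry lands on a healthy machine
            }
        }
    }

    public static void main(String[] args) {
        List<String> split = new ArrayList<>();
        for (int i = 1; i <= 1500; i++) {
            split.add("line " + i);
        }
        Result r = runTask(split, 1000);
        System.out.println("attempts=" + r.attempts
                + " totalMapCalls=" + r.totalMapCalls
                + " linesInFinalAttempt=" + r.linesInFinalAttempt
                + " cleanupRan=" + r.cleanupRan);
    }
}
```

With a 1500-line split and a crash after 1000 lines, the second attempt sees all 1500 lines again, so map runs 2500 times in total and cleanup runs once, at the end of the successful attempt. Any business logic placed in cleanup therefore only ever sees the state built up by a single, complete pass over the split.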