From: Arun C Murthy
To: core-user@hadoop.apache.org
Subject: Re: Can I ignore some errors in map step?
Date: Wed, 3 Dec 2008 09:22:34 -0800

On Dec 3, 2008, at 5:49 AM, Zhou, Yunqing wrote:

> I'm running a job on 5TB of data, but it currently reports a
> checksum error in one block of the file. That causes a map task
> to fail, and then the whole job fails.
> Losing a single 64MB block will hardly affect the final result,
> so can I ignore some map task failures and continue with the reduce step?
>
> I'm using hadoop-0.18.2 with a replication factor of 1.
>

You can specify that your job can tolerate some percentage of failures:

http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)

http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)

Arun
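
Since both setters linked above take a whole-number percentage, it may help to work out what value a job of this size would need. The following is a rough sketch, not part of the original reply. It assumes one map task per 64MB block and reads 5TB as 5 * 2^40 bytes. The class `MapFailureBudget` and its methods are hypothetical names made up for this illustration, and how the framework itself compares failures against the limit is not stated in this thread.

```java
public class MapFailureBudget {

    /**
     * Smallest whole-number percentage p with failedTasks / totalTasks * 100 <= p.
     * Hypothetical helper for illustration only; not a Hadoop API.
     */
    public static int minPercent(long totalTasks, long failedTasks) {
        if (totalTasks <= 0 || failedTasks < 0 || failedTasks > totalTasks) {
            throw new IllegalArgumentException("need 0 <= failedTasks <= totalTasks, totalTasks > 0");
        }
        // Integer ceiling of (failedTasks * 100 / totalTasks).
        return (int) ((failedTasks * 100 + totalTasks - 1) / totalTasks);
    }

    /**
     * Largest failure count that is still within the given percentage of
     * totalTasks, assuming the limit is a plain percentage of all tasks.
     */
    public static long maxFailures(long totalTasks, int percent) {
        return totalTasks * percent / 100;
    }

    public static void main(String[] args) {
        long inputBytes = 5L * 1024 * 1024 * 1024 * 1024; // 5TB, read as 5 * 2^40 bytes (assumption)
        long blockBytes = 64L * 1024 * 1024;              // 64MB block size from the question
        long mapTasks = inputBytes / blockBytes;          // assumes one map task per block

        int percent = minPercent(mapTasks, 1);            // tolerate a single failed map task

        System.out.println("map tasks: " + mapTasks);
        System.out.println("smallest usable percent: " + percent);
        System.out.println("failures within that percent: " + maxFailures(mapTasks, percent));

        // In the job driver, the value would then be passed to the setter
        // named in the reply, for example:
        //   conf.setMaxMapTaskFailuresPercent(percent);
    }
}
```

Under these assumptions the job has 81920 map tasks, so one bad block is far below 1% of the input, and the smallest nonzero setting (1) would also cover several hundred further failures. That is worth keeping in mind when deciding whether the setting is acceptable for a given job.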