Date: Tue, 19 Oct 2010 18:44:52 -0400
Subject: How to stop a mapper within a map-reduce job when you detect bad input
From: ed
To: common-user@hadoop.apache.org

Hello,

I have a simple map-reduce job that reads in zipped files and converts them to LZO compression. Some of the files are not properly zipped, which causes Hadoop to throw a "java.io.EOFException: Unexpected end of input stream" error and makes the job fail.

Is there a way to catch this exception and tell Hadoop to just ignore the file and move on? I think the exception is being thrown by the class reading in the Gzip file and not by my mapper class. Is this correct? Is there a way to handle this type of error gracefully?

Thank you!

~Ed
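
For illustration only, the failure described above can be reproduced outside Hadoop with the plain JDK: a truncated gzip stream raises java.io.EOFException from inside the decompressing stream while it is being read, not from the code that consumes the lines. The sketch below is not Hadoop code; the class and method names (SkipTruncatedGzip, gzip, countLinesOrSkip) are hypothetical, and the JDK's message text ("Unexpected end of ZLIB input stream") differs from the Hadoop one quoted in the question. It only shows that catching the exception around the read loop lets the caller skip the bad input.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: plain JDK only, no Hadoop dependencies.
class SkipTruncatedGzip {

    /** Gzip-compresses a string in memory; stands in for a well-formed input file. */
    public static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Counts the lines in a gzip payload. Returns -1 when the payload is
     * truncated, so the caller can skip it instead of failing.
     */
    public static int countLinesOrSkip(byte[] gzipped) {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (EOFException e) {
            // Raised by the decompressing stream, not by the line-counting logic.
            return -1;
        } catch (IOException e) {
            // Any other I/O problem is still treated as a real failure.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] good = gzip("a\nb\nc\n");
        // Simulate an improperly compressed file by cutting the payload in half.
        byte[] truncated = Arrays.copyOf(good, good.length / 2);
        System.out.println("good file lines: " + countLinesOrSkip(good));
        System.out.println("truncated file:  " + countLinesOrSkip(truncated));
    }
}
```

In a Hadoop job the corresponding read loop runs in the record reader that feeds the mapper, so an equivalent catch would presumably have to live in a custom record reader or input format rather than in map() itself. That placement is an assumption here, not something confirmed in this message.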