From: Sharad Agarwal
Date: Wed, 30 Jun 2010 15:53:30 +0530
To: "common-user@hadoop.apache.org"
CC: "yhemanth@gmail.com"
Subject: Re: how to figure out the range of a split that failed?

edward choi wrote:
> Thanks for the quick response.
> I know the SkipBadRecords feature but unfortunately I cannot use it since I
> am running my job on Hadoop Streaming.
> I had asked if there were any way to use SkipBadRecords in Hadoop Streaming
> but never got an answer. I guess it is not possible at all.
> Thanks for your concern.
>
The SkipBadRecords feature can be used with streaming as well. Perhaps the
best example is the test case:
http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingBadRecords.java?view=markup

Sharad
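
For readers who want a starting point before opening the test case, here is a minimal sketch of a streaming mapper that reports each processed record back to the framework, which is what skipping mode relies on to narrow down a bad record. It is an illustration, not the code from the test case. The counter group and counter name are assumptions based on my recollection of the constants in org.apache.hadoop.mapred.SkipBadRecords, and process() is a hypothetical placeholder for the real per-record logic; please check both against your Hadoop version.

```python
import sys

# Assumed values: believed to match SkipBadRecords.COUNTER_GROUP and
# SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS; verify before relying on them.
COUNTER_GROUP = "SkippingTaskCounters"
MAP_PROCESSED = "MapProcessedRecords"


def process(record):
    """Hypothetical per-record logic; here it just upper-cases the record."""
    return record.upper()


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Emit one output line per input record and report it as processed."""
    for raw in lines:
        record = raw.rstrip("\n")
        out.write(process(record) + "\n")
        # Streaming counter protocol on stderr:
        #   reporter:counter:<group>,<counter>,<amount>
        err.write("reporter:counter:%s,%s,1\n" % (COUNTER_GROUP, MAP_PROCESSED))
        err.flush()
```

A real mapper script would end with a call to run_mapper(sys.stdin). On the job side, the settings I would expect to matter (again an assumption to confirm against the test case) are mapred.skip.attempts.to.start.skipping, mapred.skip.map.max.skip.records, and mapred.skip.map.auto.incr.proc.count=false, the last one because the script increments the processed-record counter itself.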