Date: Thu, 1 Jul 2010 17:22:50 -0400 (EDT)
From: "Ashutosh Chauhan (JIRA)"
To: pig-dev@hadoop.apache.org
Subject: [jira] Updated: (PIG-1449) RegExLoader hangs on lines that don't match the regular expression

     [ https://issues.apache.org/jira/browse/PIG-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1449:
----------------------------------

    Status: Patch Available  (was: Open)

Running through Hudson.

> RegExLoader hangs on lines that don't match the regular expression
> ------------------------------------------------------------------
>
>                 Key: PIG-1449
>                 URL: https://issues.apache.org/jira/browse/PIG-1449
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Justin Sanders
>            Priority: Minor
>         Attachments: PIG-1449-RegExLoaderInfiniteLoopFix.patch, RegExLoader.patch
>
>
> The 0.7.0 changes to RegExLoader introduced a bug: if a line doesn't match the regular expression, the code never leaves the while loop. Before 0.7.0, non-matching lines were skipped. As a result the mapper stops responding and times out with "Task attempt_X failed to report status for 600 seconds. Killing!".
> Here are the steps to recreate the bug:
> Create a text file in HDFS with the following lines:
> test1
> testA
> test2
> Run the following pig script:
> REGISTER /usr/local/pig/contrib/piggybank/java/piggybank.jar;
> test = LOAD '/path/to/test.txt' using org.apache.pig.piggybank.storage.MyRegExLoader('(test\\d)') AS (line);
> dump test;
> Expected result:
> (test1)
> (test2)
> Actual result:
> The job fails after the 600-second timeout while waiting on the mapper to complete. The mapper hangs at 33%: it processes the first line but gets stuck in the while loop on the second line.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
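
For readers following the issue, the skip-on-mismatch behaviour described above can be sketched in plain Java. This is not the code from either attached patch, nor the actual RegExLoader source. It is a minimal, self-contained illustration; the class and method names (SkipNonMatchingLines, nextMatch, loadAll) are hypothetical, and the use of Matcher.find() and a BufferedReader in place of Pig's record reader is an assumption. The point it shows is that the next line is read on every pass through the loop, so a non-matching line cannot be re-tested forever.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipNonMatchingLines {

    // Hypothetical helper, not part of Pig: returns capture group 1 of the
    // next line that matches the pattern, or null at end of input.
    static String nextMatch(BufferedReader reader, Pattern pattern) throws IOException {
        String line;
        // The read happens in the loop condition, so every iteration consumes
        // a new line. A line that does not match is simply skipped.
        while ((line = reader.readLine()) != null) {
            Matcher m = pattern.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null; // input exhausted
    }

    // Collects all matches from the given text, one input line at a time.
    static List<String> loadAll(String text, String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> results = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String match;
            while ((match = nextMatch(reader, pattern)) != null) {
                results.add(match);
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Same three lines and regular expression as the reproduction steps.
        System.out.println(loadAll("test1\ntestA\ntest2\n", "(test\\d)"));
        // prints [test1, test2]
    }
}
```

With the input from the reproduction steps, the testA line is read, fails to match, and is dropped, which corresponds to the pre-0.7.0 behaviour the reporter describes.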