From: "Ning Zhang (JIRA)"
To: hive-dev@hadoop.apache.org
Date: Wed, 6 Jan 2010 20:35:54 +0000 (UTC)
Subject: [jira] Updated: (HIVE-988) mapjoin should throw an error if the input is too large
Message-ID: <351016255.77781262810154800.JavaMail.jira@brutus.apache.org>

     [ https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-988:
----------------------------

    Attachment: HIVE-988_4.patch

Following offline discussions with Namit, here are the new changes:

1) Make Operator.fatalError a static variable so that all operators share it.
2) Change Operator.getDone() to check fatalError as well.
3) Change ExecMapper.map() to check operator.getDone() and exit early if it returns true.
4) Change ExecDriver to hold a success variable; ExecDriver.progress will set its status rather than getting it from RunningJob.isSuccessful(). This covers the case where the counter was incremented but the RunningJob finished without the counter having been checked.

> mapjoin should throw an error if the input is too large
> -------------------------------------------------------
>
>                 Key: HIVE-988
>                 URL: https://issues.apache.org/jira/browse/HIVE-988
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Ning Zhang
>             Fix For: 0.5.0
>
>         Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, HIVE-988_4.patch
>
>
> If the input to the map join is larger than a specific threshold, it may lead to a very slow execution of the join.
> It is better to throw an error, and let the user redo his query as a non map-join query.
> However, the current map-reduce framework will retry the mapper 4 times before actually killing the job.
> Based on an offline discussion with Dhruba, Ning and myself, we came up with the following algorithm:
> Keep a threshold in the mapper for the number of rows to be processed for map-join. If the number of rows
> exceeds that threshold, set a counter and kill that mapper.
> The client (ExecDriver) monitors that job continuously - if this counter is set, it kills the job and also
> shows an appropriate error message to the user, so that he can retry the query without the map join.
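
For readers following the thread, the control flow described in this message can be sketched roughly as below. This is only a toy, single-process simulation in Python, not the patch itself and not Hive or Hadoop code: the names FatalFlag, Counters, MapJoinOperator, run_mapper and run_job are all hypothetical stand-ins for the shared fatalError flag, the job counter, the operator, ExecMapper.map() and the ExecDriver success check.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    # Hypothetical stand-in for the job counter the client watches.
    fatal_error: int = 0


class FatalFlag:
    # Class-level attribute, shared by every operator (like a static field).
    raised = False


class MapJoinOperator:
    def __init__(self, max_rows, counters):
        self.max_rows = max_rows
        self.counters = counters
        self.rows_seen = 0

    def done(self):
        # The "done" check also reports the shared fatal-error flag.
        return FatalFlag.raised

    def process(self, row):
        self.rows_seen += 1
        if self.rows_seen > self.max_rows:
            # Threshold exceeded: raise the shared flag and bump the counter.
            FatalFlag.raised = True
            self.counters.fatal_error += 1


def run_mapper(rows, op):
    # Stop feeding rows as soon as the operator reports it is done.
    processed = 0
    for row in rows:
        if op.done():
            break
        op.process(row)
        processed += 1
    return processed


def run_job(rows, max_rows):
    # The driver derives success from the counter it observed itself,
    # rather than from the job's own completion status.
    FatalFlag.raised = False
    counters = Counters()
    op = MapJoinOperator(max_rows, counters)
    processed = run_mapper(rows, op)
    success = counters.fatal_error == 0
    return success, processed
```

In this sketch, run_job(range(10), 3) reports failure after the fourth row trips the threshold, while run_job(range(10), 100) processes all ten rows and succeeds. A real client would poll the counter while the job is running and kill the job on failure; the simulation only checks it once at the end.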