Return-Path: X-Original-To: apmail-hive-dev-archive@www.apache.org Delivered-To: apmail-hive-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 810A3EF75 for ; Wed, 27 Feb 2013 17:01:17 +0000 (UTC) Received: (qmail 20070 invoked by uid 500); 27 Feb 2013 17:01:16 -0000 Delivered-To: apmail-hive-dev-archive@hive.apache.org Received: (qmail 20002 invoked by uid 500); 27 Feb 2013 17:01:16 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 19994 invoked by uid 500); 27 Feb 2013 17:01:16 -0000 Delivered-To: apmail-hadoop-hive-dev@hadoop.apache.org Received: (qmail 19973 invoked by uid 99); 27 Feb 2013 17:01:16 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 27 Feb 2013 17:01:16 +0000 Date: Wed, 27 Feb 2013 17:01:16 +0000 (UTC) From: "Phabricator (JIRA)" To: hive-dev@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HIVE-4080) Add Lead & Lag UDAFs MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588515#comment-13588515 ] Phabricator commented on HIVE-4080: ----------------------------------- ashutoshc has requested changes to the revision "HIVE-4080 [jira] Add Lead & Lag UDAFs". Some comments. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:533 Use LEAD_FUNC_NAME and LAG_FUNC_NAME here to be consistent. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:871 This null check should be done outside of nesting if() block. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:875 I don't see a case where fInfo is != null but udafResolver could be. If so, than this null check is redundant and should be removed. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:890 We are already know that we are dealing with lead/lag udaf, it must be of type GenericUDAFResolver2.. no ? ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:866 I am wondering if this function can be rewritten as following: WindowFunctionInfo finfo = windowFunctions.get(name.toLowerCase()); if (finfo == null) { return null;} f ( !name.toLowerCase().equals(LEAD_FUNC_NAME) && !name.toLowerCase().equals(LAG_FUNC_NAME) ) { return getGenericUDAFEvaluator(name, argumentOIs, isDistinct, isAllColumns); } // this must be lead/lag UDAF GenericUDAFResolver udafResolver = finfo.getfInfo().getGenericUDAFResolver(); GenericUDAFParameterInfo paramInfo = new SimpleGenericUDAFParameterInfo( argumentOIs.toArray(), isDistinct, isAllColumns); return ((GenericUDAFResolver2) udafResolver).getEvaluator(paramInfo); If not, than I have specific questions. See next few comments. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:1467 It will be good to comments for this new boolean registerAsUDAF. Something like following There are certain UDAFs like lead/lag which we want as windowing functions, but don't want them to appear in mFunctions. Why? Because.... ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:29 This and GenericUDAFLead share lots of common code. It might be good to have an abstract class for these two, just the way you have it in GenericUDFLeadLag. ql/src/test/queries/clientpositive/leadlag_queries.q:20 I think currently we don't support over clause on expressions. Once we do, it will be good to add test like: select p_retailprice - lead (p_retail) over (partition by p_mfgr) from part; ql/src/test/queries/clientpositive/leadlag_queries.q:35 It will be good to add a test which has both lead and lag in same query. REVISION DETAIL https://reviews.facebook.net/D8961 BRANCH HIVE-4080 ARCANIST PROJECT hive To: JIRA, ashutoshc, hbutani > Add Lead & Lag UDAFs > -------------------- > > Key: HIVE-4080 > URL: https://issues.apache.org/jira/browse/HIVE-4080 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing > Reporter: Harish Butani > Assignee: Harish Butani > Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch > > > Currently we support Lead/Lag as navigation UDFs usable with Windowing. > To be standard compliant we need to support Lead & Lag UDAFs. > Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. > Currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed. Causes ambiguities when Query contains different partition clauses. Will provide more details with associated Jira to remove this feature. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira