Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3F829200CA8 for ; Thu, 1 Jun 2017 00:53:09 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 3DF2A160BDB; Wed, 31 May 2017 22:53:09 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 8401E160BCB for ; Thu, 1 Jun 2017 00:53:08 +0200 (CEST) Received: (qmail 46631 invoked by uid 500); 31 May 2017 22:53:07 -0000 Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list dev@phoenix.apache.org Received: (qmail 46619 invoked by uid 99); 31 May 2017 22:53:07 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 31 May 2017 22:53:07 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 3C649C0C0B for ; Wed, 31 May 2017 22:53:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -100.002 X-Spam-Level: X-Spam-Status: No, score=-100.002 tagged_above=-999 required=6.31 tests=[RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id otaF9FcpkVEc for ; Wed, 31 May 2017 22:53:06 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 36DFF5F659 for ; Wed, 31 May 2017 22:53:05 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 66A14E0373 for ; Wed, 31 May 2017 22:53:04 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 1A49F21B55 for ; Wed, 31 May 2017 22:53:04 +0000 (UTC) Date: Wed, 31 May 2017 22:53:04 +0000 (UTC) From: "James Taylor (JIRA)" To: dev@phoenix.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (PHOENIX-3773) Implement FIRST_VALUES aggregate function MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Wed, 31 May 2017 22:53:09 -0000 [ https://issues.apache.org/jira/browse/PHOENIX-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032152#comment-16032152 ] James Taylor commented on PHOENIX-3773: --------------------------------------- Thanks for the updated patch, [~singamteja]. It's looking very good. Here's some feedback: Instead of deserializing the byte arrays to objects (a relatively expensive operation) in the evaluate method of FirstLastValueBaseClientAggregator, store the bytes pointer in a List. {code} + if (getMultipleValues) { + multipleValuesResult.add(PDataType.fromTypeId(dataType.getSqlType() - PDataType.ARRAY_TYPE_BASE).toObject(it.next())); + if (++counter == offset) { + break; + } + } else { {code} Then, once you have all the values, combine them together using the PArrayDataType.appendItemToArray() method outside the loop. {code} - //not enought values to return Nth + if (getMultipleValues && multipleValuesResult.size() > 0) { + ptr.set(dataType.toBytes(PArrayDataType.instantiatePhoenixArray(PDataType.fromTypeId(dataType.getSqlType() - PDataType.ARRAY_TYPE_BASE), multipleValuesResult.toArray()))); + return true; + } + + //not enough values to return Nth or top N'values or trying to retreive (N+1)th value {code} I think the handling of the case in which there are no values may not be handled correctly. You need to move this check to the top of the aggregate method: {code} if (topValue == null) { return false; } {code} And this would be the only case that would return false. Otherwise, if there aren't enough values, do this: {code} ptr.set(ByteUtil.EMPTY_BYTE_ARRAY); return true; {code} Probably a good idea to have a test that asks for the top 3 values when there are only 2 values to make sure that case works too (if you don't have that already). Also, one minor nit. How about instead of calling the new member variable getMultipleValues you call it isArrayReturnType? > Implement FIRST_VALUES aggregate function > ----------------------------------------- > > Key: PHOENIX-3773 > URL: https://issues.apache.org/jira/browse/PHOENIX-3773 > Project: Phoenix > Issue Type: New Feature > Reporter: James Taylor > Assignee: Loknath Priyatham Teja Singamsetty > Labels: SFDC > Fix For: 4.11.0 > > Attachments: PHOENIX-3773_4.x-HBase-0.98.patch, PHOENIX-3773_master.patch, PHOENIX-3773.patch, PHOENIX-3773.v2.patch, PHOENIX-3773.v3.patch > > > Similar to FIRST_VALUE, but would allow the user to specify how many values to keep. This could use a MinMaxPriorityQueue under the covers and be much more efficient than using multiple NTH_VALUE calls to do the same like this: > {code} > SELECT entity_id, > NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as nth1_user_id, > NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as nth2_user_id, > NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as nth3_user_id, > count(*) > FROM MY_TABLE > WHERE tenant_id='00Dx0000000XXXX' > AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE') > GROUP BY entity_id; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)