phoenix-dev mailing list archives

From "Loknath Priyatham Teja Singamsetty (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-3773) Implement FIRST_VALUES aggregate function
Date Fri, 02 Jun 2017 19:51:04 GMT


Loknath Priyatham Teja Singamsetty  commented on PHOENIX-3773:

bq. You might want to check out ArrayAppendFunctionIT which exercises ARRAY_APPEND 
Thanks for the pointer, James. I was looking into the same and was understanding how things were

[~jamestaylor] Looks like I found the reason. PArrayDataType.appendItemToArray can only be
used when you already have an array serialized to bytes with at least one element in it. We
cannot leverage it without a pre-constructed array.

In our case, the requirement is to convert multiple <T>PDataType values into a single <T>PArrayDataType.
There is no util method that can construct the array from scratch when given the elements one
by one.

We have to perform serialization/deserialization for one element in order to construct the
array, after which we can use PArrayDataType.appendItemToArray. This would save the serialization/deserialization
cost for the rest of the items in the FIRST_VALUES array result.
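
To illustrate the seed-then-append idea, here is a minimal, self-contained sketch. It assumes a made-up length-prefixed layout ([count][len][bytes]...) and hypothetical helper names (seedArray, appendItem, decode). It is not Phoenix's actual array serialization format and does not call PArrayDataType.appendItemToArray; it only shows that once the first element has been wrapped into an array blob, later elements can be appended as raw bytes without decoding the existing ones.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical layout [int count][int len][bytes][int len][bytes]...
// This is NOT the Phoenix array format; all names here are illustrative.
class ArrayFromScratchSketch {

    // Build a one-element array blob from the first element's bytes.
    static byte[] seedArray(byte[] firstElement) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + firstElement.length);
        buf.putInt(1).putInt(firstElement.length).put(firstElement);
        return buf.array();
    }

    // Append one element to an existing blob. Existing elements are copied
    // as raw bytes; only the leading count is read and rewritten.
    static byte[] appendItem(byte[] arrayBytes, byte[] element) {
        int count = ByteBuffer.wrap(arrayBytes).getInt();
        ByteBuffer buf = ByteBuffer.allocate(arrayBytes.length + 4 + element.length);
        buf.putInt(count + 1);
        buf.put(arrayBytes, 4, arrayBytes.length - 4);
        buf.putInt(element.length).put(element);
        return buf.array();
    }

    // Decode the blob back into strings, used here only to check the result.
    static List<String> decode(byte[] arrayBytes) {
        ByteBuffer buf = ByteBuffer.wrap(arrayBytes);
        int count = buf.getInt();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] item = new byte[buf.getInt()];
            buf.get(item);
            out.add(new String(item, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] arr = seedArray("u1".getBytes(StandardCharsets.UTF_8));
        arr = appendItem(arr, "u2".getBytes(StandardCharsets.UTF_8));
        arr = appendItem(arr, "u3".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(arr));
    }
}
```

In this sketch only the first element goes through the array-construction step; every later element is appended to the already serialized bytes.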

Let me know if this approach is fine with you. 


> Implement FIRST_VALUES aggregate function
> -----------------------------------------
>                 Key: PHOENIX-3773
>                 URL:
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: Loknath Priyatham Teja Singamsetty 
>              Labels: SFDC
>             Fix For: 4.11.0
>         Attachments: PHOENIX-3773_4.x-HBase-0.98.patch, PHOENIX-3773_master.patch, PHOENIX-3773.patch, PHOENIX-3773.v2.patch, PHOENIX-3773.v3.patch
>
> Similar to FIRST_VALUE, but would allow the user to specify how many values to keep.
> This could use a MinMaxPriorityQueue under the covers and be much more efficient than using
> multiple NTH_VALUE calls to do the same like this:
> {code}
> SELECT entity_id,
>        NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as nth1_user_id,
>        NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as nth2_user_id,
>        NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as nth3_user_id,
>        count(*)
> WHERE tenant_id='00Dx0000000XXXX'
> AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE')
> GROUP BY entity_id;
> {code}
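
The issue description's idea of keeping only a fixed number of values in a priority queue can be sketched as below. This is an illustration under stated assumptions, not the patch's implementation: it uses java.util.PriorityQueue as a bounded min-heap in place of Guava's MinMaxPriorityQueue, and the Row class and firstValues method are hypothetical names. It keeps the n rows with the latest last_read_date in a single pass, which is the work the three NTH_VALUE calls above would otherwise repeat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch only: Row and firstValues are illustrative names, not Phoenix classes.
class FirstValuesSketch {

    static final class Row {
        final String userId;
        final long lastReadDate;

        Row(String userId, long lastReadDate) {
            this.userId = userId;
            this.lastReadDate = lastReadDate;
        }
    }

    // Return the user ids of the n rows with the largest lastReadDate,
    // ordered from latest to earliest (ORDER BY last_read_date DESC).
    static List<String> firstValues(List<Row> rows, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Comparator<Row> byDate = Comparator.comparingLong((Row r) -> r.lastReadDate);
        // Min-heap on date: the head is always the oldest row currently kept.
        PriorityQueue<Row> heap = new PriorityQueue<>(n, byDate);
        for (Row row : rows) {
            heap.offer(row);
            if (heap.size() > n) {
                heap.poll(); // evict the oldest so at most n rows are retained
            }
        }
        List<Row> kept = new ArrayList<>(heap);
        kept.sort(byDate.reversed());
        List<String> result = new ArrayList<>();
        for (Row r : kept) {
            result.add(r.userId);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("u1", 10L), new Row("u2", 40L),
                new Row("u3", 20L), new Row("u4", 30L));
        System.out.println(firstValues(rows, 3));
    }
}
```

Memory stays bounded by n per group, regardless of how many rows the group contains.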

This message was sent by Atlassian JIRA
