phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Loknath Priyatham Teja Singamsetty (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3773) Implement FIRST_VALUES aggregate function
Date Wed, 24 May 2017 08:19:04 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022503#comment-16022503
] 

Loknath Priyatham Teja Singamsetty  commented on PHOENIX-3773:
--------------------------------------------------------------

[~jamestaylor] 
bq. Have FIRST_VALUES return an ARRAY type so that you can return all values in a single row.
It's not going to work to change the semantics of SQL (it's pretty well established). A function
can't return multiple rows like that.

Gone through Oracle/SQL rank over and grouped_concat behaviours. Probably what you are suggesting
here is to implement Group_concat https://stackoverflow.com/questions/2129693/using-limit-within-group-by-to-get-n-results-per-group.

 Kindly help me with the expectation here.

TEST.TEST
||id||page_id||date||val||
|2|8|1|7|
|3|8|2|9|
|4|8|3|4|
|5|8|4|2|
|6|9|5|10|
|7|9|6|13|

For the above table with below queries,

a) FIRST_VALUES With Group By Clause
SELECT page_id, FIRST_VALUES(val, 2) WITHIN GROUP (ORDER BY dates DESC) as first_values FROM
TEST.TEST GROUP BY page_id

Expected Output?
-----------------------
||page_id||first_values||
|8|2,4|
|9|13,10|

b) FIRST_VALUES without group by
SELECT FIRST_VALUES(val, 2) as first_values WITHIN GROUP (ORDER BY dates DESC) as first_values
FROM TEST.TEST 
||first_values||
|13,10|

[~jamestaylor] Let me know if the above looks as expected behaviour.

 

> Implement FIRST_VALUES aggregate function
> -----------------------------------------
>
>                 Key: PHOENIX-3773
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3773
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: Loknath Priyatham Teja Singamsetty 
>              Labels: SFDC
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3773.patch, PHOENIX-3773.v2.patch
>
>
> Similar to FIRST_VALUE, but would allow the user to specify how many values to keep.
This could use a MinMaxPriorityQueue under the covers and be much more efficient than using
multiple NTH_VALUE calls to do the same like this:
> {code}
> SELECT entity_id,
>        NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as nth1_user_id,
>        NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as nth2_user_id,
>        NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as nth3_user_id,
>        count(*)
> FROM  MY_TABLE 
> WHERE tenant_id='00Dx0000000XXXX'
> AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE')
> GROUP BY entity_id;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message