hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
Date Sun, 28 Feb 2010 09:41:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839393#action_12839393 ]

Zheng Shao commented on HIVE-259:

> (1) I am not familiar with the exact definition of percentile function. Is the percentile()'s
> result must be a member of input data?
See the link above.

> (2) HashMap and ArrayList is used to copy and sort. Can we use tree map here? this is
> a small and can be ignored.
I think HashMap is better here. The number of "iterate" calls is usually much higher than
the number of unique values (the size of the HashMap), so using a HashMap keeps each
"iterate" call cheap. The unique values only need to be copied and sorted once at the end.
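
To illustrate the trade-off, here is a minimal sketch. It is not the code in the patch:
the class name DistinctCounter and the method sortedCounts() are hypothetical, and only
"iterate" mirrors a name used above. It assumes the aggregation keeps a count per distinct
value, which makes every "iterate" call an expected constant-time HashMap update, and
defers the sort to a single pass over the distinct values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the UDAF from HIVE-259.
final class DistinctCounter {
    // Distinct value -> number of times it has been seen.
    private final Map<Long, Long> counts = new HashMap<>();

    // Called once per input row. With a HashMap this is expected O(1);
    // a TreeMap would pay O(log n) here on every row.
    void iterate(long value) {
        counts.merge(value, 1L, Long::sum);
    }

    // Called once at the end. Copies the entries to a list and sorts them,
    // so the sort cost depends on the number of distinct values only.
    List<Map.Entry<Long, Long>> sortedCounts() {
        List<Map.Entry<Long, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.comparingByKey());
        return entries;
    }

    public static void main(String[] args) {
        DistinctCounter counter = new DistinctCounter();
        for (long v : new long[] {5, 1, 5, 3, 1, 5}) {
            counter.iterate(v);
        }
        // Six rows, but only three distinct values are sorted.
        for (Map.Entry<Long, Long> e : counter.sortedCounts()) {
            System.out.println(e.getKey() + " x" + e.getValue());
        }
    }
}
```

With many rows and few distinct values, almost all of the work is in "iterate", which is
where the HashMap is cheaper. If the data had mostly distinct values, the difference from a
TreeMap would be much smaller.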

> In the beginning of new test case, .. appears two times
Fixed in HIVE-259.5.patch

> Add PERCENTILE aggregate function
> ---------------------------------
>                 Key: HIVE-259
>                 URL: https://issues.apache.org/jira/browse/HIVE-259
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Jerome Boulon
>         Attachments: HIVE-259-2.patch, HIVE-259-3.patch, HIVE-259.1.patch, HIVE-259.4.patch,
>                      HIVE-259.5.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx
>
> Compute at least 25, 50, 75th percentiles

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
