hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6459) Change the precision/scale for intermediate sum result in the avg() udf
Date Thu, 20 Feb 2014 12:13:19 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906901#comment-13906901 ]

Hive QA commented on HIVE-6459:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629858/HIVE-6459.1.patch

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 5168 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_number_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgDecimalNegative
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1422/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1422/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629858

> Change the precision/scale for intermediate sum result in the avg() udf 
> ------------------------------------------------------------------------
>
>                 Key: HIVE-6459
>                 URL: https://issues.apache.org/jira/browse/HIVE-6459
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>    Affects Versions: 0.13.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-6459.1.patch, HIVE-6459.patch
>
>
> The avg() udf, when applied to a decimal column, sets the precision/scale of the intermediate
> sum field to (p+4, s+4), which is the same as the precision/scale of the avg() result. However,
> the additional scale increase is unnecessary, and data overflow may occur.
> The requested change is to set the precision/scale of the intermediate sum result to
> (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale.
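
To make the overflow concern concrete, here is a minimal sketch of the two typing rules from the description. It is not Hive's implementation: the function names are hypothetical, and it ignores any maximum-precision cap that Hive applies. The point it illustrates is that adding 4 to both precision and scale leaves the number of integer digits (precision minus scale) unchanged, while (p+10, s) adds 10 integer digits of headroom for the running sum.

```python
def current_sum_type(p, s):
    """Intermediate sum type described as the existing behavior: (p+4, s+4)."""
    return p + 4, s + 4


def proposed_sum_type(p, s):
    """Intermediate sum type requested in the issue: (p+10, s)."""
    return p + 10, s


def integer_digits(precision, scale):
    """Digits available to the left of the decimal point."""
    return precision - scale


def max_addends(p, s, sum_p, sum_s):
    """Count how many largest-magnitude decimal(p, s) values can be summed
    before the total no longer fits in decimal(sum_p, sum_s).

    Uses unscaled integers, so no rounding is involved.
    Assumes sum_s >= s, which holds for both rules above.
    """
    largest_input = (10 ** p - 1) * 10 ** (sum_s - s)  # rescaled to sum_s
    largest_sum = 10 ** sum_p - 1
    return largest_sum // largest_input


if __name__ == "__main__":
    p, s = 10, 2  # illustrative input type decimal(10, 2)
    for label, rule in (("current", current_sum_type),
                        ("proposed", proposed_sum_type)):
        sum_p, sum_s = rule(p, s)
        print(label, (sum_p, sum_s),
              "integer digits:", integer_digits(sum_p, sum_s),
              "max addends:", max_addends(p, s, sum_p, sum_s))
```

For an input of decimal(10, 2), the (p+4, s+4) rule gives decimal(14, 6) with 8 integer digits, the same as the input, so adding two largest-magnitude values already exceeds the type. The (p+10, s) rule gives decimal(20, 2) with 18 integer digits.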



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
