drill-issues mailing list archives

From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2438) Query on views with Avg on integer column returns wrong result
Date Tue, 17 Mar 2015 23:40:38 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366315#comment-14366315
] 

Venki Korukanti commented on DRILL-2438:
----------------------------------------

This doesn't look related to views not storing nullability. For some reason, an extra cast
is inserted that converts the result of the average to INTEGER (see the Project at 00-06
in the plan below).

{code}
00-01      Project(i_item_id=[$0], agg1=[$1])
00-02        SelectionVectorRemover
00-03          Limit(fetch=[5])
00-04            SelectionVectorRemover
00-05              TopN(limit=[5])
00-06                Project(i_item_id=[$0], agg1=[CAST(/(CastHigh(CASE(=($2, 0), null, $1)), $2)):INTEGER])
00-07                  HashAgg(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)])
00-08                    Project(i_item_id=[CASE(=(ITEM($0, 1), ''), null, CAST(ITEM($0, 1)):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary")], i_manufact_id=[CASE(=(ITEM($0, 13), ''), null, CAST(ITEM($0, 13)):INTEGER)])
00-09                      Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/hadoop/data/scale1/item.dat, numFiles=1, columns=[`columns`[1], `columns`[13]], files=[file:/Users/hadoop/data/scale1/item.dat]]])
{code}
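
To make the effect of that cast concrete, here is a minimal Python sketch of the arithmetic in the 00-06 Project: divide the sum by the count (null when the count is zero), then optionally cast to an integer. The function name and the sample sum and count are hypothetical, chosen only to reproduce averages like those in the report. Using `round()` is an assumption about the cast's rounding mode, based on the reported rows looking rounded to nearest; it is not taken from Drill's code.

```python
from typing import Optional, Union


def avg_as_planned(total: int, count: int, cast_to_int: bool) -> Optional[Union[int, float]]:
    """Model CASE(count = 0, null, sum) / count, with an optional integer cast."""
    if count == 0:
        # CASE(=($2, 0), null, $1): an empty group yields null.
        return None
    quotient = total / count
    # The extra CAST(...):INTEGER drops the fractional part of the average.
    # round() is an assumed rounding mode, not confirmed Drill behavior.
    return round(quotient) if cast_to_int else quotient


# Hypothetical group: three values summing to 457.
print(avg_as_planned(457, 3, cast_to_int=False))  # about 152.33, the expected average
print(avg_as_planned(457, 3, cast_to_int=True))   # 152, the integer seen with the extra cast
```

Without the cast, the same division keeps its fractional part, which is the behavior the reporter expected.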

> Query on views with Avg on integer column returns wrong result
> --------------------------------------------------------------
>
>                 Key: DRILL-2438
>                 URL: https://issues.apache.org/jira/browse/DRILL-2438
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.8.0
>            Reporter: Abhishek Girish
>            Assignee: Mehant Baid
>            Priority: Critical
>             Fix For: 0.9.0
>
>
> Git.Commit.ID: b3bdc27 (Mar 10)
> Average on an integer column returns an (inaccurate) integer value, instead of an (accurate) decimal value.
> *The following query returns wrong results:*
> {code:sql}
> > SELECT i_item_id, avg(i_manufact_id) agg1
> . . . . . . . . . . . . . . . . . > FROM item
> . . . . . . . . . . . . . . . . . > GROUP  BY i_item_id 
> . . . . . . . . . . . . . . . . . > ORDER  BY i_item_id
> . . . . . . . . . . . . . . . . . > LIMIT 5; 
> +------------------+-------+
> |    i_item_id     | agg1  |
> +------------------+-------+
> | AAAAAAAAAAABAAAA | 152   |
> | AAAAAAAAAAACAAAA | 187   |
> | AAAAAAAAAAAEAAAA | 251   |
> | AAAAAAAAAABAAAAA | 199   |
> | AAAAAAAAAABBAAAA | 636   |
> +------------------+-------+
> 5 rows selected (0.324 seconds)
> {code}
> *Postgres results:*
> {code:sql}
> # SELECT i_item_id, avg(i_manufact_id) agg1
> tpcds1_new-# FROM item
> tpcds1_new-# GROUP  BY i_item_id 
> tpcds1_new-# ORDER  BY i_item_id
> tpcds1_new-# LIMIT 5; 
>     i_item_id     |         agg1         
> ------------------+----------------------
>  AAAAAAAAAAABAAAA | 152.3333333333333333
>  AAAAAAAAAAACAAAA | 373.0000000000000000
>  AAAAAAAAAAAEAAAA | 251.0000000000000000
>  AAAAAAAAAABAAAAA | 198.6666666666666667
>  AAAAAAAAAABBAAAA | 636.0000000000000000
> (5 rows)
> {code}
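
Comparing the two result sets row by row shows which differences an integer cast alone could account for. The sketch below copies the rows from the Drill and Postgres outputs quoted above. The helper name and the 0.5 tolerance are assumptions, with the tolerance standing in for "explained by converting the average to an integer".

```python
# Rows copied from the Drill and Postgres outputs in the issue description.
DRILL_ROWS = {
    "AAAAAAAAAAABAAAA": 152,
    "AAAAAAAAAAACAAAA": 187,
    "AAAAAAAAAAAEAAAA": 251,
    "AAAAAAAAAABAAAAA": 199,
    "AAAAAAAAAABBAAAA": 636,
}
POSTGRES_ROWS = {
    "AAAAAAAAAAABAAAA": 152.3333333333333333,
    "AAAAAAAAAAACAAAA": 373.0,
    "AAAAAAAAAAAEAAAA": 251.0,
    "AAAAAAAAAABAAAAA": 198.6666666666666667,
    "AAAAAAAAAABBAAAA": 636.0,
}


def rows_beyond_tolerance(drill, postgres, tolerance=0.5):
    """Return keys whose values differ by more than an integer cast could explain."""
    return [key for key in drill if abs(drill[key] - postgres[key]) > tolerance]


print(rows_beyond_tolerance(DRILL_ROWS, POSTGRES_ROWS))  # ['AAAAAAAAAAACAAAA']
```

Under that assumed tolerance, four of the five rows differ by less than 0.5, while AAAAAAAAAAACAAAA (187 versus 373) does not; the quoted outputs alone do not say why.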



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
