phoenix-dev mailing list archives

From "Csaba Skrabak (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4139) select distinct with identical aggregations return weird values
Date Fri, 10 Nov 2017 09:51:01 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243646#comment-16243646 ]

Csaba Skrabak edited comment on PHOENIX-4139 at 11/10/17 9:50 AM:
------------------------------------------------------------------

In the ExpressionCompiler.wrapGroupByExpression(Expression) method, there is an indexOf call:
            int index = groupBy.getExpressions().indexOf(expression);

If the groupBy contains two equal expressions, each should get a different index in its
accessor. Instead, both receive the value returned by the indexOf call above (which by design
returns the index of the first matching element), and that value is used just a few lines below:

                RowKeyValueAccessor accessor = new RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
                expression = new RowKeyColumnExpression(expression, accessor, groupBy.getKeyExpressions().get(index).getDataType());
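
To illustrate the first-match behaviour in isolation, here is a minimal sketch that uses plain
Strings as stand-ins for Phoenix Expression objects that compare equal. The class name, the
helper method and the sample list contents are assumptions made for illustration only; they are
not taken from the Phoenix code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDuplicates {
    // Resolves every element through List.indexOf, the same lookup the quoted line performs.
    static List<Integer> firstMatchIndexes(List<String> expressions) {
        List<Integer> indexes = new ArrayList<>();
        for (String expression : expressions) {
            // indexOf compares with equals() and returns the first matching position.
            indexes.add(expressions.indexOf(expression));
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Hypothetical group-by list with two equal entries (assumed, not dumped from Phoenix).
        List<String> groupByExpressions =
                Arrays.asList("'harshit'", "TRIM(NAM)", "TRIM(NAM)");
        // The element at position 2 resolves to index 1, so both TRIM(NAM) entries share it.
        System.out.println(firstMatchIndexes(groupByExpressions)); // prints [0, 1, 1]
    }
}
```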

This makes me think that the GroupBy fields should use more powerful data structures than
Lists to store keyExpressions and expressions. But I'm not yet sure what the whole GroupBy
class is really used for, and I don't want to tinker with it and maybe break the design until
I understand it.
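
As a rough sketch of what such a structure might look like, the class below remembers every
position of each element and hands out the next unclaimed position on each lookup. This is a
hypothetical illustration only: OccurrenceAwareIndex and claim are invented names, Strings
stand in for Expression objects, and nothing here is taken from PHOENIX-4139.patch or from
the GroupBy class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OccurrenceAwareIndex<T> {
    // All positions at which each element occurs, in list order.
    private final Map<T, List<Integer>> positions = new HashMap<>();
    // How many times each element has been looked up so far.
    private final Map<T, Integer> lookups = new HashMap<>();

    OccurrenceAwareIndex(List<T> elements) {
        for (int i = 0; i < elements.size(); i++) {
            positions.computeIfAbsent(elements.get(i), k -> new ArrayList<>()).add(i);
        }
    }

    // Returns the position of the next unclaimed occurrence, or -1 if the element is absent.
    // Once every occurrence has been claimed, the last position is returned again.
    int claim(T element) {
        List<Integer> found = positions.get(element);
        if (found == null) {
            return -1;
        }
        int n = lookups.merge(element, 1, Integer::sum) - 1;
        return found.get(Math.min(n, found.size() - 1));
    }

    public static void main(String[] args) {
        OccurrenceAwareIndex<String> index = new OccurrenceAwareIndex<>(
                Arrays.asList("'harshit'", "TRIM(NAM)", "TRIM(NAM)"));
        System.out.println(index.claim("TRIM(NAM)")); // prints 1
        System.out.println(index.claim("TRIM(NAM)")); // prints 2
    }
}
```

The lookup is stateful, so it only helps if each select expression is resolved exactly once
and in order; whether that holds for wrapGroupByExpression is not something the comment
establishes.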

So I think the list is what we're doing a GROUP BY over, but its elements are wrong.


> select distinct with identical aggregations return weird values 
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4139
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4139
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: minicluster
>            Reporter: Csaba Skrabak
>            Assignee: Csaba Skrabak
>            Priority: Minor
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4139.patch
>
>
> From sme-hbase hipchat room:
> Pulkit Bhardwaj·10:31
> i'm seeing a weird issue with phoenix, appreciate some thoughts
> Created a simple table in phoenix
> {noformat}
> 0: jdbc:phoenix:> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT
> . . . . . . . . > constraint my_pk primary key (id));
> 0: jdbc:phoenix:> upsert into test_select (nam, address,id) values('pulkit','badaun',1);
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | pulkit  | badaun   | 1   |
> +---------+----------+-----+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", nam from test_select;
> +--------------+---------+
> | test_column  |   NAM   |
> +--------------+---------+
> | harshit      | pulkit  |
> +--------------+---------+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam) from test_select;
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | harshit      | pulkitpulkit  | pulkitpulkit  |
> +--------------+----------------+----------------+
> {noformat}
> When I apply a trim on the nam column and use it multiple times, the output has the cell data duplicated!
> {noformat}
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | harshit      | pulkitpulkitpulkit  | pulkitpulkitpulkit  | pulkitpulkitpulkit  |
> +--------------+-----------------------+-----------------------+-----------------------+
> {noformat}
> Wondering if someone has seen this before??
> One thing to note: if I remove the distinct 'harshit' as "test_column" part, the issue is not seen.
> {noformat}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | pulkit     | pulkit     | pulkit     |
> +------------+------------+------------+
> {noformat}


