phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dumindu Buddhika (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4167) Phoenix SELECT query returns duplicate data in the same varchar/char column if a trim() is applied on the column AND a distinct arbitrary column is generated in the query
Date Thu, 19 Oct 2017 04:34:01 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210552#comment-16210552
] 

Dumindu Buddhika commented on PHOENIX-4167:
-------------------------------------------

[~jamestaylor] This occurs when distinct is combined with repeated function (such as UPPER
or TRIM). Another example would be,

{code:sql}
select distinct  id, upper(nam), upper(nam) from test_select;
{code}

I debugged and  I think the problems is "hasSeparatorByte" in RowKeyValueAccessor has not
been set to true, therefore they can get the proper value and instead returns the whole thing
after the beginning. I could not see why this happens. What do you think?

> Phoenix SELECT query returns duplicate data in the same varchar/char column if a trim()
is applied on the column AND a distinct arbitrary column is generated in the query
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4167
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4167
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Pulkit Bhardwaj
>            Assignee: Dumindu Buddhika
>            Priority: Minor
>
> 1. Created a simple table in phoenix
> {code:sql}
> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT constraint my_pk
primary key (id));
> {code}
> 2. Insert a sample row
> {code:sql}
> upsert into test_select (nam, address,id) values('user','place',1);
> {code}
> 3. Confirm that the row is present
> {code:sql}
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | user  | place   | 1   |
> +---------+----------+-----+
> {code}
> 4. Now run the following query
> {code:sql}
> 0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam)
from test_select;
> This would generate the following output
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | arbitrary      | useruser  | useruser  |
> +--------------+----------------+----------------+
> {code}
> As we can see the output for the trim(name) which should have been 'user' is actually
printed as 'useruser'
> The concatenation to the string is actually the number of times the column is printed.
> The following
> {code:sql}
> 0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam),
trim(nam) from test_select;
> {code}
> Would generate
> {code:sql}
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |      
TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | arbitrary      | useruseruser  | useruseruser  | useruseruser  |
> +--------------+-----------------------+-----------------------+-----------------------+
> {code}
> A couple of things to notice
> 1. If I remove the —— distinct 'harshit' as "test_column" ——  The issue is
not seen
> {code:sql}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | user     | user     | user     |
> +------------+------------+------------+
> {code}
> 2. If I remove the trim() again the issue is not seen
> {code:sql}
> 0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column" ,nam, nam from test_select;
> +--------------+---------+---------+
> | test_column  |   NAM   |   NAM   |
> +--------------+---------+---------+
> | arbitrary    | user  | user  |
> +--------------+---------+---------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message