phoenix-dev mailing list archives

From "Ethan Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4283) Group By statement truncating BIGINTs
Date Sun, 15 Oct 2017 19:47:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205268#comment-16205268 ]

Ethan Wang edited comment on PHOENIX-4283 at 10/15/17 7:46 PM:
---------------------------------------------------------------

Yes, when the cutoff happens, "actualType" is DECIMAL and "this" is BIGINT. The "actualType"
is passed from RowKeyColumnExpression.fromType, which is DECIMAL in the nested GROUP BY case.
So, for comparison, in RowKeyColumnExpression:

 logic  :         type.coerceBytes(ptr, fromType);
*nested*:      BIGINT.coerceBytes(ptr, DECIMAL);
*normal*:     BIGINT.coerceBytes(ptr, BIGINT);

I think an issue may be that, in PLong, the coerceBytes() override truncates the pointer *regardless* of
the "actualType" before passing it to super.coerceBytes(). Therefore, the cutoff is always executed.

{code}
    @Override
    public void coerceBytes(ImmutableBytesWritable ptr, Object object, PDataType actualType,
            Integer maxLength, Integer scale, SortOrder actualModifier, Integer desiredMaxLength,
            Integer desiredScale, SortOrder expectedModifier) {
        // Decrease size of TIMESTAMP to size of LONG and continue coerce
        if (ptr.getLength() > getByteSize()) {
            ptr.set(ptr.get(), ptr.getOffset(), getByteSize());
        }
        super.coerceBytes(ptr, object, actualType, maxLength, scale, actualModifier, desiredMaxLength,
                desiredScale, expectedModifier);
    }
{code}
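
To illustrate why an unconditional cutoff loses digits, here is a minimal, self-contained sketch. It does not use Phoenix classes. The encoding below (one header byte followed by base-100 digits) is a toy model assumed purely for illustration, and Phoenix's real DECIMAL serialization differs in detail. The names TruncationSketch, SourceType, and coerceGuarded are hypothetical, and the guard shown is only one possible direction, not the actual fix.

```java
import java.math.BigInteger;
import java.util.Arrays;

class TruncationSketch {
    // Same width PLong reports for a BIGINT.
    static final int LONG_BYTE_SIZE = 8;

    // Hypothetical stand-in for the "actualType" argument.
    enum SourceType { BIGINT, DECIMAL, TIMESTAMP }

    // Toy encoding (an assumption, not Phoenix's format):
    // byte 0 = count of base-100 digits, then the digits, most significant first.
    static byte[] encode(BigInteger value) {
        String s = value.toString();
        if (s.length() % 2 != 0) {
            s = "0" + s; // pad to a whole number of digit pairs
        }
        int n = s.length() / 2;
        byte[] out = new byte[n + 1];
        out[0] = (byte) n;
        for (int i = 0; i < n; i++) {
            out[i + 1] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2));
        }
        return out;
    }

    // Decodes the toy format; digit bytes that were cut off read as zero.
    static BigInteger decode(byte[] bytes) {
        int n = bytes[0];
        BigInteger hundred = BigInteger.valueOf(100);
        BigInteger result = BigInteger.ZERO;
        for (int i = 0; i < n; i++) {
            int digit = (i + 1 < bytes.length) ? bytes[i + 1] : 0;
            result = result.multiply(hundred).add(BigInteger.valueOf(digit));
        }
        return result;
    }

    // Mirrors the unconditional cutoff in the override quoted above.
    static byte[] truncateToLongSize(byte[] bytes) {
        return bytes.length > LONG_BYTE_SIZE ? Arrays.copyOf(bytes, LONG_BYTE_SIZE) : bytes;
    }

    // Hypothetical guard: shrink only when the source really is a TIMESTAMP.
    static byte[] coerceGuarded(byte[] bytes, SourceType actualType) {
        return actualType == SourceType.TIMESTAMP ? truncateToLongSize(bytes) : bytes;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new BigInteger("4444444444444444444"));
        System.out.println(encoded.length);                      // 11
        System.out.println(decode(truncateToLongSize(encoded))); // 4444444444444000000
        System.out.println(decode(coerceGuarded(encoded, SourceType.DECIMAL))); // 4444444444444444444
    }
}
```

In this toy model, cutting the 11-byte value down to 8 bytes keeps only 13 significant digits, which happens to match the value in the report. That is consistent with the DECIMAL bytes being shortened before conversion, but it is not proof.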


was (Author: aertoria):
Yes, when the cutoff happens, "actualType" is DECIMAL and "this" is BIGINT. The "actualType"
is passed from RowKeyColumnExpression.fromType, which is DECIMAL in a nested GROUP BY. So, for comparison,
in RowKeyColumnExpression:

 logic  :         type.coerceBytes(ptr, fromType);
*nested*:      BIGINT.coerceBytes(ptr, DECIMAL);
*normal*:     BIGINT.coerceBytes(ptr, BIGINT);

I think the issue may be that, in PLong, the coerceBytes() override truncates the pointer *regardless* of
the "actualType" before passing it to super.coerceBytes(). Therefore, the cutoff is always executed.

{code}
    @Override
    public void coerceBytes(ImmutableBytesWritable ptr, Object object, PDataType actualType,
            Integer maxLength, Integer scale, SortOrder actualModifier, Integer desiredMaxLength,
            Integer desiredScale, SortOrder expectedModifier) {
        // Decrease size of TIMESTAMP to size of LONG and continue coerce
        if (ptr.getLength() > getByteSize()) {
            ptr.set(ptr.get(), ptr.getOffset(), getByteSize());
        }
        super.coerceBytes(ptr, object, actualType, maxLength, scale, actualModifier, desiredMaxLength,
                desiredScale, expectedModifier);
    }
{code}

> Group By statement truncating BIGINTs
> -------------------------------------
>
>                 Key: PHOENIX-4283
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4283
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Steven Sadowski
>            Assignee: Ethan Wang
>             Fix For: 4.12.1
>
>
> *Versions:*
> Phoenix 4.11.0
> HBase: 1.3.1
> (Amazon EMR: 5.8.0)
> *Steps to reproduce:*
> 1. From the `sqlline-thin.py` client, set up the following table:
> {code:sql}
> CREATE TABLE test_table (
>     a BIGINT NOT NULL, 
>     c BIGINT NOT NULL
>     CONSTRAINT PK PRIMARY KEY (a, c)
> );
> UPSERT INTO test_table(a,c) VALUES(4444444444444444444, 5555555555555555555);
> SELECT a FROM (SELECT a, c FROM test_table GROUP BY a, c) GROUP BY a, c;
> {code}
> *Expected Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444444444  |
> +----------------------+
> {code}
> *Actual Result:*
> {code:sql}
> +----------------------+
> |          A           |
> +----------------------+
> | 4444444444444000000  |
> +----------------------+
> {code}
> *Comments:*
> Having the two Group By statements together seems to truncate the last 6 or so digits of the final result. Removing the outer (or either) group by will produce the correct result.
> Please fix the Group by statement to not truncate the outer result's value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
