tajo-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-1283) ORDER BY with the first descending order causes wrong results
Date Tue, 03 Feb 2015 06:32:35 GMT

    [ https://issues.apache.org/jira/browse/TAJO-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302826#comment-14302826 ]

ASF GitHub Bot commented on TAJO-1283:
--------------------------------------

Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/364#issuecomment-72601391
  
    Hi @sirpkt, 
    Could you trigger this unit test?


> ORDER BY with the first descending order causes wrong results
> -------------------------------------------------------------
>
>                 Key: TAJO-1283
>                 URL: https://issues.apache.org/jira/browse/TAJO-1283
>             Project: Tajo
>          Issue Type: Bug
>          Components: distributed query plan, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Keuntae Park
>            Priority: Critical
>             Fix For: 0.10
>
>
> Each ORDER BY key can be specified in ascending or descending order.
> Recently, I found that ORDER BY with a descending first sort key produces wrong results.
> If the second key is in descending order, it works well. Other cases work correctly.
> {code}
> select l_orderkey, l_partkey from lineitem order by l_orderkey, l_partkey desc;
> l_orderkey,  l_partkey
> -------------------------------
> 1,  155190
> 1,  67310
> 1,  63700
> 1,  24027
> 1,  15635
> 1,  2132
> 2,  106170
> 3,  183095
> 3,  128449
> 3,  62143
> 3,  29380
> 3,  19036
> 3,  4297
> ...
> {code}
> But if the first sort key is in descending order, the result has a wrong number of rows and shows a wrong range part, as follows:
> {code}
> default> select l_orderkey, l_partkey from lineitem order by l_orderkey desc, l_partkey;
> l_orderkey,  l_partkey
> -------------------------------
> 3000000,  61045
> 3000000,  159113
> 3000000,  167695
> 3000000,  167904
> 3000000,  196339
> ...
> {code}
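
For reference, the ordering that `order by l_orderkey desc, l_partkey` is expected to produce can be sketched outside Tajo. This is only an illustration in Python, not Tajo code: the helper name `sort_desc_then_asc` is hypothetical, and the sample rows are a handful of (l_orderkey, l_partkey) pairs copied from the outputs quoted in this report.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (l_orderkey, l_partkey)


def sort_desc_then_asc(rows: List[Row]) -> List[Row]:
    """Order rows by l_orderkey DESC, then l_partkey ASC (hypothetical helper)."""
    # Negating the integer first key reverses its direction,
    # while the second key keeps its natural ascending order.
    return sorted(rows, key=lambda r: (-r[0], r[1]))


# A few rows taken from the outputs quoted in this report, deliberately shuffled.
sample: List[Row] = [
    (1, 155190),
    (3000000, 167695),
    (1, 63700),
    (3000000, 61045),
    (2, 106170),
    (1, 67310),
]

expected = sort_desc_then_asc(sample)
for orderkey, partkey in expected:
    print(f"{orderkey},  {partkey}")
```

Within each l_orderkey group the l_partkey values should ascend, and the groups themselves should appear from the largest l_orderkey to the smallest.
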
> According to my investigation, it seems to be related to an offset problem in RowFile or an index problem. The final result includes duplicated rows, and the final row was wrong, as follows:
> {code:title=part-02-000000-000}
> 3000000|61045
> 3000000|159113
> 3000000|167695
> 3000000|167904
> 3000000|196339
> 2999975|28334
> 2999975|194023
> 2999974|8020
> 2999974|124152
> 2999974|129921
> 2999974|139248
> 2999974|168914
> 2999974|187923
> 2999973|30533
> 2999973|36196
> ...
> 2919713|133486
> 2919713|195963
> 2919712|86257
> 2919712|94542
> 2919712|107370
> 2919712|166342 <- duplicated rows
> 2919712|178277
> ....
> 1|63700
> 1|67310
> 1|155190
> [EOF]
> {code}
> {code:title=part-02-000001-000}
> |96127                     <- looks wrong
> 6000000|32255
> 6000000|96127
> 5999975|6452
> 5999975|7272
> 5999975|37131
> ....
> ....
> 2919713|133486
> 2919713|195963
> 2919712|94542
> 2919712|107370
> 2919712|166342    <- duplicated rows
> [EOF]
> {code}
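
The two symptoms visible in the part files above (a row with an empty key, and rows that appear in both files) can be looked for mechanically. The following is a minimal sketch in Python, not part of Tajo: the function name `check_rows` is hypothetical, it assumes the `orderkey|partkey` text layout shown above, and it assumes that a repeated (l_orderkey, l_partkey) pair indicates a duplicated row, which need not hold for every data set. The sample lines are a short excerpt of the rows quoted above, with the elided rows left out.

```python
from typing import Iterable, List, Set, Tuple

Row = Tuple[int, int]  # (l_orderkey, l_partkey)


def check_rows(lines: Iterable[str]) -> Tuple[List[str], List[Row]]:
    """Return (malformed_lines, duplicated_rows) for 'orderkey|partkey' lines."""
    malformed: List[str] = []
    duplicates: List[Row] = []
    seen: Set[Row] = set()
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        fields = text.split("|")
        # Both fields must be non-empty decimal integers.
        if len(fields) != 2 or not all(f.isdigit() for f in fields):
            malformed.append(text)
            continue
        row = (int(fields[0]), int(fields[1]))
        if row in seen:
            duplicates.append(row)
        seen.add(row)
    return malformed, duplicates


# Excerpts of the two part files quoted above (elided rows omitted).
part_0 = ["2919712|107370", "2919712|166342", "2919712|178277"]
part_1 = ["|96127", "6000000|32255", "6000000|96127",
          "2919712|107370", "2919712|166342"]

malformed, duplicates = check_rows(part_0 + part_1)
print("malformed:", malformed)
print("duplicates:", duplicates)
```

Passing the lines of all part files in one call is what lets the check catch rows repeated across files rather than only within a single file.
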



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
