impala-issues mailing list archives

From "Thomas Tauber-Marshall (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-397) ORDER BY rand() does not work.
Date Thu, 27 Apr 2017 18:23:04 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Tauber-Marshall resolved IMPALA-397.
-------------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0

commit 6cddb952cefedd373b2a1ce71a1b3cff2e774d70
Author: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Date:   Tue Jan 31 10:33:07 2017 -0800

    IMPALA-4731/IMPALA-397/IMPALA-4728: Materialize sort exprs
    
    Previously, exprs used in sorts were evaluated lazily. This can
    potentially be bad for performance if the exprs are expensive to
    evaluate, and it can lead to crashes if the exprs are
    non-deterministic, as this violates assumptions of our sorting
    algorithm.
    
    This patch addresses these issues by materializing ordering exprs.
    It does so when the expr is non-deterministic (including when it
    contains a UDF, since we cannot currently tell whether a UDF is
    non-deterministic), or when its cost exceeds a threshold (or the
    cost is unknown).
    
    Testing:
    - Added e2e tests in test_sort.py.
    - Updated planner tests.
    
    Change-Id: Ifefdaff8557a30ac44ea82ed428e6d1ffbca2e9e
    Reviewed-on: http://gerrit.cloudera.org:8080/6322
    Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
    Tested-by: Impala Public Jenkins
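
The materialization rule described in the commit message can be sketched roughly as follows. This is an illustrative Python sketch only, not Impala's planner code: the `SortExpr` type, its fields, the `should_materialize` function, and the `COST_THRESHOLD` value are all hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; the commit message does not state the real value.
COST_THRESHOLD = 10.0


@dataclass
class SortExpr:
    """Hypothetical stand-in for an ordering expression seen by a planner."""
    sql: str
    deterministic: bool
    contains_udf: bool
    cost: Optional[float]  # None models "cost is unknown"


def should_materialize(expr: SortExpr, threshold: float = COST_THRESHOLD) -> bool:
    """Return True if the ordering expr should be evaluated once up front."""
    if not expr.deterministic:
        return True  # e.g. rand(): re-evaluation gives a different value
    if expr.contains_udf:
        return True  # a UDF's determinism is unknown, so be conservative
    if expr.cost is None:
        return True  # unknown cost is treated like an expensive expr
    return expr.cost > threshold  # materialize only if the cost exceeds it
```

Under these assumptions, `rand()` and any UDF call are always materialized, while a cheap deterministic expression such as a plain column reference is still evaluated lazily.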

> ORDER BY rand() does not work.
> ------------------------------
>
>                 Key: IMPALA-397
>                 URL: https://issues.apache.org/jira/browse/IMPALA-397
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 1.0.1, Impala 2.3.0
>            Reporter: Alexander Behm
>            Assignee: Thomas Tauber-Marshall
>            Priority: Minor
>              Labels: correctness, downgraded, planner, usability
>             Fix For: Impala 2.9.0
>
>
> The cause of the issue below is that r is not materialized.
> {code}
> select id, name, rand() r from countries order by r limit 10;
> Query: select id, name, rand() r from countries order by r limit 10
> Query finished, fetching results ...
> +----+----------------+-----------------------+
> | id | name           | r                     |
> +----+----------------+-----------------------+
> | 3  | Canada         | 0.0004714746030380365 |
> | 5  | Australia      | 0.5895895192351144    |
> | 1  | United States  | 0.4431900859080209    |
> | 4  | Ireland        | 0.0739258840093044    |
> | 6  | Netherlands    | 0.4621509646354946    |
> | 2  | United Kingdom | 0.6679162032287178    |
> | 9  | France         | 0.8352529978543767    |
> | 8  | Germany        | 0.1610932858479644    |
> | 7  | New Zealand    | 0.4815021690360746    |
> | 91 | Antigua        | 0.5511845208477156    |
> +----+----------------+-----------------------+
> Returned 10 row(s) in 0.48s
> {code}
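
One plausible way to picture the output above is that the value used for ordering and the value shown in column r come from separate evaluations of rand(). The Python sketch below illustrates that reading and the materialized alternative. It is not a model of Impala's actual execution; the function names and the sample rows are hypothetical.

```python
import random


def lazy_order_by_rand(rows, rng, limit):
    """Order by one evaluation of rand(), then display a fresh evaluation."""
    ordered = sorted(rows, key=lambda _row: rng.random())
    # The displayed r is re-evaluated, so it is unrelated to the sort order.
    return [(row, rng.random()) for row in ordered[:limit]]


def materialized_order_by_rand(rows, rng, limit):
    """Evaluate rand() once per row, then sort and display that same value."""
    decorated = [(row, rng.random()) for row in rows]
    decorated.sort(key=lambda pair: pair[1])
    return decorated[:limit]


if __name__ == "__main__":
    countries = ["United States", "United Kingdom", "Canada", "Ireland"]
    for name, r in materialized_order_by_rand(countries, random.Random(), 3):
        print(name, r)
```

With the materialized version, the r column is always in ascending order because each row carries the single value it was sorted on.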



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
