drill-issues mailing list archives

From "Timothy Farkas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5755) TOP_N_SORT operator does not free memory while running
Date Fri, 15 Sep 2017 21:58:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168585#comment-16168585 ]

Timothy Farkas commented on DRILL-5755:

Hey [~cchang@maprtech.com], I'm wrapping up some other PRs today and will start working on this
first thing next week. Let me know what your timeline is, and I will try to get this done in time.


> TOP_N_SORT operator does not free memory while running
> ------------------------------------------------------
>                 Key: DRILL-5755
>                 URL: https://issues.apache.org/jira/browse/DRILL-5755
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Boaz Ben-Zvi
>            Assignee: Timothy Farkas
>            Priority: Blocker
>         Attachments: 2658c253-20b6-db90-362a-139aae4a327e.sys.drill
>  The TOP_N_SORT operator should keep the top N rows while processing its input, and free the memory used to hold all rows below the top N.
> For example, the following query uses a table with 125M rows:
> {code}
> select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand) from dfs.`/data/tmp` group by row_count order by row_count limit 30;
> {code}
> The query failed with an OOM when each of the 3 TOP_N_SORT operators was holding about 2.44 GB (see attached profile). Holding 30 rows should take far less memory.
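
The behavior the description asks for (keep only the top N rows while reading input, and release everything below them) can be illustrated with a bounded heap. The following is a minimal sketch, not Drill's operator code: the class TopNSketch and the method topN are hypothetical names, and rows are reduced to plain Long sort keys. It only shows that "order by ... limit N" can run while holding at most N keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; this is not Drill's TOP_N_SORT implementation.
final class TopNSketch {

    // Returns the n smallest keys in ascending order (the result of
    // "order by key limit n"), holding at most n keys at any time.
    static List<Long> topN(Iterable<Long> input, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap: the head is the worst key currently retained.
        PriorityQueue<Long> heap = new PriorityQueue<>(n, Collections.reverseOrder());
        for (Long key : input) {
            if (heap.size() < n) {
                heap.add(key);
            } else if (key < heap.peek()) {
                heap.poll();   // evict the current worst key so it can be reclaimed
                heap.add(key);
            }
            // Otherwise the key falls below the top n and is never stored.
        }
        List<Long> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }
}
```

With this shape, memory use depends on N (30 in the query above) rather than on the 125M input rows. A real operator would also have to release the buffers backing evicted rows, which this sketch does not model.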

This message was sent by Atlassian JIRA
