drill-issues mailing list archives

From "Timothy Farkas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5755) TOP_N_SORT operator does not free memory while running
Date Fri, 15 Sep 2017 21:58:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168585#comment-16168585 ]

Timothy Farkas commented on DRILL-5755:
---------------------------------------

Hey [~cchang@maprtech.com], I'm wrapping up some other PRs today. I will start working on this
first thing next week. Let me know what your timeline is, and I will try to get this done.

Thanks,
Tim

> TOP_N_SORT operator does not free memory while running
> ------------------------------------------------------
>
>                 Key: DRILL-5755
>                 URL: https://issues.apache.org/jira/browse/DRILL-5755
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Boaz Ben-Zvi
>            Assignee: Timothy Farkas
>            Priority: Blocker
>         Attachments: 2658c253-20b6-db90-362a-139aae4a327e.sys.drill
>
>
> The TOP_N_SORT operator should keep the top N rows while processing its input, and free
> the memory used to hold all rows below the top N.
> For example, the following query uses a table with 125M rows:
> {code}
> select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand)
> from dfs.`/data/tmp` group by row_count order by row_count limit 30;
> {code}
> The query failed with an OOM when each of the 3 TOP_N_SORT operators was holding about
> 2.44 GB (see attached profile). Holding 30 rows should take far less memory.
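
For illustration only, the bounded behavior the description asks for can be sketched with a fixed-size heap. This is not Drill's TOP_N_SORT code: the function name `top_n_smallest` and the `(key, payload)` row shape are hypothetical, chosen to mirror `order by row_count limit 30`. The point is that a row falling outside the current top N is dropped immediately, so at most N rows are retained however large the input is.

```python
import heapq
from typing import Iterable, List, Tuple


def top_n_smallest(rows: Iterable[Tuple[int, str]], n: int) -> List[Tuple[int, str]]:
    """Return the n rows with the smallest keys, retaining at most n rows at a time.

    Hypothetical helper: rows are (key, payload) pairs standing in for
    (row_count, rest-of-row). Ascending order matches ORDER BY key LIMIT n.
    """
    if n <= 0:
        return []

    # heapq is a min-heap, so keys are stored negated. heap[0] is then the
    # retained row with the LARGEST key, i.e. the one to evict next.
    heap: List[Tuple[int, int, str]] = []

    for seq, (key, payload) in enumerate(rows):
        # seq is a unique tie-breaker so payloads are never compared.
        entry = (-key, seq, payload)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif -key > heap[0][0]:
            # The new key is smaller than the largest retained key: swap them.
            # The evicted row is no longer referenced and can be reclaimed.
            heapq.heapreplace(heap, entry)
        # Otherwise the row is below the top n and is never stored.

    return sorted((-neg_key, payload) for neg_key, _seq, payload in heap)
```

Because the input is consumed as an iterator, a generator over a very large table can be passed in without materializing it; only the heap of n entries stays live.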



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
