drill-issues mailing list archives

From "Boaz Ben-Zvi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5755) TOP_N_SORT operator does not free memory while running
Date Wed, 30 Aug 2017 23:00:00 GMT
Boaz Ben-Zvi created DRILL-5755:

             Summary: TOP_N_SORT operator does not free memory while running
                 Key: DRILL-5755
                 URL: https://issues.apache.org/jira/browse/DRILL-5755
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Boaz Ben-Zvi

The TOP_N_SORT operator should keep only the top N rows while processing its input, and free
the memory used to hold every row that falls below the top N.

For example, the following query runs against a table with 125M rows:
select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand)
from dfs.`/data/tmp`
group by row_count
order by row_count
limit 30;

The query failed with an out-of-memory (OOM) error while each of the 3 TOP_N_SORT operators was
holding about 2.44 GB (see attached profile). Holding 30 rows should take far less memory.
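
To illustrate the behavior this report asks for, here is a minimal sketch of a bounded top-N
pass. It is not Drill's implementation: the function name top_n_smallest and the use of plain
integer keys are assumptions made for the example. It only shows that a streaming top-N needs
to retain at most N keys, no matter how many rows it reads.

```python
import heapq
from typing import Iterable, List


def top_n_smallest(rows: Iterable[int], n: int) -> List[int]:
    """Return the n smallest keys in ascending order, retaining at most n keys at a time.

    Hypothetical helper for illustration only; it mirrors "order by key limit n".
    """
    if n <= 0:
        return []
    # heapq is a min-heap, so keys are stored negated: -heap[0] is then the
    # largest key currently retained, which is the first candidate to evict.
    heap: List[int] = []
    for key in rows:
        if len(heap) < n:
            heapq.heappush(heap, -key)
        elif key < -heap[0]:
            # The new key beats the worst retained key, so it replaces it.
            heapq.heapreplace(heap, -key)
        # Otherwise the key falls below the top n and is dropped immediately.
    return sorted(-k for k in heap)
```

For instance, top_n_smallest(range(1000, 0, -1), 30) reads 1000 keys but never holds more than
30 of them. A real operator works on batches of rows rather than single keys, so when and how
memory is released there is a separate design question that this sketch does not address.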

