drill-dev mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5528) Sorting 19GB data with 14GB memory in a single fragment takes ~150 minutes
Date Fri, 19 May 2017 16:43:04 GMT
Rahul Challapalli created DRILL-5528:
----------------------------------------

             Summary: Sorting 19GB data with 14GB memory in a single fragment takes ~150 minutes
                 Key: DRILL-5528
                 URL: https://issues.apache.org/jira/browse/DRILL-5528
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.10.0
            Reporter: Rahul Challapalli
            Assignee: Paul Rogers


Configuration:
{code}
git.commit.id.abbrev=1e0a14c
DRILL_MAX_DIRECT_MEMORY="32G"
DRILL_MAX_HEAP="4G"
{code}

Based on the runtime of the query below (8530.719 seconds, roughly 142 minutes), I suspect there is a performance bottleneck somewhere:
{code}
[root@qa-node190 external-sort]# /opt/drill/bin/sqlline -u jdbc:drill:zk=10.10.100.190:5181
apache drill 1.11.0-SNAPSHOT
"start your sql engine"
0: jdbc:drill:zk=10.10.100.190:5181> ALTER SESSION SET `exec.sort.disable_managed` = false;
+-------+-------------------------------------+
|  ok   |               summary               |
+-------+-------------------------------------+
| true  | exec.sort.disable_managed updated.  |
+-------+-------------------------------------+
1 row selected (0.975 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.width.max_per_node` = 1;
+-------+--------------------------------------+
|  ok   |               summary                |
+-------+--------------------------------------+
| true  | planner.width.max_per_node updated.  |
+-------+--------------------------------------+
1 row selected (0.371 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.disable_exchanges` = true;
+-------+-------------------------------------+
|  ok   |               summary               |
+-------+-------------------------------------+
| true  | planner.disable_exchanges updated.  |
+-------+-------------------------------------+
1 row selected (0.292 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.memory.max_query_memory_per_node` = 14106127360;
+-------+----------------------------------------------------+
|  ok   |                      summary                       |
+-------+----------------------------------------------------+
| true  | planner.memory.max_query_memory_per_node updated.  |
+-------+----------------------------------------------------+
1 row selected (0.316 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from (select * from dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf';
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (8530.719 seconds)
{code}
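
For scale, the figures in the session above can be turned into a few rough numbers. The sketch below is only back-of-the-envelope arithmetic, not a measurement: the 19 GB input size is taken from the issue summary and is assumed to mean decimal gigabytes, and the `summarize` helper is a hypothetical name introduced for illustration.

```python
# Figures copied from the report. The 19 GB input size comes from the issue
# summary and is ASSUMED to be decimal gigabytes (10**9 bytes).
DATA_BYTES = 19 * 10**9
MEMORY_LIMIT_BYTES = 14106127360  # planner.memory.max_query_memory_per_node
ELAPSED_SECONDS = 8530.719        # "1 row selected (8530.719 seconds)"


def summarize(data_bytes, memory_bytes, elapsed_seconds):
    """Hypothetical helper: derive rough scale figures from the raw numbers."""
    return {
        "elapsed_minutes": elapsed_seconds / 60,
        "memory_gib": memory_bytes / 2**30,
        # Average end-to-end rate over the whole query, in decimal MB/s.
        "throughput_mb_per_s": data_bytes / elapsed_seconds / 10**6,
        # How many times larger the input is than the memory limit.
        "data_to_memory_ratio": data_bytes / memory_bytes,
    }


summary = summarize(DATA_BYTES, MEMORY_LIMIT_BYTES, ELAPSED_SECONDS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the query averages a little over 2 MB/s end to end, with an input only about 1.35 times the memory limit, which is why the runtime looks suspicious.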

I have attached the logs and profile files. The data is too large to attach to a JIRA issue. Reach out
to me if you need any more information.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
