drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5528) Sorting 19GB data with 14GB memory in a single fragment takes ~150 minutes
Date Thu, 08 Jun 2017 23:17:18 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043607#comment-16043607
] 

Paul Rogers commented on DRILL-5528:
------------------------------------

Preliminary results suggest that Drill is severely CPU bound for this query. Running on a
Mac, the thread running the query is at 100% of CPU. But, disk I/O is very small, only about
40MB/s for reads, 20 MB/s for writes peak, with many valleys in between -- on an SSD that
can do 700 MB/s. Earlier tests got upwards of 70+ MB/s.

This suggests that, somehow, something is using far to much CPU to do the work of the sort.
Will have to investigate. Note, the above is on the DRILL-5325 branch, not master.

The original goal of the "managed sort" project was just to manage the memory used by the
sort. Fixing performance issues, while important, starts wandering a bit off of the original
goals...

> Sorting 19GB data with 14GB memory in a single fragment takes ~150 minutes
> --------------------------------------------------------------------------
>
>                 Key: DRILL-5528
>                 URL: https://issues.apache.org/jira/browse/DRILL-5528
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>         Attachments: 26e2183f-2c5d-4044-0e23-4aeb45c40be5.sys.drill, drillbit.log, drillbit.out,
drill-env.sh
>
>
> Configuration :
> {code}
> git.commit.id.abbrev=1e0a14c
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> Based on the runtime of the below query, I suspect there is a performance bottleneck
somewhere
> {code}
> [root@qa-node190 external-sort]# /opt/drill/bin/sqlline -u jdbc:drill:zk=10.10.100.190:5181
> apache drill 1.11.0-SNAPSHOT
> "start your sql engine"
> 0: jdbc:drill:zk=10.10.100.190:5181> ALTER SESSION SET `exec.sort.disable_managed`
= false;
> +-------+-------------------------------------+
> |  ok   |               summary               |
> +-------+-------------------------------------+
> | true  | exec.sort.disable_managed updated.  |
> +-------+-------------------------------------+
> 1 row selected (0.975 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.width.max_per_node`
= 1;
> +-------+--------------------------------------+
> |  ok   |               summary                |
> +-------+--------------------------------------+
> | true  | planner.width.max_per_node updated.  |
> +-------+--------------------------------------+
> 1 row selected (0.371 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.disable_exchanges`
= true;
> +-------+-------------------------------------+
> |  ok   |               summary               |
> +-------+-------------------------------------+
> | true  | planner.disable_exchanges updated.  |
> +-------+-------------------------------------+
> 1 row selected (0.292 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `planner.memory.max_query_memory_per_node`
= 14106127360;
> +-------+----------------------------------------------------+
> |  ok   |                      summary                       |
> +-------+----------------------------------------------------+
> | true  | planner.memory.max_query_memory_per_node updated.  |
> +-------+----------------------------------------------------+
> 1 row selected (0.316 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from (select * from dfs.`/drill/testdata/resource-manager/250wide.tbl`
order by columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf';
> +---------+
> | EXPR$0  |
> +---------+
> | 0       |
> +---------+
> 1 row selected (8530.719 seconds)
> {code}
> I attached the logs and profile files. The data is too large to attach to a jira. Reach
out to me if you need any more information



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message