drill-issues mailing list archives

From "Robert Hou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5478) Spill file size parameter is not honored by the managed external sort
Date Fri, 15 Sep 2017 17:48:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168247#comment-16168247 ]

Robert Hou commented on DRILL-5478:
-----------------------------------

We should still test it on behalf of Support.  We don't have to test it extensively, but we
should ensure it still works in general.

The spill file size in this example is 256 MB, and the memory is 1 GB.  Is this a reasonable
set of values?

> Spill file size parameter is not honored by the managed external sort
> ---------------------------------------------------------------------
>
>                 Key: DRILL-5478
>                 URL: https://issues.apache.org/jira/browse/DRILL-5478
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>
> git.commit.id.abbrev=1e0a14c
> Query:
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.width.max_per_query` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 1052428800;
> alter session set `planner.enable_decimal_data_type` = true;
> select count(*) from (
>   select * from dfs.`/drill/testdata/resource-manager/all_types_large` d1
>   order by d1.map.missing
> ) d;
> {code}
> Boot Options (spill file size is set to 256MB)
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select * from sys.boot where name like '%spill%';
> +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
> |                       name                       |  kind   | type  | status  | num_val  |                     string_val                     | bool_val  | float_val  |
> +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
> | drill.exec.sort.external.spill.directories       | STRING  | BOOT  | BOOT    | null     | [
>     # drill-override.conf: 26
>     "/tmp/test"
> ]  | null      | null       |
> | drill.exec.sort.external.spill.file_size         | STRING  | BOOT  | BOOT    | null     | "256M"                                             | null      | null       |
> | drill.exec.sort.external.spill.fs                | STRING  | BOOT  | BOOT    | null     | "maprfs:///"                                       | null      | null       |
> | drill.exec.sort.external.spill.group.size        | LONG    | BOOT  | BOOT    | 40000    | null                                               | null      | null       |
> | drill.exec.sort.external.spill.merge_batch_size  | STRING  | BOOT  | BOOT    | null     | "16M"                                              | null      | null       |
> | drill.exec.sort.external.spill.spill_batch_size  | STRING  | BOOT  | BOOT    | null     | "8M"                                               | null      | null       |
> | drill.exec.sort.external.spill.threshold         | LONG    | BOOT  | BOOT    | 40000    | null                                               | null      | null       |
> +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
> {code}
> Below are the spill files while the query is still executing. The size of the spill files
> is ~34MB
> {code}
> -rwxr-xr-x   3 root root   34957815 2017-05-05 11:26 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run1
> -rwxr-xr-x   3 root root   34957815 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run2
> -rwxr-xr-x   3 root root          0 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run3
> {code}
> The data set is too large to attach here. Reach out to me if you need anything.
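
To put the quoted numbers side by side, here is a minimal Python sketch that compares the spill file sizes from the listing with the configured `drill.exec.sort.external.spill.file_size`. It is not part of Drill: `parse_size` is a hypothetical helper, and it assumes that the "M" suffix means 2**20 bytes.

```python
# Sketch only: compare the spill file sizes from the listing with the
# configured spill file size. Nothing here is a Drill API.

def parse_size(text):
    """Convert a size string such as "256M" to bytes.

    Hypothetical helper. Assumes single-letter suffixes are binary units
    (K = 2**10, M = 2**20, G = 2**30).
    """
    units = {"K": 2**10, "M": 2**20, "G": 2**30}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


# Value of drill.exec.sort.external.spill.file_size from sys.boot
configured = parse_size("256M")

# Sizes in bytes of run1, run2, run3 from the directory listing
observed = {"run1": 34957815, "run2": 34957815, "run3": 0}

for name, size in observed.items():
    share = size / configured
    print(f"{name}: {size} bytes, {share:.1%} of the configured spill file size")
```

Under that assumption, the two completed runs are far below the configured size, which is the mismatch the issue title describes.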



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
