drill-issues mailing list archives

From "Robert Hou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6180) Use System Option "output_batch_size" for External Sort
Date Fri, 23 Feb 2018 00:48:14 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373744#comment-16373744 ]

Robert Hou commented on DRILL-6180:

I would agree with [~paul-rogers] in general about having a uniform batch size for data that
is shared among operators.  There may be ways to work with different batch sizes, but we
can revisit this if we find the need (some databases do this, but it can get complicated).

> Use System Option "output_batch_size" for External Sort
> -------------------------------------------------------
>                 Key: DRILL-6180
>                 URL: https://issues.apache.org/jira/browse/DRILL-6180
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Flow
>    Affects Versions: 1.12.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Critical
>             Fix For: 1.13.0
> External Sort has a boot-time configuration option for output batch size,
> "drill.exec.sort.external.spill.merge_batch_size", which defaults to 16M.
> To make batch sizing configuration uniform across all operators, change this to use
> the newly added system option "drill.exec.memory.operator.output_batch_size". This option
> has a default value of 32M.
> So, what are the implications if the default is changed to 32M for external sort?
> Instead, should we change the output batch size default to 16M for all operators?

This message was sent by Atlassian JIRA
