drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5284) Roll-up of final fixes for managed sort
Date Sat, 25 Feb 2017 03:36:45 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883995#comment-15883995 ]

ASF GitHub Bot commented on DRILL-5284:
---------------------------------------

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/761#discussion_r103067805
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/ExternalSortBatch.java ---
    @@ -392,22 +448,31 @@ private void configure(DrillConfig config) {
         // Set too large and the ratio between memory and input data sizes becomes
         // small. Set too small and disk seek times dominate performance.
     
    -    spillBatchSize = config.getBytes(ExecConstants.EXTERNAL_SORT_SPILL_BATCH_SIZE);
    -    spillBatchSize = Math.max(spillBatchSize, MIN_SPILL_BATCH_SIZE);
    +    preferredSpillBatchSize = config.getBytes(ExecConstants.EXTERNAL_SORT_SPILL_BATCH_SIZE);
    +
    +    // In low memory, use no more than 1/4 of memory for each spill batch. Ensures we
    +    // can merge.
    +
    +    preferredSpillBatchSize = Math.min(preferredSpillBatchSize, memoryLimit / 4);
    --- End diff --
    
    Why restrict the spill batch size so low? This would create more runs and increase the
    risk of needing those intermediate merges. Otherwise, during a merge, only a single batch
    at a time is read from each run, not the whole run (I believe -- if we spill all the remaining
    batches at the end ...)
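
    To make the arithmetic in the diff concrete, here is a minimal sketch of the cap. It is
    not the ExternalSortBatch code: the class name SpillBatchSizing and both method names are
    hypothetical, and maxBatchesInMemory is a deliberate simplification that ignores the
    output batch and any other overhead. It only illustrates that capping the spill batch at
    memoryLimit / 4 leaves room for at least four such batches, which matters if a merge
    holds one batch per run in memory.

```java
// Illustrative sketch only; names below are hypothetical, not Drill APIs.
final class SpillBatchSizing {

  private SpillBatchSizing() { }

  /**
   * Mirrors the expression in the diff: use the configured spill batch size,
   * but never more than one quarter of the sort's memory limit.
   */
  static long preferredSpillBatchSize(long configuredBytes, long memoryLimitBytes) {
    return Math.min(configuredBytes, memoryLimitBytes / 4);
  }

  /**
   * Assumption for illustration: a merge holds one spill batch per run in
   * memory, so this is a rough upper bound on how many runs could be merged
   * at once. Output batches and other overhead are ignored.
   */
  static long maxBatchesInMemory(long memoryLimitBytes, long spillBatchBytes) {
    if (spillBatchBytes <= 0) {
      throw new IllegalArgumentException("spillBatchBytes must be positive");
    }
    return memoryLimitBytes / spillBatchBytes;
  }
}
```

    For example, with a configured size of 8 MiB and a 20 MiB memory limit, the cap yields a
    5 MiB spill batch, so four batches fit. With a 64 MiB limit the configured 8 MiB is used
    unchanged and eight batches fit.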



> Roll-up of final fixes for managed sort
> ---------------------------------------
>
>                 Key: DRILL-5284
>                 URL: https://issues.apache.org/jira/browse/DRILL-5284
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.10.0
>
>
> The managed external sort was introduced in DRILL-5080. Since that time, extensive testing has identified a number of minor fixes and improvements. Given the long PR cycles, it is not practical to spend a week or two to do a PR for each fix individually. This ticket is a roll-up of a number of fixes. Small fixes are listed here; larger items appear as sub-tasks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
