drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5601) Rollup of External Sort memory management fixes
Date Thu, 13 Jul 2017 22:21:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086501#comment-16086501 ]

ASF GitHub Bot commented on DRILL-5601:
---------------------------------------

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124132387
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -245,29 +250,35 @@ private BatchMerger(PriorityQueueCopierWrapper holder, BatchSchema schema, List<
     
         @Override
         public boolean next() {
    -      Stopwatch w = Stopwatch.createStarted();
           long start = holder.getAllocator().getAllocatedMemory();
    +
    +      // Allocate an outgoing container the "dumb" way (based on static sizes)
    +      // for testing, or the "smart" way (based on actual observed data sizes)
    +      // for production code.
    +
    +      if (allocHelper == null) {
    +        VectorAccessibleUtilities.allocateVectors(outputContainer, targetRecordCount);
    +      } else {
    +        allocHelper.allocateBatch(outputContainer, targetRecordCount);
    +      }
    +      logger.trace("Initial output batch allocation: {} bytes",
    +                   holder.getAllocator().getAllocatedMemory() - start);
    +      Stopwatch w = Stopwatch.createStarted();
           int count = holder.copier.next(targetRecordCount);
    -      copyCount += count;
           if (count > 0) {
             long t = w.elapsed(TimeUnit.MICROSECONDS);
             batchCount++;
    -        logger.trace("Took {} us to merge {} records", t, count);
             long size = holder.getAllocator().getAllocatedMemory() - start;
    +        logger.trace("Took {} us to merge {} records, consuming {} bytes of memory",
    +                     t, count, size);
             estBatchSize = Math.max(estBatchSize, size);
           } else {
             logger.trace("copier returned 0 records");
           }
     
    -      // Identify the schema to be used in the output container. (Since
    -      // all merged batches have the same schema, the schema we identify
    -      // here should be the same as that which we already had.
    +      // Initialize output container metadata.
    --- End diff --
    
    Why remove the original comments? They still look valid.



> Rollup of External Sort memory management fixes
> -----------------------------------------------
>
>                 Key: DRILL-5601
>                 URL: https://issues.apache.org/jira/browse/DRILL-5601
>             Project: Apache Drill
>          Issue Type: Task
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>
> Rollup of a set of specific JIRA entries that all relate to the very difficult problem
> of managing memory within Drill in order for the external sort to stay within a memory budget.
> In general, the fixes relate to better estimating memory used by the three ways that Drill
> allocates vector memory (see DRILL-5522) and to predicting the size of vectors that the sort
> will create, to avoid repeated realloc-copy cycles (see DRILL-5594).
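
The "repeated realloc-copy cycles" mentioned in the description can be illustrated with a small, self-contained sketch. This is not Drill code: the class VectorGrowthSketch and the method bytesCopiedWhileAppending are hypothetical names, and a plain int[] that doubles when full stands in for a value vector, on the assumption that an undersized vector grows by doubling. The sketch only counts how many bytes get copied when a buffer starts too small, compared with one sized up front for the expected record count.

```java
import java.util.Arrays;

// Hypothetical illustration only; a plain int[] stands in for a value vector.
class VectorGrowthSketch {

    /**
     * Appends n ints to a buffer that starts at initialCapacity and doubles
     * whenever it fills up. Returns the number of bytes copied by those
     * reallocations (0 if the buffer was large enough from the start).
     */
    static long bytesCopiedWhileAppending(int initialCapacity, int n) {
        if (initialCapacity < 1) {
            throw new IllegalArgumentException("initialCapacity must be >= 1");
        }
        int[] buf = new int[initialCapacity];
        long copied = 0;
        for (int i = 0; i < n; i++) {
            if (i == buf.length) {
                // Buffer is full: allocate double the size and copy the old contents.
                copied += (long) buf.length * Integer.BYTES;
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
            buf[i] = i;
        }
        return copied;
    }

    public static void main(String[] args) {
        int records = 4096;
        System.out.println("grown from 256 slots: "
            + bytesCopiedWhileAppending(256, records) + " bytes copied");
        System.out.println("pre-sized to " + records + " slots: "
            + bytesCopiedWhileAppending(records, records) + " bytes copied");
    }
}
```

In this toy model the undersized buffer is copied once per doubling, while the pre-sized buffer is never copied. That is the general motivation for allocating the output batch from observed data sizes rather than growing it on demand.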



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
