drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6126) Allocate memory for value vectors upfront in flatten operator
Date Sat, 03 Mar 2018 01:13:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384397#comment-16384397 ]

ASF GitHub Bot commented on DRILL-6126:
---------------------------------------

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1125#discussion_r171999276
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ---
    @@ -245,16 +251,30 @@ private void buildVectorInitializer(VectorInitializer initializer) {
           else if (width > 0) {
             initializer.variableWidth(name, width);
           }
    +
    +      for (ColumnSize columnSize : childColumnSizes.values()) {
    +        columnSize.buildVectorInitializer(initializer);
    +      }
         }
    +
       }
     
       public static ColumnSize getColumn(ValueVector v, String prefix) {
         return new ColumnSize(v, prefix);
       }
     
    +  public ColumnSize getColumn(String name) {
    +    return allColumnSizes.get(name);
    +  }
    +
       public static final int MAX_VECTOR_SIZE = ValueVector.MAX_BUFFER_SIZE; // 16 MiB
     
    -  private Map<String, ColumnSize> columnSizes = CaseInsensitiveMap.newHashMap();
    +  // This keeps information for all columns i.e. all top columns and nested columns underneath
    +  private Map<String, ColumnSize> allColumnSizes = CaseInsensitiveMap.newHashMap();
    --- End diff --
    
    Yes, I got rid of allColumnSizes. We will have only top-level columns.
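
    To illustrate the point of the comment above: if only top-level columns are kept in a map, nested columns can still be reached by recursing through each column's children, as the loop over childColumnSizes in the diff does. The following is a minimal sketch under that assumption. The class ColumnTreeSketch, its fields, and the dotted path naming are hypothetical and are not Drill's actual ColumnSize or VectorInitializer API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a column-size node; not Drill's ColumnSize class.
final class ColumnTreeSketch {
  final String name;
  final int width; // 0 means "no width of its own", e.g. a map column
  final Map<String, ColumnTreeSketch> children = new LinkedHashMap<>();

  ColumnTreeSketch(String name, int width) {
    this.name = name;
    this.width = width;
  }

  ColumnTreeSketch addChild(ColumnTreeSketch child) {
    children.put(child.name, child);
    return this;
  }

  // Records this column if it has a width, then recurses into its children.
  void visit(String prefix, List<String> out) {
    String path = prefix.isEmpty() ? name : prefix + "." + name;
    if (width > 0) {
      out.add(path + ":" + width);
    }
    for (ColumnTreeSketch child : children.values()) {
      child.visit(path, out);
    }
  }

  // Walks a map that holds only top-level columns; nested ones are found by recursion.
  static List<String> describe(Map<String, ColumnTreeSketch> topLevel) {
    List<String> out = new ArrayList<>();
    for (ColumnTreeSketch column : topLevel.values()) {
      column.visit("", out);
    }
    return out;
  }
}
```

    With this shape, a separate map of all columns is not needed to visit every nested column, although a lookup of a nested column by name would have to walk the tree.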


> Allocate memory for value vectors upfront in flatten operator
> -------------------------------------------------------------
>
>                 Key: DRILL-6126
>                 URL: https://issues.apache.org/jira/browse/DRILL-6126
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Critical
>             Fix For: 1.12.0
>
>
> With the recent changes to control batch size for the flatten operator, we determine the row count
> of the output batch based on memory. Since we know how many rows the batch will contain, we can
> allocate the memory needed upfront instead of starting with the initial value (4096) and doubling,
> copying every time we need more.
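
The difference between the two strategies in the description can be sketched as follows. This is an illustration only: plain Java int arrays stand in for Drill's value vectors, and the class and method names are hypothetical. Only the initial value of 4096 comes from the issue text.

```java
import java.util.Arrays;

// Hypothetical illustration; Drill's value vectors use their own buffer allocator.
final class UpfrontAllocationSketch {
  static final int INITIAL_CAPACITY = 4096; // initial value mentioned in the issue

  // Counts the reallocate-and-copy steps that doubling needs to hold rowCount values.
  static int reallocationsWhenDoubling(int rowCount) {
    int capacity = INITIAL_CAPACITY;
    int reallocations = 0;
    while (capacity < rowCount) {
      capacity *= 2;
      reallocations++;
    }
    return reallocations;
  }

  // Starts small and doubles on demand, copying the existing values each time.
  static int[] fillByDoubling(int rowCount) {
    int[] buffer = new int[INITIAL_CAPACITY];
    for (int i = 0; i < rowCount; i++) {
      if (i == buffer.length) {
        buffer = Arrays.copyOf(buffer, buffer.length * 2);
      }
      buffer[i] = i;
    }
    return buffer;
  }

  // Allocates once, because the row count is known before writing begins.
  static int[] fillUpfront(int rowCount) {
    int[] buffer = new int[rowCount];
    for (int i = 0; i < rowCount; i++) {
      buffer[i] = i;
    }
    return buffer;
  }
}
```

For example, reallocationsWhenDoubling(65536) returns 4, so a batch of 65536 rows would go through four reallocate-and-copy steps, while fillUpfront performs a single allocation and no copies.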



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
