drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6032) Use RecordBatchSizer to estimate size of columns in HashAgg
Date Tue, 30 Jan 2018 01:10:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344333#comment-16344333 ]

ASF GitHub Bot commented on DRILL-6032:
---------------------------------------

Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1101#discussion_r164614812
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java ---
    @@ -805,7 +803,12 @@ private IntVector allocMetadataVector(int size, int initialValue) {
       }
     
       @Override
    -  public void setMaxVarcharSize(int size) { maxVarcharSize = size; }
    +  public void setKeySizes(Map<String, Integer> keySizes) {
    +    Preconditions.checkNotNull(keySizes);
    +
    +    this.keySizes = CaseInsensitiveMap.newHashMap();
    --- End diff --
    
    Copying the map helps avoid bugs. The code assumes that the keySizes map never changes
    once it is set, and copying the map helps enforce that constraint. If we stored the
    caller's map directly and the caller later updated or reused it after calling this
    method, those changes would leak into the hash table and errors would occur. Some
    languages, such as Scala, have immutable flavors of data structures for this reason.
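
    The hazard can be illustrated with a minimal sketch. It uses a plain java.util.HashMap
    instead of Drill's CaseInsensitiveMap, and the KeySizeHolder and DefensiveCopyDemo
    classes and the getKeySize method are hypothetical stand-ins, not code from the pull
    request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the hash table; not Drill's HashTableTemplate.
class KeySizeHolder {
  private Map<String, Integer> keySizes;

  public void setKeySizes(Map<String, Integer> keySizes) {
    Objects.requireNonNull(keySizes);
    // Defensive copy: later changes to the caller's map do not leak in.
    this.keySizes = new HashMap<>(keySizes);
  }

  // Assumes the column was present in the map passed to setKeySizes.
  public int getKeySize(String column) {
    return keySizes.get(column);
  }
}

public class DefensiveCopyDemo {
  public static void main(String[] args) {
    Map<String, Integer> sizes = new HashMap<>();
    sizes.put("name", 50);

    KeySizeHolder holder = new KeySizeHolder();
    holder.setKeySizes(sizes);

    // The caller reuses its map after the call.
    sizes.put("name", 8);
    sizes.clear();

    // The holder still sees the sizes it was given.
    System.out.println(holder.getKeySize("name")); // prints 50
  }
}
```

    Had setKeySizes stored the reference instead of a copy, the lookup after sizes.clear()
    would find no entry for "name" and fail when unboxing the null result.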


> Use RecordBatchSizer to estimate size of columns in HashAgg
> -----------------------------------------------------------
>
>                 Key: DRILL-6032
>                 URL: https://issues.apache.org/jira/browse/DRILL-6032
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>             Fix For: 1.13.0
>
>
> We need to use the RecordBatchSizer to estimate the size of columns in the partition batches created by HashAgg.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
