drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill pull request #822: DRILL-5457: Spill implementation for Hash Aggregate
Date Sat, 27 May 2017 05:38:06 GMT
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/822#discussion_r118811633
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java ---
    @@ -136,15 +136,21 @@ public IterOutcome innerNext() {
           return IterOutcome.NONE;
         }
     
    -    if (aggregator.buildComplete() && !aggregator.allFlushed()) {
    -      // aggregation is complete and not all records have been output yet
    -      return aggregator.outputCurrentBatch();
    +    // if aggregation is complete and not all records have been output yet
    +    if (aggregator.buildComplete() ||
    +        // or: 1st phase need to return (not fully grouped) partial output due to memory pressure
    +        aggregator.earlyOutput()) {
    +      // then output the next batch downstream
    +      IterOutcome out = aggregator.outputCurrentBatch();
    --- End diff --
    
    Since `HashAggregator` is not an operator executor (AKA record batch), it does not have to follow the iterator protocol or use the `IterOutcome` enum. Instead, you can define your own enum for its return values. You won't need the `OK_NEW_SCHEMA`, `OUT_OF_MEMORY`, `FAIL` or `NOT_YET` values. All you seem to need is `OK`, `NONE` and `RESTART`.
    
    This approach avoids the need to change the `IterOutcome` enum and to expose your new states to every operator that uses the Drill iterator protocol.
    
    I did something similar in Sort for the iterator class that returns either in-memory or merged spilled batches.
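
A minimal sketch of the suggestion above, under stated assumptions: `AggOutcome`, `ScriptedAggregator` and `next()` are hypothetical names, the `IterOutcome` here is a local stand-in listing only the values named in this comment (not Drill's actual enum), and treating `RESTART` as "call the aggregator again" is an assumption about the intended meaning. The point it illustrates is that the aggregator-private state is consumed at the operator boundary and never reaches downstream operators.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class AggOutcomeSketch {
  /** Local stand-in for Drill's IterOutcome; only the values named in the comment. */
  enum IterOutcome { OK, OK_NEW_SCHEMA, NONE, NOT_YET, OUT_OF_MEMORY, FAIL }

  /** Hypothetical result type that stays private to the aggregator. */
  enum AggOutcome { OK, NONE, RESTART }

  /** Hypothetical aggregator that replays a scripted sequence of outcomes. */
  static class ScriptedAggregator {
    private final Deque<AggOutcome> script;

    ScriptedAggregator(AggOutcome... outcomes) {
      script = new ArrayDeque<>(Arrays.asList(outcomes));
    }

    /** Returns the next scripted outcome, or NONE once the script is exhausted. */
    AggOutcome outputCurrentBatch() {
      return script.isEmpty() ? AggOutcome.NONE : script.poll();
    }
  }

  /**
   * Operator-side translation: RESTART is handled here by asking the
   * aggregator again (assumed semantics), so only OK and NONE are ever
   * returned to callers that speak the iterator protocol.
   */
  static IterOutcome next(ScriptedAggregator agg) {
    while (true) {
      switch (agg.outputCurrentBatch()) {
        case OK:
          return IterOutcome.OK;
        case NONE:
          return IterOutcome.NONE;
        case RESTART:
          continue; // aggregator-internal state; retry without exposing it
      }
    }
  }
}
```

With this shape, adding or renaming aggregator states only touches `AggOutcome` and the one `switch` that maps it, rather than every operator that switches on `IterOutcome`.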

