drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From adeneche <...@git.apache.org>
Subject [GitHub] drill pull request: Drill 4236: ExternalSort should use the new al...
Date Tue, 05 Jan 2016 21:50:54 GMT
Github user adeneche commented on a diff in the pull request:

    https://github.com/apache/drill/pull/317#discussion_r48901098
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
---
    @@ -588,36 +571,29 @@ public BatchGroup mergeAndSpill(LinkedList<BatchGroup> batchGroups)
throws Schem
         } finally {
           hyperBatch.clear();
         }
    -    long bufSize = getBufferSize(c1);
    -    totalSizeInMemory += bufSize;
    -    logger.debug("mergeAndSpill: final total size in memory = {}", totalSizeInMemory);
    +    logger.debug("mergeAndSpill: final total size in memory = {}", oAllocator.getAllocatedMemory());
         logger.info("Completed spilling to {}", outputFile);
         return newGroup;
       }
     
    -  private long getBufferSize(VectorAccessible batch) {
    -    long size = 0;
    -    for (VectorWrapper<?> w : batch) {
    -      DrillBuf[] bufs = w.getValueVector().getBuffers(false);
    -      for (DrillBuf buf : bufs) {
    -        size += buf.getPossibleMemoryConsumed();
    -      }
    -    }
    -    return size;
    -  }
    -
       private SelectionVector2 newSV2() throws OutOfMemoryException, InterruptedException
{
    -    SelectionVector2 sv2 = new SelectionVector2(oContext.getAllocator());
    +    SelectionVector2 sv2 = new SelectionVector2(oAllocator);
         if (!sv2.allocateNewSafe(incoming.getRecordCount())) {
           try {
    -        // Not being able to allocate sv2 means this operator's allocator reached it's
maximum capacity.
    -        // Spilling this.batchGroups won't help here as they are owned by upstream operator,
    -        // but spilling spilledBatchGroups may free enough memory
    -        final BatchGroup merged = mergeAndSpill(spilledBatchGroups);
    +        final BatchGroup merged = mergeAndSpill(batchGroups);
             if (merged != null) {
    -          spilledBatchGroups.addFirst(merged);
    +          spilledBatchGroups.add(merged);
    --- End diff --
    
    One possible explanation, and @StevenMPhillips could confirm it:
    
    sort tries to prevent `spilledBatchGroups` from becoming too large by spilling batches
from this group regularly to disk, by putting the result of those spills at the beginning
of spilledBatchGroups  we avoid spilling the same batches over and over again


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message