drill-dev mailing list archives

From Abdel Hakim Deneche <adene...@maprtech.com>
Subject ExternalSort doesn't properly account for sliced buffers
Date Fri, 20 Nov 2015 19:25:48 GMT
I'm looking at the external sort code and it uses the following method to
compute the allocated size of a batch:

  private long getBufferSize(VectorAccessible batch) {
    long size = 0;
    for (VectorWrapper<?> w : batch) {
      DrillBuf[] bufs = w.getValueVector().getBuffers(false);
      for (DrillBuf buf : bufs) {
        // Only root buffers are counted here.
        if (buf.isRootBuffer()) {
          size += buf.capacity();
        }
      }
    }
    return size;
  }


This method counts only root buffers, but when we have a receiver below
the sort, most (if not all) of the buffers are child buffers. This may
delay spilling and increase the memory usage of the drillbit. If my
computations are correct, for a single query, one drillbit can allocate up
to 40GB without spilling to disk even once.

Is there a specific reason we only account for root buffers?
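
To illustrate the concern, here is a minimal, self-contained sketch of how a root-only count can report zero for a batch made entirely of slices. It does not use Drill's classes: Buf, rootOnlySize and allBuffersSize are hypothetical names invented for this example, and Buf only models the root/child distinction, not DrillBuf's real behavior.

```java
import java.util.List;

public class SlicedBufferAccounting {

    /** Hypothetical stand-in for a buffer; this is NOT Drill's DrillBuf. */
    public static final class Buf {
        private final int capacity;
        private final Buf parent; // null means this is a root buffer

        public Buf(int capacity, Buf parent) {
            this.capacity = capacity;
            this.parent = parent;
        }

        public int capacity() {
            return capacity;
        }

        public boolean isRootBuffer() {
            return parent == null;
        }

        /** Returns a child buffer that views part of this buffer's memory. */
        public Buf slice(int length) {
            if (length < 0 || length > capacity) {
                throw new IllegalArgumentException("slice length out of range");
            }
            return new Buf(length, this);
        }
    }

    /** Mirrors the accounting in the message: child buffers are skipped. */
    public static long rootOnlySize(List<Buf> bufs) {
        long size = 0;
        for (Buf buf : bufs) {
            if (buf.isRootBuffer()) {
                size += buf.capacity();
            }
        }
        return size;
    }

    /** Alternative for comparison: counts every buffer, root or child. */
    public static long allBuffersSize(List<Buf> bufs) {
        long size = 0;
        for (Buf buf : bufs) {
            size += buf.capacity();
        }
        return size;
    }

    public static void main(String[] args) {
        // One 1024-byte root allocation, handed to the batch as two slices.
        Buf received = new Buf(1024, null);
        List<Buf> batch = List.of(received.slice(256), received.slice(768));

        System.out.println(rootOnlySize(batch));   // prints 0
        System.out.println(allBuffersSize(batch)); // prints 1024
    }
}
```

In this toy model the batch holds 1024 bytes of sliced memory, yet the root-only count is 0, so a spill threshold based on it would not be reached. Counting every buffer is shown only for contrast: it would overcount if slices overlapped or if a root and its slices appeared in the same batch, so it is not proposed here as the fix.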

-- 

Abdelhakim Deneche

Software Engineer
