drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hanifi Gunes <hgu...@maprtech.com>
Subject Re: ExternalSort doesn't properly account for sliced buffers
Date Sat, 21 Nov 2015 01:10:22 GMT
@Abdel back to your question

*Is there a specific reason we only account for root buffers ?*

My understanding was that we'd like to avoid double counting as slices are
zero copy at any point you'd either account root buffer or slices but not
both together. That being said, however, I do not see any reason why a
vector would internally slice its underlying buffer except an invocation to
VV#load in which case the vector is given a non-root buffer as well. So
making this check on root buffer-ness seems unneeded to me since
VV#getBuffers() should never return a root buffer and a slice from the same
buffer. I would like to be proven wrong though.


I guess, perhaps, Steven is referring to previous implementation of
DrillBuf but returning slice width rather than root buffer width seems
conforming to ByteBuf#capacity(), which is the current code anyway.

On Fri, Nov 20, 2015 at 4:26 PM, Abdel Hakim Deneche <adeneche@maprtech.com>
wrote:

> I'm confused here. Looking at DrillBuf.capacity() it already returns the
> length of the slice:
>
>   @Override
> >   public int capacity() {
> >     return length;
> >   }
>
>
>
>
> On Fri, Nov 20, 2015 at 4:11 PM, Hanifi GUNES <hanifigunes@gmail.com>
> wrote:
>
> > Seems like I am missing some input here. Are we talking about changing
> the
> > behavior of DrillBuf#capacity()? If so, I would +1 the following:
> >
> > *- It seems like capacity should return the length of the slice (since
> > I believe that fits with the general ByteBuf interface). If I
> > reset writerIndex(0), I should be able write to capacity() without
> issue.*
> >
> > 2015-11-20 16:03 GMT-08:00 Jacques Nadeau <jacques@dremio.com>:
> >
> > > My gut is that any changes other than correct accounting will have
> little
> > > impact on real world situations. Do you think the change will make
> enough
> > > difference to be valuable?
> > >
> > > It seems like capacity should return the length of the slice (since I
> > > believe that fits with the general ByteBuf interface). If I reset
> > > writerIndex(0), I should be able write to capacity() without issue.
> Seems
> > > weird to return some other (mostly disconnect) value.
> > >
> > >
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Fri, Nov 20, 2015 at 3:28 PM, Abdel Hakim Deneche <
> > > adeneche@maprtech.com>
> > > wrote:
> > >
> > > > @Steven, I think DrillBuf.capacity() was changed at some point, I was
> > > > looking at the code and it seems to only return the size of the
> "slice"
> > > and
> > > > not the root buffer.
> > > >
> > > > While waiting for the new allocator and transfer of ownership, would
> it
> > > > make sense to remove the check for root buffers like this ?
> > > >
> > > >   private long getBufferSize(VectorAccessible batch) {
> > > > >     long size = 0;
> > > > >     for (VectorWrapper<?> w : batch) {
> > > > >       DrillBuf[] bufs = w.getValueVector().getBuffers(false);
> > > > >       for (DrillBuf buf : bufs) {
> > > > >         size += buf.capacity();
> > > > >       }
> > > > >     }
> > > > >     return size;
> > > > >   }
> > > >
> > > >
> > > >
> > > > On Fri, Nov 20, 2015 at 2:50 PM, Hanifi GUNES <hanifigunes@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Problem with the above code is that not all vectors operate on root
> > > > buffers
> > > > > rendering the accounting above inaccurate. In fact your example is
> > one
> > > > > perfect instance where vectors would point to non-root buffers for
> > sure
> > > > > because of the slicing taking place at RecordBatchLoader#load [1]
> > > > >
> > > > >
> > > > > 1:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchLoader.java#L117
> > > > >
> > > > > 2015-11-20 11:41 GMT-08:00 Steven Phillips <steven@dremio.com>:
> > > > >
> > > > > > I think it is because we can't actually properly account for
> sliced
> > > > > > buffers. I don't remember for sure, but I think it might be
> because
> > > > > calling
> > > > > > buf.capacity() on a sliced buffer returns the the capacity of
> root
> > > > > buffer,
> > > > > > not the size of the slice. That may not be correct, but I think
> it
> > > was
> > > > > > something like that. Whatever it is, I am pretty sure it was
> giving
> > > > wrong
> > > > > > results when they are sliced buffers.
> > > > > >
> > > > > > I think we need to get the new allocator, along with proper
> > transfer
> > > of
> > > > > > ownership in order to do this correctly. Then we can just query
> the
> > > > > > allocator rather than trying to track it separately.
> > > > > >
> > > > > > On Fri, Nov 20, 2015 at 11:25 AM, Abdel Hakim Deneche <
> > > > > > adeneche@maprtech.com
> > > > > > > wrote:
> > > > > >
> > > > > > > I'm looking at the external sort code and it uses the following
> > > > method
> > > > > to
> > > > > > > compute the allocated size of a batch:
> > > > > > >
> > > > > > >   private long getBufferSize(VectorAccessible batch) {
> > > > > > > >     long size = 0;
> > > > > > > >     for (VectorWrapper<?> w : batch) {
> > > > > > > >       DrillBuf[] bufs = w.getValueVector().getBuffers(false);
> > > > > > > >       for (DrillBuf buf : bufs) {
> > > > > > > >         if (*buf.isRootBuffer()*) {
> > > > > > > >           size += buf.capacity();
> > > > > > > >         }
> > > > > > > >       }
> > > > > > > >     }
> > > > > > > >     return size;
> > > > > > > >   }
> > > > > > >
> > > > > > >
> > > > > > > This method only accounts for root buffers, but when we
have a
> > > > receiver
> > > > > > > below the sort, most of (if not all) buffers are child
buffers.
> > > This
> > > > > may
> > > > > > > delay spilling, and increase the memory usage of the drillbit.
> If
> > > my
> > > > > > > computations are correct, for a single query, one drillbit
can
> > > > allocate
> > > > > > up
> > > > > > > to 40GB without spilling once to disk.
> > > > > > >
> > > > > > > Is there a specific reason we only account for root buffers
?
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Abdelhakim Deneche
> > > > > > >
> > > > > > > Software Engineer
> > > > > > >
> > > > > > >   <http://www.mapr.com/>
> > > > > > >
> > > > > > >
> > > > > > > Now Available - Free Hadoop On-Demand Training
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message