drill-dev mailing list archives

From Jacques Nadeau <jacq...@apache.org>
Subject Re: connection allocator in rpc layer is using too much memory
Date Tue, 07 Jul 2015 20:20:06 GMT
I understand that.  However, if the top level allocator is out of memory,
child allocators aren't going to help us.  The one thing that a child
allocator may allow you to do is make a child reservation so we hold back
memory for other uses.  Is that what you're proposing?

Do we even check allocation limits when we are doing ownership transfers?
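The transfer question can be made concrete with a toy model. This is a sketch with invented names (`Allocator`, `transferIn`), not Drill's actual BufferAllocator API; it only illustrates where a limit check at ownership-transfer time would surface the failure on the receiving side, where we know which fragment to fail.

```java
// Illustrative sketch only -- not Drill's allocator code. Models the
// question: is the receiving allocator's limit checked when buffer
// ownership is transferred between allocators?
public class TransferSketch {
    /** Minimal allocator model: tracks allocated bytes against a hard limit. */
    static class Allocator {
        final long limit;
        long allocated;

        Allocator(long limit) {
            this.limit = limit;
        }

        /** Accept {@code size} bytes transferred from {@code from};
         *  reject if it would push this allocator past its limit. */
        boolean transferIn(Allocator from, long size) {
            if (allocated + size > limit) {
                return false;              // limit enforced at transfer time
            }
            from.allocated -= size;
            allocated += size;
            return true;
        }
    }

    public static void main(String[] args) {
        Allocator rpc = new Allocator(1L << 30);       // 1 GiB RPC budget
        Allocator fragment = new Allocator(1L << 20);  // 1 MiB fragment cap
        rpc.allocated = 4L << 20;                      // 4 MiB read off the wire

        // A 4 MiB batch cannot move into a fragment capped at 1 MiB, so the
        // failure surfaces here, where the owning fragment is known.
        boolean ok = fragment.transferIn(rpc, 4L << 20);
        System.out.println("transfer accepted: " + ok); // false
    }
}
```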

On Tue, Jul 7, 2015 at 12:10 PM, Abdel Hakim Deneche <adeneche@maprtech.com>
wrote:

> The main issue I'm seeing in DRILL-3241 is that when we hit an out of
> memory condition in the RPC layer, it's too soon to figure out which
> fragment executor we should fail, so the query just hangs forever. I'm
> hoping that by using a sub-allocator for the RPC layer we will hit the out
> of memory later (e.g. when transferring the batch to the fragment's
> allocator) and we will be able to properly fail the query.
>
> On Tue, Jul 7, 2015 at 10:46 AM, Jacques Nadeau <jacques@apache.org>
> wrote:
>
> > Yes, I believe it is using the TopLevelAllocator.  We could have a
> > suballocator but I can't really see how that would help with the JIRA
> > issue.  The one thing that might be a good idea is that we could then
> > have extra memory reservation for the RPC layer.  (In general, we don't
> > want to run out of memory inside the RPC layer.)
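A minimal sketch of the reservation idea, with invented names (`ParentAllocator`, `ChildAllocator`) rather than Drill's actual allocator classes: the child carves its reservation out of the parent's budget at construction, so the RPC layer keeps guaranteed headroom that other users of the parent cannot consume.

```java
// Hedged sketch (not Drill's API): a child allocator that reserves part of
// its parent's budget up front, guaranteeing headroom for the RPC layer.
public class ReservationSketch {
    static class ParentAllocator {
        final long limit;
        long used;

        ParentAllocator(long limit) { this.limit = limit; }

        boolean reserve(long bytes) {
            if (used + bytes > limit) return false;
            used += bytes;                 // carve the reservation out now
            return true;
        }
    }

    static class ChildAllocator {
        final long reservation;
        long used;

        ChildAllocator(ParentAllocator parent, long reservation) {
            if (!parent.reserve(reservation)) {
                throw new IllegalStateException("cannot reserve " + reservation);
            }
            this.reservation = reservation;
        }

        boolean allocate(long bytes) {
            // Child-local limit: fails here, not deep inside the parent.
            if (used + bytes > reservation) return false;
            used += bytes;
            return true;
        }
    }

    public static void main(String[] args) {
        ParentAllocator top = new ParentAllocator(8L << 30);      // 8 GiB drillbit
        ChildAllocator rpc = new ChildAllocator(top, 256L << 20); // 256 MiB held back
        System.out.println("rpc can allocate 1 MiB: " + rpc.allocate(1L << 20));
    }
}
```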
> >
> > On Tue, Jul 7, 2015 at 10:10 AM, Abdel Hakim Deneche <
> > adeneche@maprtech.com>
> > wrote:
> >
> > > My bad, I didn't explain the problem well. The value displayed in the
> > > log is the amount currently allocated by the
> > > ProtobufLengthDecoder.allocator and not the size we are trying to
> > > allocate. I will add the size we are trying to allocate to the log
> > > message and report back here.
> > >
> > > I was assuming the RPC layer uses its own child allocator, and it
> > > didn't make sense to me that this allocator reached > 1 GB, because it
> > > should transfer the batches to their corresponding fragment contexts
> > > (we are on the data server side). But then, while investigating
> > > further, I think the ProtobufLengthDecoder is actually using the
> > > drillbit top level allocator. Am I right?
> > > This would explain why the allocator reached its limit. Any reason the
> > > RPC layer isn't using its own child allocator?
> > >
> > > Thanks!
> > >
> > > On Tue, Jul 7, 2015 at 10:02 AM, Jacques Nadeau <jacques@apache.org>
> > > wrote:
> > >
> > > > There is a time when data is read off the socket before we know
> > > > what type of message it is.  This socket read buffer is outside the
> > > > normal flow and could grow (although it shouldn't get this big).
> > > > However, the memory you're talking about here is memory allocated
> > > > due to the size of the incoming message.  My guess would be either
> > > > you have unusually large records or the length of the message being
> > > > sent was corrupted.  (Assuming you are talking about the allocation
> > > > at [1].)
> > > >
> > > > I would start logging unusually large record batches and see if
> > > > something weird is going on.  A record batch shouldn't be larger
> > > > than 65k records, so for the batch to be 1 GB in size would require
> > > > each record to be 16k in size and for the batch to hold the maximum
> > > > number of records.  More realistically, we generally target 4k
> > > > records in a batch, which would suggest records that are 256k.
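The arithmetic above can be checked directly; a small sketch (invented helper name `bytesPerRecord`):

```java
// Quick check of the batch-size arithmetic: how big must each record be
// for a record batch to reach 1 GiB at a given record count?
public class BatchMath {
    static long bytesPerRecord(long batchBytes, long recordCount) {
        return batchBytes / recordCount;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // At the 65k-record ceiling, each record must average 16 KiB.
        System.out.println(bytesPerRecord(oneGiB, 65536));  // 16384
        // At the more typical 4k records per batch, 256 KiB per record.
        System.out.println(bytesPerRecord(oneGiB, 4096));   // 262144
    }
}
```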
> > > >
> > > > [1]
> > > > https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/ProtobufLengthDecoder.java#L87
> > > >
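For readers following the [1] link: the general shape of a length-prefixed decoder with an allocation guard can be sketched as below. This is invented illustration code (a fixed 4-byte length prefix, a made-up budget, and a plain byte array in place of a Netty ByteBuf), not Drill's ProtobufLengthDecoder, which decodes a protobuf varint length.

```java
import java.nio.ByteBuffer;

// Illustrative only: a length-prefixed frame decoder that refuses to
// allocate when the decoded length would exceed a memory budget, producing
// a warning like the one quoted below.
public class LengthDecoderSketch {
    static final long MEMORY_LIMIT = 1L << 30;  // hypothetical 1 GiB budget
    static long currentAllocation = 0;

    /** Returns the frame body, or null if the budget would be exceeded. */
    static byte[] decode(ByteBuffer in) {
        int length = in.getInt();               // 4-byte length prefix
        if (currentAllocation + (long) length > MEMORY_LIMIT) {
            System.err.println("Failure allocating buffer on incoming stream"
                + " due to memory limits. Current allocation: "
                + currentAllocation + ", requested: " + length);
            return null;                        // caller must fail the query
        }
        currentAllocation += length;
        byte[] body = new byte[length];
        in.get(body);
        return body;
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(8)
            .putInt(4)                          // body length
            .put(new byte[] {1, 2, 3, 4});      // body bytes
        frame.flip();
        System.out.println(decode(frame).length); // 4
    }
}
```

The point of the sketch is that the decoder only knows "the stream claims N bytes follow"; a corrupted length prefix looks exactly like an unusually large message, which is why Jacques suggests both explanations above.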
> > > > On Tue, Jul 7, 2015 at 9:13 AM, Abdel Hakim Deneche <
> > > > adeneche@maprtech.com> wrote:
> > > >
> > > > > Trying to investigate DRILL-3241
> > > > > <https://issues.apache.org/jira/browse/DRILL-3241> (query hangs
> > > > > if out of memory in RPC layer), I see the following warning in
> > > > > the logs:
> > > > >
> > > > > WARN: Failure allocating buffer on incoming stream due to
> > > > > memory limits.  Current Allocation: 1372678764.
> > > > >
> > > > > This is happening in ProtobufLengthDecoder.decode() on the
> > > > > receiver side (data server).
> > > > >
> > > > > Is it expected for the connection allocator to allocate > 1 GB of
> > > > > memory? Shouldn't the allocated batches be transferred to the
> > > > > receiving fragment's allocator?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > --
> > > > >
> > > > > Abdelhakim Deneche
> > > > >
> > > > > Software Engineer
> > > > >
> > > > >   <http://www.mapr.com/>
> > > > >
> > > > >
> > > > > Now Available - Free Hadoop On-Demand Training
> > > > > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
> > > > >
> > > >
> > >
> > >
> > >
> >
>
>
>
