drill-dev mailing list archives

From yuliya Feldman <yufeld...@yahoo.com.INVALID>
Subject Re: Suspicious direct memory consumption when running queries concurrently
Date Sat, 01 Aug 2015 05:40:26 GMT
How much memory is your JVM taking?
Do you even have enough disk space to dump it?
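
A rough way to check the disk-space question is sketched below. It is only a sketch under assumptions: DumpSpaceCheck and hasRoom are illustrative names, the heap currently in use is taken as a loose estimate of the dump size, and the numbers describe the JVM that runs this code. For a running drillbit, the same values would have to be read from that process (for example over JMX).

```java
import java.io.File;
import java.lang.management.ManagementFactory;

/** Illustrative helper, not part of Drill or the JDK tools. */
final class DumpSpaceCheck {
    /** True if the usable disk space exceeds the heap in use (a rough estimate only). */
    static boolean hasRoom(long heapUsedBytes, long usableBytes) {
        return usableBytes > heapUsedBytes;
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the real dump directory as the first argument.
        File dir = new File(args.length > 0 ? args[0] : ".");
        // Heap currently used by this JVM, in bytes.
        long heapUsed = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
        // Space available to this process on the partition holding dir, in bytes.
        long usable = dir.getUsableSpace();
        System.out.println("heap used: " + heapUsed + " bytes, usable disk: " + usable + " bytes");
        System.out.println("room for dump: " + hasRoom(heapUsed, usable));
    }
}
```
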
      From: Abdel Hakim Deneche <adeneche@maprtech.com>
 To: "dev@drill.apache.org" <dev@drill.apache.org> 
 Sent: Friday, July 31, 2015 9:19 PM
 Subject: Re: Suspicious direct memory consumption when running queries concurrently
   
I tried getting a jmap dump multiple times without success; each time it
crashes the JVM with the following exception:

Dumping heap to /home/mapr/private-sql-hadoop-test/framework/myfile.hprof
> ...
> Exception in thread "main" java.io.IOException: Premature EOF
>        at sun.tools.attach.HotSpotVirtualMachine.readInt(HotSpotVirtualMachine.java:248)
>        at sun.tools.attach.LinuxVirtualMachine.execute(LinuxVirtualMachine.java:199)
>        at sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:217)
>        at sun.tools.attach.HotSpotVirtualMachine.dumpHeap(HotSpotVirtualMachine.java:180)
>        at sun.tools.jmap.JMap.dump(JMap.java:242)
>        at sun.tools.jmap.JMap.main(JMap.java:140)


On Mon, Jul 27, 2015 at 3:45 PM, Jacques Nadeau <jacques@dremio.com> wrote:

> An allocate -> release cycle that happens entirely on the same thread goes
> into a per-thread cache.
>
> A bunch of Netty arena settings are configurable.  The big issue I believe
> is that the limits are soft limits implemented by the allocation-time
> release mechanism.  As such, if you allocate a bunch of memory, then
> release it all, that won't necessarily trigger any actual chunk releases.
>
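
The soft-limit behaviour described here can be illustrated with a toy model. This is a sketch under assumptions, not Netty's code: ToyThreadCache and trimInterval are hypothetical names, and the model assumes that cached memory is only handed back as a side effect of later allocations on the same thread.

```java
/** Toy model only; this is not Netty's implementation. */
final class ToyThreadCache {
    private final int trimInterval; // hypothetical knob: trim once every N allocations
    private long cachedBytes;       // released memory held for reuse, still reserved
    private int allocations;        // allocations seen since the last trim

    ToyThreadCache(int trimInterval) {
        this.trimInterval = trimInterval;
    }

    /** Serves a request; the cache is only ever trimmed from inside this method. */
    void allocate(long size) {
        if (++allocations >= trimInterval) {
            allocations = 0;
            cachedBytes = 0; // hand all cached memory back
        }
        if (cachedBytes >= size) {
            cachedBytes -= size; // reuse cached memory instead of reserving more
        }
    }

    /** Releasing never trims; the memory just moves into the cache. */
    void release(long size) {
        cachedBytes += size;
    }

    long cachedBytes() {
        return cachedBytes;
    }

    public static void main(String[] args) {
        ToyThreadCache cache = new ToyThreadCache(4);
        for (int i = 0; i < 3; i++) cache.allocate(16);
        for (int i = 0; i < 3; i++) cache.release(16);
        System.out.println(cache.cachedBytes()); // prints 48: nothing was given back
        cache.allocate(16);                      // fourth allocation triggers the trim
        System.out.println(cache.cachedBytes()); // prints 0
    }
}
```

In this model a thread that allocates, releases everything, and then goes idle keeps its cached memory indefinitely, because no further allocation arrives to trigger the trim.
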
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Jul 27, 2015 at 12:47 PM, Abdel Hakim Deneche <adeneche@maprtech.com> wrote:
>
> > @Jacques, my understanding is that chunks are not owned by a specific
> > thread; they are part of a specific memory arena, which is in turn only
> > accessed by specific threads. Do you want me to find which threads are
> > associated with the same arena where we have hanging chunks?
> >
> >
> > On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau <jacques@dremio.com>
> > wrote:
> >
> > > It sounds like your statement is that we're caching too many unused
> > > chunks.  Hanifi and I previously discussed implementing a separate
> > > flushing mechanism to release unallocated chunks that are hanging
> > > around.  The main question is: why are so many chunks hanging around,
> > > and what threads are they associated with?  A jmap dump and analysis
> > > should allow you to determine which thread owns the excess chunks.  My
> > > guess would be the RPC pool since those are long lasting (as opposed to
> > > the WorkManager pool, which is contracting).
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche <
> > > adeneche@maprtech.com>
> > > wrote:
> > >
> > > > When running a set of (mostly window function) queries concurrently on
> > > > a single drillbit with 8GB max direct memory, we are seeing a
> > > > continuous increase in direct memory allocation.
> > > >
> > > > We repeat the following steps multiple times:
> > > > - we launch an "iteration" of tests that runs all queries in a random
> > > >   order, 10 queries at a time
> > > > - after the iteration finishes, we wait for a couple of minutes to give
> > > >   Drill time to release the memory being held by the finishing fragments
> > > >
> > > > Using Drill's memory logger ("drill.allocator") we were able to get
> > > > snapshots of how memory was used internally by Netty. We focused only
> > > > on the number of allocated chunks: if we take this number and multiply
> > > > it by 16MB (Netty's chunk size), we get approximately the same value
> > > > that Drill reports for its direct memory allocation.
> > > > Here is a graph that shows the evolution of the number of allocated
> > > > chunks over a 500-iteration run (I'm working on improving the plots):
> > > >
> > > > http://bit.ly/1JL6Kp3
> > > >
> > > > In this specific case, after the first iteration Drill was allocating
> > > > ~2GB of direct memory; this number kept rising after each iteration, up
> > > > to ~6GB. We suspect this caused one of our previous runs to crash the
> > > > JVM.
> > > >
> > > > If we focus only on the log lines between iterations (when Drill's
> > > > memory usage is below 10MB), then all allocated chunks are at most 2%
> > > > used. At some point we end up with 288 nearly empty chunks, yet the
> > > > next iteration will cause more chunks to be allocated!!!
> > > >
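
As a quick check of the arithmetic in this report (chunk count times Netty's 16MB chunk size), the 288 nearly empty chunks mentioned above would account for about 4.5GB of reserved direct memory. The sketch below is illustrative only: ChunkMath and its methods are made-up names, and 16MB is treated as 16 MiB, which is an assumption.

```java
/** Illustrative helper, not part of Drill or Netty. */
final class ChunkMath {
    // Chunk size quoted in the thread; assumed here to mean 16 MiB.
    static final long CHUNK_BYTES = 16L * 1024 * 1024;

    /** Direct memory reserved by the given number of pooled chunks, in bytes. */
    static long reservedBytes(int chunks) {
        return chunks * CHUNK_BYTES;
    }

    /** Number of whole chunks needed to cover the given number of bytes. */
    static long chunksFor(long bytes) {
        return (bytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
    }

    public static void main(String[] args) {
        long bytes = reservedBytes(288);
        System.out.println(bytes / (1024 * 1024) + " MiB"); // prints "4608 MiB"
        System.out.println(chunksFor(2L << 30)); // 2 GiB: prints 128
        System.out.println(chunksFor(6L << 30)); // 6 GiB: prints 384
    }
}
```

Under the same assumption, the reported growth from ~2GB to ~6GB corresponds to roughly 128 and 384 chunks.
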
> > > > Is this expected?
> > > >
> > > > PS: I am running more tests and will update this thread with more
> > > > information.
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >  <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
> >
> >
>





  