drill-dev mailing list archives

From Chris Westin <chriswesti...@gmail.com>
Subject Re: Suspicious direct memory consumption when running queries concurrently
Date Mon, 27 Jul 2015 19:22:45 GMT
The graph does look like it's reached some kind of asymptote -- is that
true? Does it stop increasing at that point?

If it is, there's still the question of why it's so high.

Parth: yes, we're handing these buffers off from the RPC thread that
receives them to a worker thread that works on them. As Jacques mentions,
our pools should be dying off, but it's the RPC pool that we may need to
look at more closely. What are its characteristics? If Netty hangs on to
buffers to recycle them, there's a pool per thread, which might explain the
apparent asymptotic approach I see in this graph: we've finally reached the
limit that can be cached per thread across all the threads. Can we control
the size of that pool, possibly reducing it, or at least making it smaller
but more elastic (so it shrinks back down when the threads aren't in use)?
That might be an easy experiment to run to see how the graph is affected.
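
To make that concrete, here is a minimal sketch of what shrinking the
per-thread caches could look like, assuming the allocator in question is
(or wraps) Netty 4's PooledByteBufAllocator; the arena and cache sizes
below are illustrative, not recommendations:

  import io.netty.buffer.PooledByteBufAllocator;

  public class SmallCacheAllocator {
    // Hypothetical tuning: smaller thread-local caches so idle RPC threads
    // pin less memory. Netty 4.0.x defaults are pageSize=8192, maxOrder=11
    // (16MB chunks) and tiny/small/normal cache sizes of 512/256/64.
    public static final PooledByteBufAllocator ALLOCATOR =
        new PooledByteBufAllocator(
            true,  // preferDirect
            0,     // nHeapArena: no heap arenas
            2,     // nDirectArena: fewer arenas than the per-core default
            8192,  // pageSize
            11,    // maxOrder -> chunkSize = 8192 << 11 = 16MB
            128,   // tinyCacheSize   (default 512)
            64,    // smallCacheSize  (default 256)
            16);   // normalCacheSize (default 64)
  }

If I remember right, the same knobs are also exposed as
io.netty.allocator.* system properties, which might be easier to flip for a
quick test run.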

On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau <jacques@dremio.com> wrote:

> It sounds like your statement is that we're caching too many unused
> chunks.  Hanifi and I previously discussed implementing a separate flushing
> mechanism to release unallocated chunks that are hanging around.  The main
> question is why so many chunks are hanging around and which threads they
> are associated with.  A jmap dump and analysis should allow you to
> determine which thread owns the excess chunks.  My guess would be the RPC
> pool, since those threads are long lasting (as opposed to the WorkManager
> pool, which is contracting).
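>
> As a cheaper first check before a full jmap dump, something like the
> following could confirm the total direct allocation from inside the JVM.
> This is just the standard BufferPoolMXBean; it assumes the chunks are
> allocated through ByteBuffer.allocateDirect, and it won't attribute them
> to threads, which still needs the dump:
>
>   import java.lang.management.BufferPoolMXBean;
>   import java.lang.management.ManagementFactory;
>
>   public class DirectPools {
>     public static void main(String[] args) {
>       // The "direct" pool tracks all direct ByteBuffers, including the
>       // ones backing Netty's 16MB chunks.
>       for (BufferPoolMXBean pool :
>           ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
>         System.out.printf("%s: count=%d used=%d capacity=%d%n",
>             pool.getName(), pool.getCount(),
>             pool.getMemoryUsed(), pool.getTotalCapacity());
>       }
>     }
>   }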
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche
> <adeneche@maprtech.com> wrote:
>
> > When running a set of (mostly window function) queries concurrently on a
> > single drillbit with an 8GB max direct memory limit, we are seeing a
> > continuous increase in direct memory allocation.
> >
> > We repeat the following steps multiple times:
> > - we launch an "iteration" of tests that runs all queries in a random
> > order, 10 queries at a time
> > - after the iteration finishes, we wait for a couple of minutes to give
> > Drill time to release the memory held by the finishing fragments
> >
> > Using Drill's memory logger ("drill.allocator") we were able to get
> > snapshots of how memory is used internally by Netty. We focused only on
> > the number of allocated chunks: if we take this number and multiply it by
> > 16MB (Netty's chunk size) we get approximately the same value that Drill
> > reports for its direct memory allocation.
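> >
> > As a sanity check on that arithmetic, 16MB is Netty's default chunk
> > size, pageSize << maxOrder (these are Netty's defaults, nothing we
> > override):
> >
> >   public class ChunkMath {
> >     public static void main(String[] args) {
> >       int pageSize = 8192;  // io.netty.allocator.pageSize default
> >       int maxOrder = 11;    // io.netty.allocator.maxOrder default
> >       int chunkSize = pageSize << maxOrder;  // 8192 << 11 = 16MB
> >       int allocatedChunks = 288;             // from a log snapshot
> >       long bytes = (long) allocatedChunks * chunkSize;
> >       System.out.printf("%.1f GB%n",
> >           bytes / (1024.0 * 1024 * 1024));   // prints 4.5 GB
> >     }
> >   }
> >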
> > Here is a graph that shows the evolution of the number of allocated
> > chunks over a 500-iteration run (I'm working on improving the plots):
> >
> > http://bit.ly/1JL6Kp3
> >
> > In this specific case, after the first iteration Drill was allocating
> > ~2GB of direct memory, and this number kept rising after each iteration
> > up to ~6GB. We suspect this caused one of our previous runs to crash the
> > JVM.
> >
> > If we only focus on the log lines between iterations (when Drill's memory
> > usage is below 10MB), all allocated chunks are at most 2% used. At some
> > point we end up with 288 nearly empty chunks (~4.5GB of direct memory),
> > yet the next iteration causes even more chunks to be allocated!
> >
> > Is this expected?
> >
> > PS: I am running more tests and will update this thread with more
> > information.
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
>
