drill-dev mailing list archives

From Boaz Ben-Zvi <bben-...@mapr.com>
Subject Re: direct.used on the Metrics page exceeded planner.memory.max_query_memory_per_node while running a single query
Date Tue, 15 Aug 2017 22:06:37 GMT
Here are the relevant lines of code (from exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractBase.java):

    public static long MAX_ALLOCATION = 10_000_000_000L;  // default cap: 10 GB per operator instance
    protected long maxAllocation = MAX_ALLOCATION;        // per-operator limit; overridable via setMaxAllocation()

So this limit applies to every instance (instance == minor fragment) of every operator. The External
Sort and, since 1.11, the Hash Aggregate override this allocator setting by calling setMaxAllocation().

The value set for each instance of a “buffering operator” (i.e. External Sort or Hash
Aggregate) is computed as:

    planner.memory.max_query_memory_per_node
        / ( num_buffering_operators * num_minor_fragments_per_node )
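
In code terms, here is a minimal sketch of that assignment (illustrative only; apart from setMaxAllocation(), the variable names below are hypothetical, not actual Drill internals):

    // Hypothetical sketch of the planner-side computation; not verbatim Drill code.
    long maxQueryMemoryPerNode    = 4L * 1024 * 1024 * 1024;  // planner.memory.max_query_memory_per_node
    int  numBufferingOperators    = 3;   // e.g. 1 External Sort + 2 Hash Aggregates
    int  numMinorFragmentsPerNode = 10;
    long perInstanceLimit = maxQueryMemoryPerNode
                          / (numBufferingOperators * numMinorFragmentsPerNode);
    // Each buffering operator instance then receives this limit via
    // setMaxAllocation(perInstanceLimit), overriding the 10 GB MAX_ALLOCATION default.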

For example, a plan for a query like   SELECT ….. GROUP BY X ORDER BY Y   would have one
External Sort operator and two Hash Aggregate operators (two-phase aggregation). With 10 minor
fragments and a memory limit of 4 GB, each instance would get 4 GB / 30 ≈ 136 MB.
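
Spelling the arithmetic out:

    4 GB / (3 buffering operators * 10 minor fragments)
        = 4,294,967,296 bytes / 30
        ≈ 143,165,576 bytes  ≈ 136 MB per instance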

      Boaz

On 8/15/17, 7:43 AM, "Muhammad Gelbana" <m.gelbana@gmail.com> wrote:

    By "instance", you mean minor fragments, correct? And does the
    *planner.memory.max_query_memory_per_node* limit apply to each *type* of
    minor fragment individually? Assuming the memory limit is set to *4 GB*,
    and the running query involves an external sort and hash aggregates, should
    I expect the query to consume at least *8 GB*?
    
    Would you please point me to where in the code I can look into the
    implementation of this 10 GB memory limit?
    
    Thanks,
    Gelbana
    
    On Tue, Aug 15, 2017 at 2:40 AM, Boaz Ben-Zvi <bben-zvi@mapr.com> wrote:
    
    > There is this page:
    > https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/
    > But it seems out of date (accurate for 1.10). It does not cover the
    > hash operators, except to say that they run until they cannot allocate
    > any more memory. This happens (undocumented) at either 10 GB per
    > instance, or when there is no memory left on the node.
    >
    >      Boaz
    >
    > On 8/14/17, 1:16 PM, "Muhammad Gelbana" <m.gelbana@gmail.com> wrote:
    >
    >     I'm not sure which version I was using when that happened. But those
    >     are some precise details you've mentioned! Is this mentioned somewhere
    >     in the docs?
    >
    >     Thanks a lot.
    >
    >     On Aug 14, 2017 9:00 PM, "Boaz Ben-Zvi" <bben-zvi@mapr.com> wrote:
    >
    >     > Did your query include a hash join?
    >     >
    >     > As of 1.11, only the External Sort and Hash Aggregate operators obey
    >     > the memory limit (that is, the “max query memory per node” figure is
    >     > divided among all the instances of these operators).
    >     > The Hash Join (as was the case before 1.11) still does not take part
    >     > in this memory allocation scheme, and each instance may use up to 10 GB.
    >     >
    >     > Also in 1.11, the Hash Aggregate may “fall back” to the 1.10 behavior
    >     > (same as the Hash Join, i.e. up to 10 GB) in case there is too little
    >     > memory per instance (because it cannot perform memory spilling, which
    >     > requires some minimal amount of memory to hold multiple batches).
    >     >
    >     >    Thanks,
    >     >
    >     >           Boaz
    >     >
    >     > On 8/11/17, 4:25 PM, "Muhammad Gelbana" <m.gelbana@gmail.com> wrote:
    >     >
    >     >     Sorry for the long subject!
    >     >
    >     >     I'm running a single query on a single node Drill setup.
    >     >
    >     >     I assumed that setting the
    >     >     *planner.memory.max_query_memory_per_node* property controls the
    >     >     maximum amount of memory (in bytes) used by each query running on
    >     >     a single node, which means that in my setup, the *direct.used*
    >     >     metric on the metrics page should never exceed that value.
    >     >
    >     >     But it did, and drastically. I assigned *34359738368* (32 GB) to
    >     >     the *planner.memory.max_query_memory_per_node* option, but while
    >     >     monitoring the *direct.used* metric, I found that it reached
    >     >     *51640484458* (~48 GB).
    >     >
    >     >     What did I mistakenly do or misinterpret?
    >     >
    >     >     Thanks,
    >     >     Gelbana
    >     >
    >     >
    >     >
    >
    >
    >
    
