Mailing-List: contact user-help@impala.incubator.apache.org; run by ezmlm
Reply-To: user@impala.incubator.apache.org
Delivered-To: mailing list user@impala.incubator.apache.org
MIME-Version: 1.0
From: Matthew Jacobs
Date: Tue, 31 Jan 2017 15:30:21 -0800
Subject: Re: queries
not being submitted in Impala cluster despite free resources
To: user@impala.incubator.apache.org
Content-Type: multipart/alternative; boundary=001a114a94e20f14bb05476c5263
archived-at: Tue, 31 Jan 2017 23:30:53 -0000

--001a114a94e20f14bb05476c5263
Content-Type: text/plain; charset=UTF-8

William,

This guide is helpful for understanding a good approach to configuring
admission control with memory:
https://blog.cloudera.com/blog/2016/12/resource-management-for-apache-impala-incubating/

Hope that helps

On Tue, Jan 31, 2017 at 11:08 AM, Tim Armstrong wrote:

> Do you have a default query memory limit set? Admission control does not
> generally work well if it's relying on the estimated memory requirement -
> you really need to have query memory limits set. If you have the default
> query memory limit set to 25GB, then admission control assumes that the
> query will use that amount on each node. I assume you mean 700GB memory
> total across all nodes - how much memory do you have per node?
>
> On Tue, Jan 31, 2017 at 7:31 AM, Jeszy wrote:
>
>> That would be good. If they eventually run successfully, a query profile
>> would also be welcome.
>>
>> Thanks
>>
>> On Tue, Jan 31, 2017 at 4:28 PM, William Cox <william.cox@distilnetworks.com> wrote:
>>
>>> Jeszy,
>>>
>>> Thanks for the suggestion. We also have a 25GB per-query limit set up.
>>> Queries that estimate a large size are rejected with an error stating
>>> they exceeded the memory limit. The queries I'm having trouble with are
>>> ones that have no such error but simply wait in the CREATED state. Next
>>> time it happens I'll see if I can grab the memory estimates and check.
>>> Thanks.
>>> -William
>>>
>>> On Tue, Jan 31, 2017 at 7:08 AM, Jeszy wrote:
>>>
>>>> Hey William,
>>>>
>>>> IIUC you have configured both a memory-based upper bound and a
>>>> number-of-queries upper bound for the default pool. A query can get
>>>> queued if it would exceed either of these limits. If you're not hitting
>>>> the number-of-queries limit, then it's probably memory, which can
>>>> happen even if memory is not fully utilized: unless you specify a
>>>> mem_limit for the query, the estimated memory requirement will be used
>>>> for deciding whether the query should be admitted. This can get out of
>>>> hand when the cardinality estimation is off, either due to a very
>>>> complex query or because of missing / old stats.
>>>>
>>>> This is about memory-based admission control exclusively, but I think
>>>> it will be helpful:
>>>> http://www.cloudera.com/documentation/enterprise/latest/topics/impala_admission.html#admission_memory
>>>>
>>>> HTH
>>>>
>>>> On Mon, Jan 30, 2017 at 8:31 PM, William Cox <william.cox@distilnetworks.com> wrote:
>>>>
>>>>> I'm running CDH-5.8.0-1 and Impala version 2.6.0-cdh5.8.0 RELEASE
>>>>> (build 8d8652f69461f0dd8d5f474573fb5de7ceb0ee6b). We have enabled
>>>>> resource management and allocated ~700GB of memory with 30 running
>>>>> queries for the default pool. Our background data jobs are Unlimited.
>>>>>
>>>>> In spite of this setup, we still encounter times where queries will be
>>>>> marked as CREATED and waiting for allocation when the number of running
>>>>> queries is well below 30 and the amount of used memory, as listed in
>>>>> the CDH UI, is well below 700GB.
>>>>>
>>>>> This is seemingly unpredictable. We've created extensive monitors to
>>>>> track the number of running queries and memory usage, but there seems
>>>>> to be no pattern to why/when these queries won't be submitted to the
>>>>> cluster.
>>>>>
>>>>> Is there some key metric that I might be missing, or are there any
>>>>> suggestions folks have for tracking down these queries that won't be
>>>>> submitted?
>>>>> Thanks.
>>>>> -William

--001a114a94e20f14bb05476c5263
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a114a94e20f14bb05476c5263--
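
To make the arithmetic in Tim's reply concrete, here is a minimal sketch in Python. It is a simplified model, not Impala's actual admission code: it assumes each admitted query reserves its mem_limit on every node, and that the pool stops admitting once the reserved total would exceed the pool's memory cap. The function names and the 10-node cluster size are hypothetical (the thread never states the node count); the 700 GB cap, the 25 GB limit and the 30-query cap come from the thread.

```python
# Simplified model of memory-based admission as described in the thread.
# Assumption: a query with a mem_limit is treated as using that much memory
# on EVERY node, regardless of what it actually consumes at runtime.


def reserved_gb(mem_limit_gb: float, num_nodes: int) -> float:
    """Memory the pool sets aside for one query (hypothetical helper)."""
    return mem_limit_gb * num_nodes


def max_admitted(pool_max_mem_gb: float, mem_limit_gb: float,
                 num_nodes: int, max_running: int) -> int:
    """Concurrent queries the pool can admit before it starts queueing."""
    by_memory = int(pool_max_mem_gb // reserved_gb(mem_limit_gb, num_nodes))
    # A query is queued if it would exceed EITHER limit, so take the smaller.
    return min(by_memory, max_running)


if __name__ == "__main__":
    # 700 GB pool, 25 GB per-query limit, 30-query cap: from the thread.
    # 10 nodes: an assumed value for illustration only.
    print(reserved_gb(25, 10))            # 250 GB reserved per query
    print(max_admitted(700, 25, 10, 30))  # 2
```

Under these assumptions a third query would be queued even while measured memory usage sits far below 700 GB, which would be consistent with the symptom William describes. Whether this is what happens on his cluster depends on the real per-node figures Tim asks for.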