Date: Wed, 20 Jan 2016 13:42:29 -0800
Subject: Re: query hanging in CANCELLATION_REQUEST
From: Abdel Hakim Deneche
To: "dev@drill.apache.org"

I found the issue, I think. Sort was spilling to disk and I ran out of disk
space. For some reason this caused ZooKeeper to behave incorrectly: I could
still connect to Drill after the first query hung, but once I ran a second
query, ZK died.

Could this explain the query hang?

On Tue, Jan 19, 2016 at 1:57 PM, Jacques Nadeau wrote:

> It sounds like the connection break is not correctly marking an ack fail.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Tue, Jan 19, 2016 at 11:47 AM, Abdel Hakim Deneche
> <adeneche@maprtech.com> wrote:
>
> > Ok, I will create a JIRA
> >
> > On Tue, Jan 19, 2016 at 11:45 AM, Hanifi Gunes wrote:
> >
> > > I had reported this problem sometime last year verbally. I don't
> > > remember creating a JIRA though. In general, I dislike this sort of
> > > blocking call anywhere in the execution, even though one could argue
> > > it simplifies the code flow.
> > >
> > > A JIRA would be appreciated.
> > >
> > > On Tue, Jan 19, 2016 at 11:10 AM, Abdel Hakim Deneche
> > > <adeneche@maprtech.com> wrote:
> > >
> > > > I was running a query with a hash join that was generating lots of
> > > > results. I cancelled the query from sqlline, then closed it. Now the
> > > > query is stuck in CANCELLATION_REQUEST state.
> > > >
> > > > Looking at jstack, it looks like screenRoot is blocked waiting for
> > > > data sent to the client to be acknowledged.
> > > >
> > > > Do we have a JIRA similar to this?
> > > >
> > > > --
> > > > Abdelhakim Deneche
> > > > Software Engineer

--
Abdelhakim Deneche
Software Engineer
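The failure mode discussed in this thread (a sender blocked waiting for client acknowledgements that never arrive once the connection breaks) can be illustrated with a small sketch. This is not Drill's code: the class `AckTracker` and all of its method names are hypothetical, invented only to show the idea that a connection break should count as an ack failure and wake any blocked waiter, and that a bounded wait avoids hanging forever.

```java
/**
 * Hypothetical sketch, not Drill's implementation: tracks batches that were
 * sent to a client but not yet acknowledged.
 */
class AckTracker {
    private int pending = 0;            // batches sent but not yet acked
    private boolean connectionClosed = false;

    /** Record that one batch was sent to the client. */
    public synchronized void sent() {
        pending++;
    }

    /** Record one acknowledgement from the client and wake waiters. */
    public synchronized void acked() {
        if (pending > 0) {
            pending--;
        }
        notifyAll();
    }

    /**
     * Record that the connection broke. Outstanding acks can never arrive,
     * so waiters are woken instead of being left blocked.
     */
    public synchronized void connectionClosed() {
        connectionClosed = true;
        notifyAll();
    }

    /**
     * Wait until every sent batch is acked, the connection closes, or the
     * timeout expires. Returns true only if all batches were acknowledged.
     */
    public synchronized boolean awaitAcks(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (pending > 0 && !connectionClosed) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;           // timed out with acks outstanding
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return pending == 0;
    }
}
```

In this sketch, a caller that sees `awaitAcks` return false would treat the outstanding batches as failed and continue with cancellation rather than block indefinitely.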