From: yu feng
Date: Mon, 8 May 2017 09:45:45 +0800
Subject: Re: about broadcast join and hash shuffle join
To:
dev@impala.incubator.apache.org

Great! I agree that defaulting to partitioned joins should reduce the risk of disastrous plans.

2017-05-05 22:20 GMT+08:00 Thomas Tauber-Marshall:

> There's actually a review out right now for changing the default join
> algorithm when stats are unavailable to partitioned:
> https://gerrit.cloudera.org/#/c/6803/
>
> On Fri, May 5, 2017 at 4:44 AM yu feng wrote:
>
> > Hi All:
> >
> > I find that Impala chooses the join algorithm by comparing the data
> > transmission size of broadcast and shuffle joins while generating the
> > physical execution plan. What confuses me is why Impala chooses
> > broadcast as the default implementation (for example, when a table
> > has no computed stats).
> >
> > In my experience, shuffle join may be the better choice: some of my
> > queries use a broadcast join between two subqueries with huge result
> > sets, and the query times differ by up to ten times (8s versus 80s).
> >
> > I think users should always compute stats for every partition. Do you
> > have any good suggestions about this?
> >
> > Thanks a lot
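
For readers of the archive, the trade-off discussed in this thread can be sketched roughly as below. This is only an illustration under stated assumptions, not Impala's planner code: the function name, its parameters, and the cost formulas (broadcast sends the right-hand input to every node, shuffle sends each input across the network once) are a simplified model made up for this example. The `default_without_stats` parameter stands in for the default that the review mentioned above proposes to change.

```python
from typing import Optional


def choose_join_strategy(
    lhs_bytes: Optional[int],
    rhs_bytes: Optional[int],
    num_nodes: int,
    default_without_stats: str = "broadcast",
) -> str:
    """Pick a join strategy from estimated input sizes (hypothetical model).

    A size of None means no stats are available for that input.
    """
    # Without stats there is nothing to compare, so fall back to the default.
    if lhs_bytes is None or rhs_bytes is None:
        return default_without_stats

    # Assumed model: broadcast copies the right-hand input to every node.
    broadcast_cost = rhs_bytes * num_nodes
    # Assumed model: shuffle repartitions both inputs, sending each once.
    shuffle_cost = lhs_bytes + rhs_bytes

    return "broadcast" if broadcast_cost <= shuffle_cost else "shuffle"


if __name__ == "__main__":
    # Small right-hand input: broadcasting it is cheaper than shuffling both.
    print(choose_join_strategy(10_000, 100, num_nodes=10))
    # Two inputs of similar size: shuffling both is cheaper.
    print(choose_join_strategy(1_000, 1_000, num_nodes=10))
    # No stats: the outcome depends entirely on the configured default.
    print(choose_join_strategy(None, None, num_nodes=10,
                               default_without_stats="shuffle"))
```

Under this model, a wrong guess is much more costly in one direction: broadcasting a huge input multiplies the transfer by the node count, while shuffling a small one only adds a single extra pass over the left-hand input. That asymmetry is consistent with the view in this thread that a partitioned default is the safer choice when stats are missing.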