Date: Wed, 18 May 2016 21:39:12 +0000 (UTC)
From: "Jesus Camacho Rodriguez (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

     [ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-13750:
-------------------------------------------
    Status: Patch Available  (was: In Progress)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13750
>                 URL: https://issues.apache.org/jira/browse/HIVE-13750
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>    Affects Versions: 2.1.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on, Sorted Dynamic Partition Optimizer (SDPO) unconditionally adds an extra shuffle stage. If sort columns of previous shuffle and partitioning columns of table match, reduce sink deduplication optimizer removes extra shuffle stage, thus bringing down overhead to zero. However, if they don’t match, we end up doing extra shuffle. This can be improved since we can add table partition columns as sort columns on earlier shuffle and avoid this extra shuffle. This ensures that in cases where the query already has a shuffle stage, we are not shuffling data again.
> {quote}
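
The behaviour described in the quoted comment can be sketched as a toy model. This is not Hive's ReduceDedup code: `Shuffle`, `plan_shuffles`, the `extend_earlier_shuffle` flag, and the column names `key` and `ds` are all hypothetical names chosen for illustration. The sketch models only sort columns and appends missing partition columns at the end; it ignores column ordering and the conditions under which the merge is actually legal (the "when possible" in the issue title).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shuffle:
    """Toy stand-in for one shuffle stage; only its sort columns are modeled."""
    sort_cols: List[str]


def plan_shuffles(existing: Optional[Shuffle],
                  partition_cols: List[str],
                  extend_earlier_shuffle: bool) -> List[Shuffle]:
    """Return the shuffle stages left after SDPO and deduplication (toy model)."""
    # SDPO unconditionally adds a shuffle sorted on the table's partition columns.
    sdpo = Shuffle(list(partition_cols))
    if existing is None:
        # The query had no shuffle of its own, so the SDPO shuffle is the only one.
        return [sdpo]
    if existing.sort_cols == partition_cols:
        # Sort columns match the partition columns: the extra stage is removed.
        return [existing]
    if extend_earlier_shuffle:
        # Proposed improvement (simplified assumption): add the missing partition
        # columns as sort columns on the earlier shuffle and drop the SDPO stage.
        missing = [c for c in partition_cols if c not in existing.sort_cols]
        return [Shuffle(existing.sort_cols + missing)]
    # Behaviour described as current on a mismatch: both shuffles run.
    return [existing, sdpo]


# Hypothetical query that already shuffles on "key" and writes a table partitioned by "ds".
before = plan_shuffles(Shuffle(["key"]), ["ds"], extend_earlier_shuffle=False)
after = plan_shuffles(Shuffle(["key"]), ["ds"], extend_earlier_shuffle=True)
print(len(before), len(after))  # prints "2 1"
```

In this model the mismatch case drops from two shuffle stages to one, which is the saving the issue is after; a query with no earlier shuffle still gets the single SDPO stage.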