Date: Wed, 22 Jan 2020 16:40:00 +0000 (UTC)
From: "Wei Xue (Jira)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-30528) DPP issues

    [ https://issues.apache.org/jira/browse/SPARK-30528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021257#comment-17021257 ]

Wei Xue commented on SPARK-30528:
---------------------------------

# Turning on the non-broadcast-reuse DPP by default could be an option.
# So it really depends: a bloom filter would reduce the partition filtering rate, thus getting you less benefit, and the penalties are (without an extra aggregate):
## Duplicate scans or other ops like joins, if the filter plan is partially reused;
## Otherwise, you could implement something just like broadcast reuse to fully reuse the shuffle exchange of the filter plan, but you need to be aware that you are then serializing the computation of the two child operators of an SMJ.
# Yes, you could.

> DPP issues
> ----------
>
>                 Key: SPARK-30528
>                 URL: https://issues.apache.org/jira/browse/SPARK-30528
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.0.0
>            Reporter: Mayur Bhosale
>            Priority: Major
>              Labels: performance
>         Attachments: cases.png, dup_subquery.png, plan.png
>
>
> In DPP, the heuristic that decides whether DPP is going to be beneficial relies on the sizes of the tables in the right subtree of the join. This might not be a correct estimate, especially when detailed column-level stats are not available.
> {code:java}
>     // the pruning overhead is the total size in bytes of all scan relations
>     val overhead = otherPlan.collectLeaves().map(_.stats.sizeInBytes).sum.toFloat
>     filterRatio * partPlan.stats.sizeInBytes.toFloat > overhead.toFloat
> {code}
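> As a rough, purely illustrative plug-in of numbers into the check above (assuming no column-level stats, so the fallback ratio from spark.sql.optimizer.dynamicPartitionPruning.fallbackFilterRatio, 0.5 by default, gets used):
> {code:java}
>     // Hypothetical sizes: a ~1 TB partitioned fact table on the pruned side
>     // and ~100 GB of leaf scans in the right subtree (the filtering side).
>     val filterRatio = 0.5f
>     val partPlanSizeInBytes = 1024L * 1024 * 1024 * 1024   // ~1 TB
>     val overhead = 100L * 1024 * 1024 * 1024                // ~100 GB of scans
>
>     // 512 GB > 100 GB, so pruning is judged beneficial, even though the join
>     // between the right-subtree tables is never costed.
>     val hasBenefit = filterRatio * partPlanSizeInBytes.toFloat > overhead.toFloat
> {code}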
> Also, DPP executes the entire right side of the join as a subquery, because of which multiple scans happen for the tables in the right subtree of the join. This can cause issues when the join is not a Broadcast Hash Join (BHJ) and reuse of the subquery result does not happen. Also, I couldn't figure out: why do the results from the subquery get reused only for BHJ?
>
> Consider a query,
> {code:java}
> SELECT *
> FROM store_sales_partitioned
>      JOIN (SELECT *
>            FROM store_returns_partitioned,
>                 date_dim
>            WHERE sr_returned_date_sk = d_date_sk) ret_date
>        ON ss_sold_date_sk = d_date_sk
> WHERE d_fy_quarter_seq > 0
> {code}
> DPP will kick in for both of the joins. (Please check the image plan.png attached below for the plan.)
> Some of the observations -
> * Based on the heuristics, DPP would go ahead with pruning if the cost of scanning the tables in the right subtree of the join is less than the benefit due to pruning. This is because multiple scans of those tables will be needed for an SMJ. But the heuristic simply checks whether the benefit offsets the cost of the multiple scans and does not take into consideration other operations, like joins, in the right subtree, which can be quite expensive. This issue will be particularly prominent when detailed column-level stats are not available. In the example above, the decision that pruningHasBenefit was made on the basis of the sizes of the tables store_returns_partitioned and date_dim, but did not take into consideration the join between them, which happens before the join with the store_sales_partitioned table.
> * Multiple scans are needed when the join is an SMJ, as the reuse of the exchanges does not happen. This is because an Aggregate gets added on top of the right subtree, which is then executed as a subquery in order to keep only the required columns. Here, scanning all the columns (as the right subtree of the join would) and reusing the same exchange might be more helpful, as it avoids duplicate scans.
>
> This was just a representative example, but in general, for cases such as the ones in the image cases.png below, DPP can cause performance issues.
>
> Also, for cases where there are multiple DPP-compatible join conditions in the same join, the entire right subtree of the join would be executed as a subquery that many times. Consider an example,
> {code:java}
> SELECT *
> FROM partitionedtable
>      JOIN nonpartitionedtable
>        ON partcol1 = col1
>           AND partcol2 = col2
> WHERE nonpartitionedtable.id > 0
> {code}
> Here the right subtree of the join (the scan of table nonpartitionedtable) would be executed twice as a subquery, once for each join condition. These two subqueries should be merged and executed only once, as they are almost the same apart from the columns that they prune. Check the image dup_subquery.png attached below for the details.
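> A quick way to observe this (a sketch against the hypothetical tables from the query above, with DPP enabled, which is the default in 3.0):
> {code:java}
> // In spark-shell (Spark 3.0):
> spark.sql("""
>   SELECT *
>   FROM partitionedtable
>        JOIN nonpartitionedtable
>          ON partcol1 = col1
>             AND partcol2 = col2
>   WHERE nonpartitionedtable.id > 0
> """).explain(true)
> // The extended plan shows two dynamic pruning subqueries, one per partition
> // column, each containing its own scan of nonpartitionedtable.
> {code}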