spark-issues mailing list archives

From "Jungtaek Lim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-28605) Performance regression in SS's foreach
Date Sun, 18 Aug 2019 07:21:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-28605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909908#comment-16909908 ]

Jungtaek Lim commented on SPARK-28605:
--------------------------------------

This issue seems to become invalid once we address SPARK-28650.

> Performance regression in SS's foreach
> --------------------------------------
>
>                 Key: SPARK-28605
>                 URL: https://issues.apache.org/jira/browse/SPARK-28605
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3
>            Reporter: Shixiong Zhu
>            Priority: Major
>              Labels: regresssion
>
> When "ForeachWriter.open" returns "false", ForeachSink v1 will skip the whole partition
without reading data. But in ForeachSink v2, due to the API limitation, it needs to read
the whole partition even if all data just gets dropped.
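
The difference described above can be sketched with a toy model. This is not Spark code: `CountingSource`, `DroppingWriter`, `run_v1_style` and `run_v2_style` are hypothetical names invented for illustration, and only the `open`/`process`/`close` method shape is borrowed from `ForeachWriter`. The sketch assumes the regression comes down to whether the partition iterator is consumed after `open` returns false.

```python
# Toy model (NOT Spark internals): contrasts the two behaviours in the report.
from typing import Iterator, List, Optional


class CountingSource:
    """Hypothetical partition source that counts how many rows are pulled."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows
        self.rows_read = 0

    def __iter__(self) -> Iterator[int]:
        for row in self.rows:
            self.rows_read += 1  # each pull stands in for the cost of reading a row
            yield row


class DroppingWriter:
    """Stand-in for a ForeachWriter whose open() declines every partition."""

    def __init__(self) -> None:
        self.processed: List[int] = []

    def open(self, partition_id: int, epoch_id: int) -> bool:
        return False  # e.g. this partition/epoch was already written

    def process(self, row: int) -> None:
        self.processed.append(row)

    def close(self, error: Optional[Exception]) -> None:
        pass


def run_v1_style(writer: DroppingWriter, source: CountingSource) -> None:
    """v1-like: consult open() first and never touch the iterator if it is False."""
    if writer.open(0, 0):
        for row in source:
            writer.process(row)
    writer.close(None)


def run_v2_style(writer: DroppingWriter, source: CountingSource) -> None:
    """v2-like (assumed): rows are pushed one by one, so every row is still read
    and then dropped when open() returned False."""
    accepted = writer.open(0, 0)
    for row in source:
        if accepted:
            writer.process(row)
    writer.close(None)


v1_source = CountingSource(list(range(5)))
run_v1_style(DroppingWriter(), v1_source)

v2_source = CountingSource(list(range(5)))
run_v2_style(DroppingWriter(), v2_source)

print(v1_source.rows_read, v2_source.rows_read)
```

In this model neither path processes any rows, but the v2-style path still pays the cost of reading the whole partition, which is the regression being reported.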



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
