spark-issues mailing list archives

From "Dongjoon Hyun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-28605) Performance regression in SS's foreach
Date Mon, 05 Aug 2019 05:59:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-28605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899790#comment-16899790 ]

Dongjoon Hyun commented on SPARK-28605:
---------------------------------------

Oh, thank you, [~zsxwing]!

> Performance regression in SS's foreach
> --------------------------------------
>
>                 Key: SPARK-28605
>                 URL: https://issues.apache.org/jira/browse/SPARK-28605
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3
>            Reporter: Shixiong Zhu
>            Priority: Major
>              Labels: regresssion
>
> When "ForeachWriter.open" returns "false", ForeachSink v1 skips the whole partition
> without reading any data. In ForeachSink v2, however, an API limitation forces it to read
> the whole partition even though all of the data is then dropped.
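The difference described above can be illustrated with a small, Spark-free model. This is only a sketch under stated assumptions: `CountingPartition`, `write_v1_style`, and `write_v2_style` are hypothetical names invented for illustration and are not Spark's actual classes or code paths. The model assumes that the cost of interest is how many rows get pulled from a lazily produced partition.

```python
from typing import Callable, Iterator, List


class CountingPartition:
    """Hypothetical stand-in for a lazily read partition; counts rows pulled."""

    def __init__(self, rows: List[int]) -> None:
        self._rows = rows
        self.rows_read = 0

    def __iter__(self) -> Iterator[int]:
        for row in self._rows:
            self.rows_read += 1  # every row handed out counts as a read
            yield row


def write_v1_style(partition: CountingPartition,
                   open_fn: Callable[[], bool],
                   process_fn: Callable[[int], None]) -> None:
    """Model of the v1 behavior: if open() is False, never touch the data."""
    if open_fn():
        for row in partition:
            process_fn(row)


def write_v2_style(partition: CountingPartition,
                   open_fn: Callable[[], bool],
                   process_fn: Callable[[int], None]) -> None:
    """Model of the v2 behavior: rows are pushed to the writer one by one,
    so the partition is fully read even when every row is dropped."""
    opened = open_fn()
    for row in partition:
        if opened:
            process_fn(row)
        # when not opened, the row has still been read and is discarded here


if __name__ == "__main__":
    p1 = CountingPartition(list(range(1000)))
    write_v1_style(p1, lambda: False, lambda row: None)
    p2 = CountingPartition(list(range(1000)))
    write_v2_style(p2, lambda: False, lambda row: None)
    print(p1.rows_read, p2.rows_read)
```

In this model, when `open_fn` returns `False` the v1-style writer pulls no rows at all, while the v2-style writer pulls every row and discards it, which is the extra work the issue reports as a regression.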



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

