spark-issues mailing list archives

From "Li Yuanjian (JIRA)" <>
Subject [jira] [Commented] (SPARK-24036) Stateful operators in continuous processing
Date Thu, 10 May 2018 08:29:00 GMT


Li Yuanjian commented on SPARK-24036:

I agree with the division of the kinds of tasks; that's quite clear. But maybe all of this
can be made as transparent as possible to the scheduler by reusing the ResultTask and
ShuffleMapTask design. Could the DAGScheduler use a ContinuousShuffleMapTask in place of the
original ShuffleMapTask?
{quote}Changing DAGScheduler to accommodate continuous processing would create significant
additional complexity I don't think we can really justify.{quote}
So maybe this is not as complex as we think? If I'm wrong, please let me know.
{quote}Whether we need to write an explicit shuffle RDD class or not would I think come down
to an implementation detail of SPARK-24236. It depends on what's the cleanest way to unfold
the SparkPlan tree.{quote}
Yep, can't agree more. I'll tidy up this part of our internal code and post a preview PR.
We would very much appreciate any opinions you have!
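
To make the suggestion about reusing the ShuffleMapTask design concrete, here is a minimal toy sketch of the idea. It is not Spark code: the classes below only borrow the names from the discussion, and `ToyScheduler`, `submit_map_stage`, and `describe` are hypothetical names invented for illustration. The assumption being illustrated is that a continuous task type could expose the same interface as the batch one, so that the scheduler's only continuous-specific logic is the choice of task class.

```python
from dataclasses import dataclass


@dataclass
class ShuffleMapTask:
    """Toy stand-in (not Spark's class): handles one partition of a map stage."""
    partition: int

    def describe(self) -> str:
        return f"batch shuffle map task for partition {self.partition}"


@dataclass
class ContinuousShuffleMapTask(ShuffleMapTask):
    """Hypothetical continuous variant that keeps the same interface."""

    def describe(self) -> str:
        return f"continuous shuffle map task for partition {self.partition}"


class ToyScheduler:
    """Hypothetical scheduler: only the task-class choice differs per mode."""

    def __init__(self, continuous: bool):
        self.task_class = ContinuousShuffleMapTask if continuous else ShuffleMapTask

    def submit_map_stage(self, num_partitions: int) -> list:
        # The stage-to-task expansion is identical for both modes.
        return [self.task_class(p) for p in range(num_partitions)]


batch_tasks = ToyScheduler(continuous=False).submit_map_stage(2)
continuous_tasks = ToyScheduler(continuous=True).submit_map_stage(2)
print([t.describe() for t in continuous_tasks])
```

Whether the real DAGScheduler could stay this untouched is exactly the open question in the thread; the sketch only shows the shape of the substitution, not its feasibility.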

> Stateful operators in continuous processing
> -------------------------------------------
>                 Key: SPARK-24036
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Jose Torres
>            Priority: Major
> The first iteration of continuous processing in Spark 2.3 does not work with stateful
