spark-dev mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: [SS] Bug in StreamExecution? currentBatchId and getBatchDescriptionString for web UI
Date Sun, 10 Sep 2017 11:15:47 GMT
Hi,

Please disregard my finding. It does not seem to be a bug, just a bit of
dead code: "init" will never be displayed in the web UI because
currentBatchId is already at least 0 by the time the job description is
set, so getBatchDescriptionString could be simplified a little.
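
For illustration, here is a minimal Scala sketch of the behaviour
described in this thread. It is a simplified model, not the Spark
source: QueryState and batchDescription are hypothetical names, and
populateStartOffsets is reduced to the no-offsets case, where
currentBatchId goes from -1 to 0.

```scala
// Simplified model of the behaviour discussed in this thread.
// This is NOT the Spark source; names other than currentBatchId and
// populateStartOffsets are hypothetical.
object BatchDescriptionSketch {

  // Hypothetical stand-in for the relevant mutable state of StreamExecution.
  final class QueryState {
    // -1 when the query execution object is created.
    var currentBatchId: Long = -1L

    // Assumption: only the "no offsets available" case is modelled here,
    // in which the batch id becomes 0.
    def populateStartOffsets(): Unit = {
      currentBatchId = 0L
    }

    // Branch on currentBatchId: a negative id gives "init",
    // anything else gives the id itself.
    def batchDescription: String =
      if (currentBatchId < 0) "init" else currentBatchId.toString
  }

  def main(args: Array[String]): Unit = {
    val state = new QueryState
    println(state.batchDescription) // prints "init"
    state.populateStartOffsets()
    // The job description is set only at this point, after the id is 0.
    println(state.batchDescription) // prints "0"
  }
}
```

Since the description is only set after populateStartOffsets, the
"init" branch is never reached in this model.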

Sorry for the noise.

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Sat, Sep 9, 2017 at 9:21 PM, Jacek Laskowski <jacek@japila.pl> wrote:
> Hi,
>
> While reviewing StreamExecution and how batches are displayed in the
> web UI, I've noticed that currentBatchId is -1 when StreamExecution is
> created [1] and becomes 0 when no offsets are available [2].
>
> That leads to my question about setting the job description for a
> query using getBatchDescriptionString [3]. It branches on
> currentBatchId and gives "init" when it's -1 [4], which never happens,
> as shown above.
>
> That leads to the PR for SPARK-20464 "Add a job group and description
> for streaming queries and fix cancellation of running jobs using the
> job group" that sets the job description after populateStartOffsets
> [5].
>
> Shouldn't the job description be set before populateStartOffsets, so
> that getBatchDescriptionString has a chance of giving "init" and we
> don't see two 0s?
>
> Help appreciated.
>
> [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L116
> [2] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L516
> [3] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L878-L883
> [4] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L879
> [5] https://github.com/apache/spark/commit/6fc6cf88d871f5b05b0ad1a504e0d6213cf9d331#diff-6532dd3b63bdab0364fbcf2303e290e4R294
>
> Regards,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Spark Structured Streaming (Apache Spark 2.2+)
> https://bit.ly/spark-structured-streaming
> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
