From: Jacek Laskowski
Date: Wed, 21 Jun 2017 20:16:39 -0500
Subject: Why does Spark SQL use custom spark.sql.execution.id local property not SparkContext.setJobGroup?
To: dev

Hi,

Just noticed that Spark SQL uses the spark.sql.execution.id local property (via SQLExecution.withNewExecutionId [1]) to group Spark jobs together logically, while Structured Streaming uses SparkContext.setJobGroup [2] to do the same.

I think Structured Streaming's approach is more correct, as it relies on what Spark Core already provides and surfaces in the web UI, rather than introducing a custom mechanism.

Why does Spark SQL introduce a custom solution based on the spark.sql.execution.id local property? What's wrong with SparkContext.setJobGroup?
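To make the contrast concrete, here is a minimal sketch of the two mechanisms side by side. The group id, description, and the hard-coded execution id are illustrative only; the real SQLExecution.withNewExecutionId allocates a fresh id and posts SQL listener events around the body, which this sketch omits.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object JobGroupingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("grouping-sketch"))

    // 1) Spark Core's built-in grouping, as used by Structured Streaming:
    //    jobs submitted on this thread are tagged with the group id,
    //    which the web UI displays and cancelJobGroup can target.
    sc.setJobGroup("demo-group", "jobs grouped via setJobGroup",
      interruptOnCancel = true)
    sc.parallelize(1 to 10).count()
    sc.clearJobGroup()

    // 2) The custom local property that SQLExecution.withNewExecutionId
    //    sets (simplified): local properties propagate to the tasks of
    //    jobs submitted on this thread, so listeners can tie them back
    //    to one SQL execution.
    sc.setLocalProperty("spark.sql.execution.id", "42") // illustrative id
    try {
      sc.parallelize(1 to 10).count()
    } finally {
      sc.setLocalProperty("spark.sql.execution.id", null) // clear it
    }

    sc.stop()
  }
}
```

Both are thread-local: setJobGroup is itself implemented on top of local properties, which is part of why the overlap in Spark SQL looks redundant at first glance.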
[1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L63
[2] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L265

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org