spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] moomindani commented on a change in pull request #28953: [SPARK-32013][SQL] Support query execution before reading DataFrame and before/after writing DataFrame over JDBC
Date Thu, 09 Jul 2020 11:26:20 GMT

moomindani commented on a change in pull request #28953:
URL: https://github.com/apache/spark/pull/28953#discussion_r452148449



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala
##########
@@ -46,7 +46,18 @@ class JdbcRelationProvider extends CreatableRelationProvider
     val isCaseSensitive = sqlContext.conf.caseSensitiveAnalysis
 
     val conn = JdbcUtils.createConnectionFactory(options)()
+
+    var parametersWithoutPreActions: Map[String, String] = parameters
     try {
+      options.preActions match {
+        case Some(i) =>
+          runQuery(conn, i, options)
+
+          // Remove preActions to avoid duplicate execution when writing data
+          parametersWithoutPreActions = parameters.-(JDBCOptions.JDBC_PRE_ACTIONS_STRING)

Review comment:
       
   
   As noted in the code comment, this prevents duplicate execution of `preActions` in the write path.
   There are two methods here:
   a) `createRelation(SQLContext, SaveMode, Map[String, String], DataFrame)`
   b) `createRelation(SQLContext, Map[String, String])`
   
   (a) is called only for writes, but (b) is called for both reads and writes.
   To avoid running `preActions` twice on a write, we need to remove the `preActions` parameter here before passing the parameters to (b).
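
   The reasoning above can be illustrated with a small standalone model. This is only a sketch, not the code in this PR: `PreActionsSketch`, `createRelationForRead`, `createRelationForWrite`, the `executed` buffer, and the literal key `"preActions"` are all hypothetical stand-ins (the real key is `JDBCOptions.JDBC_PRE_ACTIONS_STRING`, whose value is not shown in this thread), and no JDBC connection is opened.

```scala
import scala.collection.mutable.ArrayBuffer

object PreActionsSketch {
  // Hypothetical stand-in for JDBCOptions.JDBC_PRE_ACTIONS_STRING.
  val PreActionsKey: String = "preActions"

  // Records each statement that would have been run, standing in for
  // runQuery(conn, sql, options) so the sketch needs no database.
  val executed: ArrayBuffer[String] = ArrayBuffer.empty[String]

  private def runPreActions(parameters: Map[String, String]): Unit =
    parameters.get(PreActionsKey).foreach(sql => executed += sql)

  // Models (b): reached by reads, and also by the write path via (a).
  def createRelationForRead(parameters: Map[String, String]): Unit =
    runPreActions(parameters)

  // Models (a): reached only by writes.
  def createRelationForWrite(parameters: Map[String, String]): Unit = {
    runPreActions(parameters)
    // Drop the key before delegating, so (b) does not run the statement again.
    val parametersWithoutPreActions = parameters - PreActionsKey
    createRelationForRead(parametersWithoutPreActions)
  }
}
```

   In this model, calling `createRelationForWrite` with a `preActions` entry records the statement once; without the `parameters - PreActionsKey` step it would be recorded twice.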
   
