spark-issues mailing list archives

From "Dongjoon Hyun (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (SPARK-16410) DataFrameWriter's jdbc method drops table in overwrite mode
Date Sat, 09 Jul 2016 19:28:11 GMT

     [ https://issues.apache.org/jira/browse/SPARK-16410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-16410:
----------------------------------
    Comment: was deleted

(was: If you insist, ...
But, please consider one of your users, me. :-))

> DataFrameWriter's jdbc method drops table in overwrite mode
> -----------------------------------------------------------
>
>                 Key: SPARK-16410
>                 URL: https://issues.apache.org/jira/browse/SPARK-16410
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1, 1.6.2
>            Reporter: Ian Hellstrom
>
> According to the [API documentation|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter], the write mode {{overwrite}} should _overwrite the existing data_, which suggests that the data is removed, i.e. the table is truncated.
> However, that is not what happens in the [source code|https://github.com/apache/spark/blob/0ad6ce7e54b1d8f5946dde652fa5341d15059158/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L421]:
> {code}
> if (mode == SaveMode.Overwrite && tableExists) {
>   JdbcUtils.dropTable(conn, table)
>   tableExists = false
> }
> {code}
> This clearly shows that the table is first dropped and then recreated. This causes two major issues:
> * Existing indexes, partitioning schemes, etc. are completely lost.
> * The case of identifiers may be changed without the user understanding why.
> In my opinion, the table should be truncated, not dropped. Overwriting data is a DML operation and should not cause DDL.
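
The truncate-instead-of-drop behavior proposed in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not Spark's implementation: the names Mode, OverwritePlan, prepare, and createSql are hypothetical, and the sketch only plans the SQL statements to run before the insert rather than executing them over JDBC. It also assumes the target database supports TRUNCATE TABLE, which not every database does.

```scala
// Hypothetical sketch: decide which statements to run before inserting rows.
// None of these names come from Spark; they exist only for illustration.
sealed trait Mode
case object Overwrite extends Mode
case object Append extends Mode

object OverwritePlan {
  // `createSql` stands in for the CREATE TABLE statement the writer would
  // issue when the table does not exist yet.
  def prepare(
      mode: Mode,
      table: String,
      tableExists: Boolean,
      createSql: String): Seq[String] =
    (mode, tableExists) match {
      // Overwrite on an existing table: remove the rows (DML-like intent)
      // and keep the table definition, indexes, and identifier case.
      case (Overwrite, true) => Seq(s"TRUNCATE TABLE $table")
      // Any other mode on an existing table: nothing to prepare.
      case (_, true) => Seq.empty
      // Table is missing: it has to be created regardless of mode.
      case (_, false) => Seq(createSql)
    }
}
```

With this shape, an overwrite only issues DDL when the table does not exist yet. A real writer would also need a fallback (for example DELETE FROM) for databases without TRUNCATE TABLE.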



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org