spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jia Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20885) JDBC predicate pushdown uses hardcoded date format
Date Thu, 25 May 2017 23:12:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025533#comment-16025533
] 

Jia Li commented on SPARK-20885:
--------------------------------

I can help to take a look at this. 

> JDBC predicate pushdown uses hardcoded date format
> --------------------------------------------------
>
>                 Key: SPARK-20885
>                 URL: https://issues.apache.org/jira/browse/SPARK-20885
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Peter Halverson
>            Priority: Minor
>
> If a date literal is used in a pushed-down filter expression, e.g.
> {code}
> val postingDate = java.sql.Date.valueOf("2016-06-03")
> val count = jdbcDF.filter($"POSTINGDATE" === postingDate).count
> {code}
> where the {{POSTINGDATE}} column is of JDBC type Date, the resulting pushed-down SQL
query looks like the following:
> {code}
> SELECT .. <columns> ... FROM <table> WHERE POSTINGDATE = '2016-06-03'
> {code}
> Specifically, the date is compiled into a string literal using the hardcoded yyyy-MM-dd
format that {{java.sql.Date.toString}} emits. Note the implied string conversion for date
(and timestamp) values in {{JDBCRDD.compileValue}}
> {code}
>   /**
>    * Converts value to SQL expression.
>    */
>   private def compileValue(value: Any): Any = value match {
>     case stringValue: String => s"'${escapeSql(stringValue)}'"
>     case timestampValue: Timestamp => "'" + timestampValue + "'"
>     case dateValue: Date => "'" + dateValue + "'"
>     case arrayValue: Array[Any] => arrayValue.map(compileValue).mkString(", ")
>     case _ => value
>   }
> {code}
> The resulting query fails if the database is expecting a different format for date string
literals. For example, the default format for Oracle is 'dd-MMM-yy', so when the relation
query is executed, it fails with a syntax error.
> {code}
> ORA-01861: literal does not match format string
> 01861. 00000 -  "literal does not match format string"
> {code}
> In some situations it may be possible to change the database's expected date format to
match the Java format, but this is not always possible (e.g. reading from an external database
server)
> Shouldn't this kind of conversion be going through some kind of vendor specific translation
(e.g. through a {{JDBCDialect}})?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message