Date: Sat, 4 Nov 2017 16:51:00 +0000 (UTC)
From: "Hongbo (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-22443) AggregatedDialect doesn't override quoteIdentifier and other methods in JdbcDialects

    [ https://issues.apache.org/jira/browse/SPARK-22443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239097#comment-16239097 ]

Hongbo commented on SPARK-22443:
--------------------------------

[~srowen] Thanks for the quick response!

I think returning the first dialect is an acceptable solution, but I was wondering whether we could do better. Suppose the first dialect doesn't override, e.g., the quoteIdentifier method, but the second dialect does. Naturally, using the implementation in the second dialect would be better. But in the current implementation, the default implementation in the base JdbcDialect class is used.

Maybe we could derive new dialects from another base class whose string methods return null (I hate null, but wrapping the result in Option would change the external API)? AggregatedDialect could then return the first non-null result, and if all the dialects return null, fall back to the default implementation in NoopDialect (the trivial concrete object derived from JdbcDialect).

Just my two cents.

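To make the "first non-null result" idea concrete, here is a minimal sketch. It deliberately does not use Spark's real classes: NullableDialect, NoopLikeDialect, TypesOnlyDialect, BacktickDialect and FirstNonNullAggregate are all hypothetical stand-ins, and only quoteIdentifier is modelled. The double-quote fallback is assumed to mirror the default in the base JdbcDialect class.

```scala
object FirstNonNullSketch {

  // Hypothetical base class: a string method returns null to signal
  // "this dialect has no opinion", instead of supplying a default.
  abstract class NullableDialect {
    def canHandle(url: String): Boolean
    def quoteIdentifier(colName: String): String = null
  }

  // Stand-in for NoopDialect: carries the default behaviour
  // (assumed here to be double-quote quoting).
  object NoopLikeDialect extends NullableDialect {
    override def canHandle(url: String): Boolean = true
    override def quoteIdentifier(colName: String): String = "\"" + colName + "\""
  }

  // A dialect that handles MySQL URLs but does not override quoteIdentifier.
  object TypesOnlyDialect extends NullableDialect {
    override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")
  }

  // A dialect that overrides quoteIdentifier with MySQL-style backticks.
  object BacktickDialect extends NullableDialect {
    override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")
    override def quoteIdentifier(colName: String): String = s"`$colName`"
  }

  // Hypothetical aggregate: asks each dialect in registration order and
  // returns the first non-null answer, falling back to the no-op default.
  class FirstNonNullAggregate(dialects: List[NullableDialect]) extends NullableDialect {
    require(dialects.nonEmpty, "at least one dialect is required")

    override def canHandle(url: String): Boolean = dialects.forall(_.canHandle(url))

    override def quoteIdentifier(colName: String): String =
      dialects.iterator
        .map(_.quoteIdentifier(colName))
        .find(_ != null)
        .getOrElse(NoopLikeDialect.quoteIdentifier(colName))
  }
}
```

With List(TypesOnlyDialect, BacktickDialect), the aggregate skips the first dialect's null and uses the backtick quoting from the second; with only TypesOnlyDialect registered, it falls back to the double-quote default.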
> AggregatedDialect doesn't override quoteIdentifier and other methods in JdbcDialects
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-22443
>                 URL: https://issues.apache.org/jira/browse/SPARK-22443
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hongbo
>            Priority: Normal
>
> The AggregatedDialect only implements canHandle, getCatalystType, and getJDBCType. It doesn't implement the other methods in JdbcDialect.
> So if multiple dialects are registered for the same driver, their implementations of these other methods are ignored and the default implementation in JdbcDialect is used.
> Example:
> {code:java}
> package example
> import java.util.Properties
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
> import org.apache.spark.sql.types.{DataType, MetadataBuilder}
> // A second dialect for MySQL URLs, registered alongside Spark's built-in MySQLDialect.
> object AnotherMySQLDialect extends JdbcDialect {
>   override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")
>   // Returning None defers type mapping to the other registered dialects.
>   override def getCatalystType(
>       sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
>     None
>   }
>   // MySQL quotes identifiers with backticks.
>   override def quoteIdentifier(colName: String): String = {
>     s"`$colName`"
>   }
> }
> object App {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder.master("local").appName("Simple Application").getOrCreate()
>     JdbcDialects.registerDialect(AnotherMySQLDialect)
>     val jdbcUrl = "jdbc:mysql://host:port/db?user=user&password=password"
>     spark.read.jdbc(jdbcUrl, "badge", new Properties()).show()
>   }
> }
> {code}
> will throw an exception.
> {code:none}
> 17/11/03 17:08:39 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: Cannot determine value type from string 'id'
>     at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:530)
>     at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:513)
>     at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:505)
>     at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:479)
>     at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:489)
>     at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:89)
>     at com.mysql.cj.jdbc.result.ResultSetImpl.getLong(ResultSetImpl.java:853)
>     at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$8.apply(JdbcUtils.scala:409)
>     at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$8.apply(JdbcUtils.scala:408)
>     at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:330)
>     at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:312)
>     at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>     at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>     at org.apache.spark.scheduler.Task.run(Task.scala:108)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: com.mysql.cj.core.exceptions.DataConversionException: Cannot determine value type from string 'id'
>     at com.mysql.cj.core.io.StringConverter.createFromBytes(StringConverter.java:121)
>     at com.mysql.cj.core.io.MysqlTextValueDecoder.decodeByteArray(MysqlTextValueDecoder.java:232)
>     at com.mysql.cj.mysqla.result.AbstractResultsetRow.decodeAndCreateReturnValue(AbstractResultsetRow.java:124)
>     at com.mysql.cj.mysqla.result.AbstractResultsetRow.getValueFromBytes(AbstractResultsetRow.java:225)
>     at com.mysql.cj.mysqla.result.ByteArrayRow.getValue(ByteArrayRow.java:84)
>     at com.mysql.cj.jdbc.result.ResultSetImpl.getNonStringValueFromRow(ResultSetImpl.java:630)
>     ... 24 more
> {code}
> This happens even though quoteIdentifier is correctly implemented in both Spark's MySQLDialect and our AnotherMySQLDialect.