spark-issues mailing list archives

From "Xiao Li (JIRA)" <>
Subject [jira] [Resolved] (SPARK-10849) Allow user to specify database column type for data frame fields when writing data to jdbc data sources.
Date Fri, 24 Mar 2017 00:40:41 GMT


Xiao Li resolved SPARK-10849.
       Resolution: Fixed
         Assignee: Suresh Thalamati
    Fix Version/s: 2.2.0

> Allow user to specify database column type for data frame fields when writing data to jdbc data sources.
> ---------------------------------------------------------------------------------------------------------
>                 Key: SPARK-10849
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Suresh Thalamati
>            Assignee: Suresh Thalamati
>            Priority: Minor
>             Fix For: 2.2.0
> Mapping data frame field types to database column types is addressed to a large extent by
> adding dialects, and by the maxlength option added in SPARK-10101 to set the VARCHAR length.
>
> In some cases it is hard to determine the maximum supported VARCHAR size. For example, on DB2 z/OS
> the VARCHAR size depends on the page size, and some databases also have row size limits for VARCHAR.
> Specifying CLOB as the default for all String columns will likely make reads and writes slow.
> Allowing users to specify the database type corresponding to a data frame field will be
> useful in cases where users want to fine-tune the mapping for one or two fields and are fine
> with the defaults for all other fields.
> I propose to make the following two properties available for users to set in the data
> frame metadata when writing to JDBC data sources:
> database.column.type  --  column type to use for create table.
> jdbc.column.type      --  JDBC type to use for setting null values.
> Example:
>   import org.apache.spark.sql.functions.col
>   import org.apache.spark.sql.types.MetadataBuilder
>
>   val secdf = sc.parallelize(Array(("Apple", "Revenue ..."), ("Google", "Income:123213"))).toDF("name", "report")
>   val metadataBuilder = new MetadataBuilder()
>   metadataBuilder.putString("database.column.type", "CLOB(100K)")
>   metadataBuilder.putLong("jdbc.type", java.sql.Types.CLOB)
>   val metadata = metadataBuilder.build()
>   val secReportDF = secdf.withColumn("report", col("report").as("report", metadata))
>   secReportDF.write.jdbc("jdbc:mysql://<URL>/secdata", "reports", mysqlProps)
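
The override-with-default behavior proposed above can be sketched in plain Scala, without Spark. This is only an illustration under stated assumptions, not the change that was merged: `Field`, `ColumnTypeSketch`, `defaultTypes`, `columnType` and `schemaString` are hypothetical names, and the default type mapping is made up to stand in for whatever a JDBC dialect would supply. Only the metadata key `database.column.type` comes from the issue text.

```scala
// Hypothetical stand-in for a data frame field; this is not Spark's StructField.
final case class Field(
    name: String,
    sparkType: String,
    metadata: Map[String, String] = Map.empty)

object ColumnTypeSketch {
  // Assumed defaults, standing in for the mapping a JDBC dialect would provide.
  val defaultTypes: Map[String, String] = Map(
    "string"  -> "TEXT",
    "integer" -> "INTEGER",
    "double"  -> "DOUBLE PRECISION"
  )

  // Use the per-field override when the metadata key is set, otherwise the default.
  // An unknown sparkType with no override throws NoSuchElementException.
  def columnType(field: Field): String =
    field.metadata.getOrElse("database.column.type", defaultTypes(field.sparkType))

  // Build the column list that would go inside a CREATE TABLE statement.
  def schemaString(fields: Seq[Field]): String =
    fields.map(f => s"${f.name} ${columnType(f)}").mkString(", ")
}
```

With a `name` field that has no metadata and a `report` field whose metadata sets `database.column.type` to `CLOB(100K)`, `schemaString` yields `name TEXT, report CLOB(100K)`: only the tuned field departs from the default, which is the use case the issue describes.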
