hudi-commits mailing list archives

From "cdmikechen (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HUDI-91) Replace Databricks spark-avro with native spark-avro #628
Date Fri, 10 Jan 2020 01:37:00 GMT

    [ https://issues.apache.org/jira/browse/HUDI-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012371#comment-17012371 ]

cdmikechen commented on HUDI-91:
--------------------------------

[~vinoth]

Yes, I understand your point. Hudi works fine with *databricks spark-avro* right now.

My point is that *native spark-avro* is only available starting from Spark 2.4. If we eventually
replace databricks spark-avro with native spark-avro, users on Spark 2.3, 2.2, or 2.1 will fail
to build Hudi.

As [https://github.com/apache/spark/tree/branch-2.3/external] shows, branch-2.3 has no spark-avro
project, whereas [https://github.com/apache/spark/tree/branch-2.4/external] does.
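
The version constraint can be sketched as a small helper that picks an Avro data source name from the Spark version string. This is only an illustration, not Hudi code: the class {{AvroFormatSelector}} and its method names are hypothetical. The two format strings are the data source names commonly used with each package: {{com.databricks.spark.avro}} for the Databricks package and {{avro}} for the module shipped with Spark 2.4 and later.

```java
// Hypothetical helper, not part of Hudi: chooses an Avro data source name
// from a Spark version string such as "2.3.4" or "2.4.0".
final class AvroFormatSelector {
    // Data source name used with the external Databricks spark-avro package.
    static final String DATABRICKS_AVRO = "com.databricks.spark.avro";
    // Short data source name of the spark-avro module shipped with Spark 2.4+.
    static final String NATIVE_AVRO = "avro";

    private AvroFormatSelector() {}

    /** Returns true when the given Spark version is 2.4 or newer. */
    static boolean hasNativeAvro(String sparkVersion) {
        String[] parts = sparkVersion.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected Spark version: " + sparkVersion);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major > 2 || (major == 2 && minor >= 4);
    }

    /** Picks the native source when available, else falls back to Databricks. */
    static String formatFor(String sparkVersion) {
        return hasNativeAvro(sparkVersion) ? NATIVE_AVRO : DATABRICKS_AVRO;
    }
}
```

A runtime switch like this only helps if both implementations remain on the classpath; it does not by itself fix a build against Spark 2.3 or earlier, where the native module is absent.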

So I suggest that we either combine the two projects (databricks-avro and spark-avro) and simplify
them into our own internal implementation, or package spark-avro and avro 1.8.2 inside hudi under
relocated packages such as *org.apache.hudi.org.apache.spark.sql.avro* and *org.apache.hudi.org.apache.avro*.
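
The second option amounts to relocating the bundled classes under a Hudi-owned prefix, which is the kind of renaming a shading tool performs at build time. The naming rule can be sketched as follows; the class {{PackageRelocator}} and its method are hypothetical and only show how the package names above would be derived, not how Hudi's build is configured.

```java
// Hypothetical illustration of the relocation naming rule; a real build
// would do this with a shading tool rather than hand-written code.
final class PackageRelocator {
    // Prefix under which the bundled packages would live.
    static final String SHADE_PREFIX = "org.apache.hudi.";
    // Packages proposed for bundling in the comment above.
    static final String[] RELOCATED = {
        "org.apache.spark.sql.avro",
        "org.apache.avro"
    };

    private PackageRelocator() {}

    /** Returns the relocated name, or the input unchanged if it is not bundled. */
    static String relocate(String className) {
        for (String pkg : RELOCATED) {
            // Match the package itself or anything nested below it.
            if (className.equals(pkg) || className.startsWith(pkg + ".")) {
                return SHADE_PREFIX + className;
            }
        }
        return className;
    }
}
```

Under this rule, a class in {{org.apache.avro}} would resolve inside Hudi's own namespace, so it could not clash with a different Avro version supplied by the user's Spark distribution.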

 

> Replace Databricks spark-avro with native spark-avro #628
> ---------------------------------------------------------
>
>                 Key: HUDI-91
>                 URL: https://issues.apache.org/jira/browse/HUDI-91
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Spark Integration, Usability
>            Reporter: Vinoth Chandar
>            Assignee: Udit Mehrotra
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/incubator-hudi/issues/628] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
