hudi-commits mailing list archives

From "vinoyang (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HUDI-538) Restructuring hudi client module for multi engine support
Date Tue, 21 Jan 2020 02:53:00 GMT

    [ https://issues.apache.org/jira/browse/HUDI-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019821#comment-17019821
] 

vinoyang commented on HUDI-538:
-------------------------------

bq. A good analogy is to think of the first integration with Flink as similar to hudi-spark.
You can write Spark programs consuming any Spark datasource today and write out Hudi datasets,
without using DeltaStreamer, right?

Right. OK, let's discuss {{DeltaStreamer}} after we have the {{hudi-flink}} module.

> Restructuring hudi client module for multi engine support
> ---------------------------------------------------------
>
>                 Key: HUDI-538
>                 URL: https://issues.apache.org/jira/browse/HUDI-538
>             Project: Apache Hudi (incubating)
>          Issue Type: Wish
>          Components: Code Cleanup
>            Reporter: vinoyang
>            Priority: Major
>
> Hudi is currently tightly coupled to the Spark framework, which makes integration
with other computing engines more difficult. We plan to decouple it from Spark. This umbrella
issue is used to track this work.
> Some thoughts are written up here: https://docs.google.com/document/d/1Q9w_4K6xzGbUrtTS0gAlzNYOmRXjzNUdbbe0q59PX9w/edit?usp=sharing
> The feature branch is {{restructure-hudi-client}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
