hudi-commits mailing list archives

From "Raymond Xu (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HUDI-561) hudi partition path config
Date Mon, 24 Feb 2020 08:07:00 GMT

    [ https://issues.apache.org/jira/browse/HUDI-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043207#comment-17043207
] 

Raymond Xu commented on HUDI-561:
---------------------------------

[~liujinhui] I came across this ticket from a discussion about transformers, where I wanted to
address a similar issue via a custom transformer. After seeing the key generator classes, I
think a custom key generator is more suitable. In the case you described, simply extend
[https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java]
and transform the partition path accordingly in getKey().
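
As a rough illustration of the transformation such a key generator would apply, here is a
minimal, Hudi-independent sketch. The class name PartitionPathSketch and the method
toPartitionPath are hypothetical and not part of Hudi; the two patterns are the ones from the
issue description. A SimpleKeyGenerator subclass could presumably call a helper like this on the
partition field value inside getKey(), but that wiring is an assumption and is not shown.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hudi: converts a timestamp string into a
// slash-separated date that can serve as a partition path.
final class PartitionPathSketch {

    // Source and target patterns taken from the issue description.
    private static final DateTimeFormatter SOURCE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter TARGET =
            DateTimeFormatter.ofPattern("yyyy/MM/dd");

    // Parses the raw field value and re-formats it as a partition path.
    // Throws DateTimeParseException if the value does not match SOURCE.
    static String toPartitionPath(String rawValue) {
        return LocalDateTime.parse(rawValue, SOURCE).format(TARGET);
    }

    public static void main(String[] args) {
        System.out.println(toPartitionPath("2019-12-20 08:07:00"));
    }
}
```

Keeping the patterns in code rather than in configuration matches the suggestion above: each
user encodes their own format in their own key generator.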

To me, exposing date-time formats as config entries looks like going too far in helping
out users.

 

> hudi partition path config
> --------------------------
>
>                 Key: HUDI-561
>                 URL: https://issues.apache.org/jira/browse/HUDI-561
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: liujinhui
>            Assignee: liujinhui
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.6.0
>
>   Original Estimate: 24h
>          Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> The current Hudi partition path is determined by hoodie.datasource.write.partitionpath.field = keyname
> Example:
> keyname 2019/12/20
> The time format is often yyyy-MM-dd HH:mm:ss or something else,
> and a value in yyyy-MM-dd HH:mm:ss format cannot be partitioned correctly.
> So I want to add the following configuration:
> hoodie.datasource.write.partitionpath.source.format = yyyy-MM-dd HH:mm:ss
> hoodie.datasource.write.partitionpath.target.format = yyyy/MM/dd



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
