hudi-commits mailing list archives

From "Pratyaksh Sharma (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HUDI-727) Copy default values of fields if not present when rewriting incoming record with new schema
Date Fri, 20 Mar 2020 10:00:01 GMT

     [ https://issues.apache.org/jira/browse/HUDI-727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pratyaksh Sharma updated HUDI-727:
----------------------------------
    Status: In Progress  (was: Open)

> Copy default values of fields if not present when rewriting incoming record with new schema
> -------------------------------------------------------------------------------------------
>
>                 Key: HUDI-727
>                 URL: https://issues.apache.org/jira/browse/HUDI-727
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Utilities
>            Reporter: Pratyaksh Sharma
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>             Fix For: 0.6.0
>
>
> Currently we recommend that users evolve their schema in a backwards-compatible way. One of the most important steps in doing so is to define a default value for each newly added column, so that records published with the previous schema can still be consumed properly.
>  
> However, just before actually writing a record to a Hudi dataset, we rewrite the record with a new Avro schema that includes the Hudi metadata columns [1]. This function only reads the values from the record and does not consider a field's default value. As a result, schema validation fails.
> IMO, this piece of code should also take the default value into account when a field's actual value is null.
>  
> [1] [https://github.com/apache/incubator-hudi/blob/078d4825d909b2c469398f31c97d2290687321a8/hudi-common/src/main/java/org/apache/hudi/common/util/HoodieAvroUtils.java#L205].
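The behaviour proposed above can be sketched roughly as follows. This is a simplified model that uses only the Java standard library, not the actual HoodieAvroUtils code or the Avro API: the class DefaultAwareRewrite, its nested Field type, and the rewrite method are hypothetical names introduced for illustration, and a record is modelled as a plain Map. The idea is that when the incoming record has no value for a field of the new schema, the field's declared default is copied instead of leaving the value null.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of the rewrite step; not Hudi or Avro code. */
final class DefaultAwareRewrite {

    /** Stand-in for a schema field: a name plus an optional default value. */
    static final class Field {
        final String name;
        final boolean hasDefault;
        final Object defaultValue;

        private Field(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }

        static Field required(String name) {
            return new Field(name, false, null);
        }

        static Field withDefault(String name, Object defaultValue) {
            return new Field(name, true, defaultValue);
        }
    }

    /** Copies values field by field, falling back to the default when the value is null. */
    static Map<String, Object> rewrite(Map<String, Object> oldRecord, List<Field> newSchema) {
        Map<String, Object> rewritten = new LinkedHashMap<>();
        for (Field field : newSchema) {
            Object value = oldRecord.get(field.name);
            if (value == null && field.hasDefault) {
                // The step the issue asks for: use the declared default.
                value = field.defaultValue;
            }
            if (value == null && !field.hasDefault) {
                // Models the validation failure for a field with no value and no default.
                throw new IllegalStateException("No value or default for field: " + field.name);
            }
            rewritten.put(field.name, value);
        }
        return rewritten;
    }

    public static void main(String[] args) {
        // Record published with the previous schema: it has no "country" column.
        Map<String, Object> oldRecord = new HashMap<>();
        oldRecord.put("id", 42);
        oldRecord.put("name", "alice");

        // New schema adds "country" with a default value.
        List<Field> newSchema = Arrays.asList(
                Field.required("id"),
                Field.required("name"),
                Field.withDefault("country", "unknown"));

        System.out.println(rewrite(oldRecord, newSchema));
        // prints {id=42, name=alice, country=unknown}
    }
}
```

In the real code path the schema and record would be Avro objects rather than maps, so details such as how a null default is represented would need separate handling.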



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
