hudi-commits mailing list archives

From "Pratyaksh Sharma (Jira)" <j...@apache.org>
Subject [jira] [Created] (HUDI-727) Copy default values of fields if not present when rewriting incoming record with new schema
Date Fri, 20 Mar 2020 08:59:00 GMT
Pratyaksh Sharma created HUDI-727:
-------------------------------------

             Summary: Copy default values of fields if not present when rewriting incoming record with new schema
                 Key: HUDI-727
                 URL: https://issues.apache.org/jira/browse/HUDI-727
             Project: Apache Hudi (incubating)
          Issue Type: Improvement
          Components: Utilities
            Reporter: Pratyaksh Sharma
            Assignee: Pratyaksh Sharma
             Fix For: 0.6.0


Currently we recommend that users evolve their schema in a backwards-compatible way. When doing
so, one of the most important steps is to define a default value for each newly added column,
so that records published with the previous schema can still be consumed properly.
 
However, just before actually writing a record to a Hudi dataset, we rewrite the record with
a new Avro schema that includes the Hudi metadata columns [1]. This function only reads the
values from the record and does not consider a field's default value. As a result, schema
validation fails.
IMO, this piece of code should also take the default value into account when a field's actual
value is null.
 
[1] https://github.com/apache/incubator-hudi/blob/078d4825d909b2c469398f31c97d2290687321a8/hudi-common/src/main/java/org/apache/hudi/common/util/HoodieAvroUtils.java#L205
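
The behaviour proposed above can be sketched roughly as follows. This is a minimal illustration only, not the code in HoodieAvroUtils: plain Java maps stand in for Avro records, and the names FieldSpec, DefaultAwareRewrite and rewrite are hypothetical. It also assumes, as a simplification, that a field with neither a value nor a default is always invalid.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a schema field: a name plus an optional default value.
final class FieldSpec {
    final String name;
    final boolean hasDefault;
    final Object defaultValue;

    private FieldSpec(String name, boolean hasDefault, Object defaultValue) {
        this.name = name;
        this.hasDefault = hasDefault;
        this.defaultValue = defaultValue;
    }

    // A field that declares no default in the schema.
    static FieldSpec required(String name) {
        return new FieldSpec(name, false, null);
    }

    // A field that declares a default (the default itself may be null).
    static FieldSpec withDefault(String name, Object defaultValue) {
        return new FieldSpec(name, true, defaultValue);
    }
}

final class DefaultAwareRewrite {
    // Copies the old record into the shape of the new schema. When the old
    // record has no value for a field, the field's declared default is used.
    static Map<String, Object> rewrite(Map<String, Object> oldRecord, List<FieldSpec> newSchema) {
        Map<String, Object> rewritten = new LinkedHashMap<>();
        for (FieldSpec field : newSchema) {
            Object value = oldRecord.get(field.name);
            if (value == null) {
                if (!field.hasDefault) {
                    // Simplified stand-in for the schema validation failure
                    // described in this issue.
                    throw new IllegalArgumentException(
                        "No value and no default for field: " + field.name);
                }
                value = field.defaultValue;
            }
            rewritten.put(field.name, value);
        }
        return rewritten;
    }
}
```

For example, rewriting a record that contains only id = 1 against a schema with a required id field and a city field whose default is "unknown" yields a record with id = 1 and city = "unknown", instead of failing on the missing city value.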



