hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hudi] sbernauer edited a comment on pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution
Date Fri, 07 May 2021 05:51:00 GMT

sbernauer edited a comment on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-834085407


   Hi all,
   
   We have sadly been unable to do schema evolution for 10 months now (https://github.com/apache/hudi/issues/1845)
and have had to rely on ugly workarounds.
   Many thanks for working together to find a solution!
   We have tested this patch in our test systems and everything worked fine. When we rolled
it out to production, we noticed that memory consumption increased several-fold. This
caused our executors to spill to disk and crash. We had to roll back to a previous version.
   So I would like to highlight the comment of @sathyaprakashg:
   > @n3nash I am working on fixing the build issue and will have that fix pushed soon. I would
like to point out that with this new approach, we are storing the writer schema as part of the payload,
which means the size of the dataframe would increase to store the same schema information with each record.
Any suggestions on optimizing this?
   
   Regards,
   Sebastian


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


