hudi-dev mailing list archives

From aakash aakash <email2aak...@gmail.com>
Subject issue while reading archived commit written by 0.5 version with 0.8 version
Date Wed, 23 Jun 2021 03:20:32 GMT
Hi,

I am trying to use Hudi 0.8 with Spark 3.0 in my prod environment; earlier
we were running Hudi 0.5 with Spark 2.4.4.

While updating a very old index, I am getting the error below.

*From the logs, it seems to error out while reading this file in S3:
hudi/.hoodie/archived/.commits_.archive.119_1-0-1*

21/06/22 19:18:06 ERROR HoodieTimelineArchiveLog: Failed to archive commits, .commit file: 20200715192915.rollback.inflight
java.io.IOException: Not an Avro data file
at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:50)
at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:175)
at org.apache.hudi.client.utils.MetadataConversionUtils.createMetaWrapper(MetadataConversionUtils.java:84)
at org.apache.hudi.table.HoodieTimelineArchiveLog.convertToAvroRecord(HoodieTimelineArchiveLog.java:370)
at org.apache.hudi.table.HoodieTimelineArchiveLog.archive(HoodieTimelineArchiveLog.java:311)
at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:128)
at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:430)
at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:186)
at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:121)
at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:479)


Is this a backward compatibility issue? I have deleted a few archive files,
but the problem persists, so it does not look like a file corruption
issue.
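
One way to narrow this down might be to check which timeline files actually
carry the Avro container header. Every Avro object container file starts with
the four magic bytes `Obj\x01`, and "Not an Avro data file" is what the Avro
reader reports when that header is missing. The following is only a minimal
sketch: it assumes the `.hoodie` folder has been copied from S3 to a local
directory, and the function names and suffix list are hypothetical helpers,
not part of Hudi. Which files each Hudi version expects to be Avro is not
something this check can tell.

```python
import os

# First four bytes of every Avro object container file.
AVRO_MAGIC = b"Obj\x01"


def is_avro_data_file(path):
    """Return True if the file at `path` starts with the Avro magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC


def find_non_avro_files(directory, suffixes):
    """Return sorted names of files in `directory` (a local copy, assumed)
    that end with one of `suffixes` but do not start with the Avro magic.
    Empty files are reported too, since they have no header at all."""
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        if name.endswith(tuple(suffixes)) and not is_avro_data_file(path):
            bad.append(name)
    return bad
```

For example, `find_non_avro_files("/tmp/hoodie", [".rollback.inflight"])`
(with `/tmp/hoodie` as a placeholder path) would list the inflight rollback
files that the Avro reader would reject, including zero-byte ones.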

Regards,
Aakash
