hudi-commits mailing list archives

From GitBox <>
Subject [GitHub] [incubator-hudi] n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files
Date Fri, 14 Feb 2020 17:39:35 GMT
n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files

 File path: hudi-client/src/main/java/org/apache/hudi/io/
 @@ -268,6 +270,19 @@ public Path getArchiveFilePath() {
     return archiveFilePath;
+  private void writeHeaderBlock(Schema wrapperSchema, List<HoodieInstant> instants) throws Exception {
+    if (!instants.isEmpty()) {
+      Collections.sort(instants, HoodieInstant.COMPARATOR);
+      HoodieInstant minInstant = instants.get(0);
+      HoodieInstant maxInstant = instants.get(instants.size() - 1);
+      Map<HeaderMetadataType, String> metadataMap = Maps.newHashMap();
+      metadataMap.put(HeaderMetadataType.SCHEMA, wrapperSchema.toString());
+      metadataMap.put(HeaderMetadataType.MIN_INSTANT_TIME, minInstant.getTimestamp());
+      metadataMap.put(HeaderMetadataType.MAX_INSTANT_TIME, maxInstant.getTimestamp());
+      this.writer.appendBlock(new HoodieAvroDataBlock(Collections.emptyList(), metadataMap));
+    }
+  }
   private void writeToFile(Schema wrapperSchema, List<IndexedRecord> records) throws Exception {
 Review comment:
   You are right that the file is closed after archiving all instants that qualify in that
archiving process. But the next time archival kicks in, it checks whether the archive file
has grown to a certain size (say 1 GB); if it has not, it appends the next archival blocks to
the same file.
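
   To make the append-or-roll-over behaviour described above concrete, here is a minimal
sketch. It is not Hudi's implementation: the class ArchiveLogSketch, its methods, and the
in-memory "files" are hypothetical, and instant times are assumed to be equal-length
timestamp strings so that plain string ordering matches time ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model of a size-bounded archive log (not a Hudi class). */
class ArchiveLogSketch {

  /** One appended block together with the min/max instant times it covers. */
  static final class Block {
    final String minInstant;
    final String maxInstant;
    final long sizeBytes;

    Block(String minInstant, String maxInstant, long sizeBytes) {
      this.minInstant = minInstant;
      this.maxInstant = maxInstant;
      this.sizeBytes = sizeBytes;
    }
  }

  private final long maxFileSizeBytes;
  private final List<List<Block>> files = new ArrayList<>();
  private long currentFileSize = 0;

  ArchiveLogSketch(long maxFileSizeBytes) {
    this.maxFileSizeBytes = maxFileSizeBytes;
    files.add(new ArrayList<>()); // start with one empty archive file
  }

  /** One archival run: roll over only if the current file already reached the threshold. */
  void archive(List<String> instantTimes, long blockSizeBytes) {
    if (instantTimes.isEmpty()) {
      return; // nothing to archive, so no block is written
    }
    if (currentFileSize >= maxFileSizeBytes) {
      files.add(new ArrayList<>()); // file is large enough: start a new one
      currentFileSize = 0;
    }
    // Sort a copy so the caller's list is left untouched.
    List<String> sorted = new ArrayList<>(instantTimes);
    Collections.sort(sorted);
    Block block = new Block(sorted.get(0), sorted.get(sorted.size() - 1), blockSizeBytes);
    files.get(files.size() - 1).add(block); // append to the current file
    currentFileSize += blockSizeBytes;
  }

  int fileCount() {
    return files.size();
  }

  List<Block> blocksInFile(int index) {
    return files.get(index);
  }
}
```

   In this sketch, two archival runs against a file that is still under the threshold leave
two blocks in the same file, each with its own min/max instant range; only a later run that
finds the file at or above the threshold starts a new file.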

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
