hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-hudi] satishkotha commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files
Date Wed, 12 Feb 2020 20:08:39 GMT
satishkotha commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files
URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r378485704
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##########
 @@ -268,6 +270,19 @@ public Path getArchiveFilePath() {
     return archiveFilePath;
   }
 
+  private void writeHeaderBlock(Schema wrapperSchema, List<HoodieInstant> instants) throws Exception {
+    if (!instants.isEmpty()) {
+      Collections.sort(instants, HoodieInstant.COMPARATOR);
+      HoodieInstant minInstant = instants.get(0);
+      HoodieInstant maxInstant = instants.get(instants.size() - 1);
+      Map<HeaderMetadataType, String> metadataMap = Maps.newHashMap();
+      metadataMap.put(HeaderMetadataType.SCHEMA, wrapperSchema.toString());
+      metadataMap.put(HeaderMetadataType.MIN_INSTANT_TIME, minInstant.getTimestamp());
+      metadataMap.put(HeaderMetadataType.MAX_INSTANT_TIME, maxInstant.getTimestamp());
+      this.writer.appendBlock(new HoodieAvroDataBlock(Collections.emptyList(), metadataMap));
+    }
+  }
+
  private void writeToFile(Schema wrapperSchema, List<IndexedRecord> records) throws Exception {
 
 Review comment:
   I've explained the decision to include the header block above. Let me know what you think.
The file is closed after archiving all instants that qualify, so I don't think file growth is
an issue. Correct me if I'm reading this wrong.
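
For illustration only, here is a minimal standalone sketch of how the min/max header values could be derived without sorting the caller's list in place. It is not the code in this PR. `ArchiveHeaderSketch`, `HeaderKey`, and `buildHeader` are hypothetical names, and plain `String` timestamps stand in for `HoodieInstant`, on the assumption that instant timestamps are fixed-width strings whose lexicographic order matches their chronological order.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hudi code base.
class ArchiveHeaderSketch {

  // Stand-in for the two new header keys shown in the diff.
  enum HeaderKey { MIN_INSTANT_TIME, MAX_INSTANT_TIME }

  // Builds the min/max header entries for one archive file.
  // Assumes timestamps are fixed-width strings, so String ordering
  // matches time ordering. The input list is not modified.
  static Map<HeaderKey, String> buildHeader(List<String> instantTimes) {
    Map<HeaderKey, String> header = new EnumMap<>(HeaderKey.class);
    if (instantTimes.isEmpty()) {
      // Mirrors the diff: no instants means no header entries.
      return header;
    }
    header.put(HeaderKey.MIN_INSTANT_TIME, Collections.min(instantTimes));
    header.put(HeaderKey.MAX_INSTANT_TIME, Collections.max(instantTimes));
    return header;
  }

  public static void main(String[] args) {
    // Example timestamps, deliberately out of order.
    List<String> times =
        Arrays.asList("20200212200839", "20200210101500", "20200211093000");
    System.out.println(buildHeader(times));
  }
}
```

`Collections.min` and `Collections.max` each make one pass over the list, so the caller's ordering is left untouched; whether that matters here depends on how `instants` is used after `writeHeaderBlock` returns.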

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
