hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-hudi] satishkotha commented on a change in pull request #1274: [HUDI-571] Add 'commits show archived' command to CLI
Date Wed, 29 Jan 2020 01:43:44 GMT
satishkotha commented on a change in pull request #1274: [HUDI-571] Add 'commits show archived'
command to CLI
URL: https://github.com/apache/incubator-hudi/pull/1274#discussion_r372151885
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
 ##########
 @@ -289,11 +288,8 @@ public ConsistencyGuardConfig getConsistencyGuardConfig() {
    *
    * @return Active commit timeline
    */
-  public synchronized HoodieArchivedTimeline getArchivedTimeline() {
-    if (archivedTimeline == null) {
-      archivedTimeline = new HoodieArchivedTimeline(this);
-    }
-    return archivedTimeline;
+  public synchronized HoodieArchivedTimeline getArchivedTimeline(String startTs, String endTs) {
 
 Review comment:
   After talking in person, I changed the behavior to create a single instance of "HoodieArchivedTimeline"
and load metadata for all commits.
   
   Note that this means we read all archived files first, then do a second pass to load the details
of commits in the requested time range. This increases the overall time taken by the first command. On
the example dataset, it took ~20 minutes for the initial metadata to load.
   Subsequent commands then take a few seconds each.
   
   With the previous approach we make only one pass over the files, and every command takes a few seconds.

   
   So I think we need to improve the metadata to reduce the time taken by the first step with the new approach.
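
   To make the trade-off concrete, here is a minimal sketch of the two-pass shape described above.
It is not the code in this PR: the class "ArchivedTimelineSketch", the in-memory map standing in for
the archive files, and the counters are all hypothetical, and it assumes instant timestamps are
fixed-width strings that sort lexicographically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for an archived timeline; not Hudi's actual class. */
public class ArchivedTimelineSketch {
  // Simulated archive storage: instant timestamp -> commit details payload.
  private final TreeMap<String, String> archiveFiles;
  // Instant timestamps for ALL archived commits, loaded once (the expensive first pass).
  private List<String> instants;
  // Details cached per instant as they are requested (the second pass).
  private final Map<String, String> details = new TreeMap<>();
  private int fullScans = 0;
  private int detailReads = 0;

  public ArchivedTimelineSketch(Map<String, String> archiveFiles) {
    this.archiveFiles = new TreeMap<>(archiveFiles);
  }

  // First pass: scan every archived entry once to collect instant metadata.
  private void loadInstantsIfNeeded() {
    if (instants == null) {
      instants = new ArrayList<>(archiveFiles.keySet());
      fullScans++;
    }
  }

  // Second pass: read details only for instants in [startTs, endTs], both inclusive.
  public synchronized Map<String, String> loadDetailsInRange(String startTs, String endTs) {
    loadInstantsIfNeeded();
    Map<String, String> result = new TreeMap<>();
    for (String ts : instants) {
      if (ts.compareTo(startTs) >= 0 && ts.compareTo(endTs) <= 0) {
        String payload = details.get(ts);
        if (payload == null) {
          payload = archiveFiles.get(ts);
          details.put(ts, payload);
          detailReads++;
        }
        result.put(ts, payload);
      }
    }
    return result;
  }

  public synchronized int getFullScans() {
    return fullScans;
  }

  public synchronized int getDetailReads() {
    return detailReads;
  }
}
```

   In this sketch the full scan happens once per instance, so only the first command pays for it;
later range queries reuse the instant list and read details only for the instants they have not seen yet.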

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
