hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
Date Wed, 21 Jul 2021 01:57:32 GMT

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r673605286



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
##########
@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option<String> instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
       Nice idea. Changed. PTAL :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



