druid-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] jihoonson commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges
Date Fri, 21 Sep 2018 23:55:50 GMT
jihoonson commented on a change in pull request #5913: Move Caching Cluster Client to java
streams and allow parallel intermediate merges
URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r219652313
 
 

 ##########
 File path: server/src/main/java/io/druid/client/CachingClusteredClient.java
 ##########
 @@ -162,34 +184,75 @@ public CachingClusteredClient(
     return new SpecificQueryRunnable<>(queryPlus, responseContext).run(timelineConverter);
   }
 
-  @Override
-  public <T> QueryRunner<T> getQueryRunnerForSegments(final Query<T> query, final Iterable<SegmentDescriptor> specs)
+  private <T> QueryRunner<T> runAndMergeWithTimelineChange(
+      final Query<T> query,
+      final UnaryOperator<TimelineLookup<String, ServerSelector>> timelineConverter
+  )
   {
-    return new QueryRunner<T>()
-    {
-      @Override
-      public Sequence<T> run(final QueryPlus<T> queryPlus, final Map<String, Object> responseContext)
-      {
-        return CachingClusteredClient.this.run(
+    final OptionalLong mergeBatch = QueryContexts.getIntermediateMergeBatchThreshold(query);
+
+    if (mergeBatch.isPresent()) {
+      final QueryRunnerFactory<T, Query<T>> queryRunnerFactory = conglomerate.findFactory(query);
+      final QueryToolChest<T, Query<T>> toolChest = queryRunnerFactory.getToolchest();
+      return (queryPlus, responseContext) -> {
+        final Stream<? extends Sequence<T>> sequences = run(
             queryPlus,
             responseContext,
-            timeline -> {
-              final VersionedIntervalTimeline<String, ServerSelector> timeline2 =
-                  new VersionedIntervalTimeline<>(Ordering.natural());
-              for (SegmentDescriptor spec : specs) {
-                final PartitionHolder<ServerSelector> entry = timeline.findEntry(spec.getInterval(), spec.getVersion());
-                if (entry != null) {
-                  final PartitionChunk<ServerSelector> chunk = entry.getChunk(spec.getPartitionNumber());
-                  if (chunk != null) {
-                    timeline2.add(spec.getInterval(), spec.getVersion(), chunk);
-                  }
-                }
-              }
-              return timeline2;
-            }
+            timelineConverter
+        );
+        return MergeWorkTask.parallelMerge(
+            sequences.parallel(),
+            sequenceStream ->
+                new FluentQueryRunnerBuilder<>(toolChest)
+                    .create(
+                        queryRunnerFactory.mergeRunners(
 
 Review comment:
   In addition to @gianm's comment, I wonder whether the same goal could be achieved by using the
brokers' existing processing threads instead of adding a new ForkJoinPool. The broker already has
its own processing thread pool, but it's not being used. I think this might be better because we
could avoid adding a new configuration for the ForkJoinPool, which would look similar to the
configuration for the existing processing thread pool.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services


