kafka-jira mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [kafka] ncliang commented on a change in pull request #10563: KAFKA-12487: Add support for cooperative consumer protocol with sink connectors
Date Thu, 29 Apr 2021 09:31:18 GMT

ncliang commented on a change in pull request #10563:
URL: https://github.com/apache/kafka/pull/10563#discussion_r622851011

File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
@@ -631,13 +648,31 @@ private void rewind() {
     private void openPartitions(Collection<TopicPartition> partitions) {
-        sinkTaskMetricsGroup.recordPartitionCount(partitions.size());
+        updatePartitionCount();
-    private void closePartitions() {
-        commitOffsets(time.milliseconds(), true);
-        sinkTaskMetricsGroup.recordPartitionCount(0);
+    private void closeAllPartitions() {
+        closePartitions(currentOffsets.keySet(), false);
+    }
+    private void closePartitions(Collection<TopicPartition> topicPartitions, boolean lost) {
+        if (!lost) {
+            commitOffsets(time.milliseconds(), true, topicPartitions);
+        } else {
+            log.trace("{} Closing the task as partitions have been lost: {}", this, topicPartitions);
+            task.close(topicPartitions);
+            if (workerErrantRecordReporter != null) {
+                log.trace("Cancelling reported errors for {}", topicPartitions);
+                workerErrantRecordReporter.cancelFutures(topicPartitions);

Review comment:
       I'm not sure that cancelling the outstanding error-reporting futures is the right
thing to do here. Would it make sense to wait a bounded amount of time for them to complete
before giving up and cancelling?
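
To make the suggestion concrete, here is a minimal, self-contained sketch of "wait up to a deadline, then cancel whatever is still pending". It is not the `WorkerErrantRecordReporter` code: the class `AwaitThenCancel`, the method `awaitThenCancel`, and the 50 ms timeout are hypothetical names and values chosen for illustration, and plain `java.util.concurrent` futures stand in for the reporter's futures.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Kafka Connect.
public class AwaitThenCancel {

    /**
     * Waits up to timeoutMs in total (not per future) for the given futures,
     * then cancels any that are still pending.
     *
     * @return the number of futures that had to be cancelled
     */
    public static int awaitThenCancel(List<? extends Future<?>> futures, long timeoutMs) {
        long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        int cancelled = 0;
        for (Future<?> future : futures) {
            // Share one deadline across all futures so the total wait stays bounded.
            long remainingNanos = Math.max(0L, deadlineNanos - System.nanoTime());
            try {
                future.get(remainingNanos, TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                // Still pending at the deadline: give up on this one.
                if (future.cancel(true)) {
                    cancelled++;
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting on this future.
                Thread.currentThread().interrupt();
                if (future.cancel(true)) {
                    cancelled++;
                }
            } catch (ExecutionException | CancellationException e) {
                // The future already finished (failed or cancelled); nothing left to wait for.
            }
        }
        return cancelled;
    }

    public static void main(String[] args) {
        CompletableFuture<Void> done = CompletableFuture.completedFuture(null);
        CompletableFuture<Void> pending = new CompletableFuture<>(); // never completed
        int cancelled = awaitThenCancel(Arrays.asList(done, pending), 50);
        System.out.println("cancelled=" + cancelled
                + ", pending.isCancelled()=" + pending.isCancelled());
    }
}
```

Using a single shared deadline keeps the worst-case delay in the partition-loss path at roughly the timeout, regardless of how many reports are outstanding.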

File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
@@ -680,13 +717,13 @@ public void onPartitionsAssigned(Collection<TopicPartition> partitions)
-            // If we paused everything for redelivery (which is no longer relevant since we discarded the data), make
+            // If we paused everything for redelivery and all partitions for the failed deliveries have been revoked, make
             // sure anything we paused that the task didn't request to be paused *and* which we still own is resumed.
             // Also make sure our tracking of paused partitions is updated to remove any partitions we no longer own.
-            pausedForRedelivery = false;
+            pausedForRedelivery = pausedForRedelivery && !messageBatch.isEmpty();

Review comment:
       I'm not sure this change is required. As I read the current implementation, the block
below makes sure the paused partitions contain only assigned partitions and sets them on the
context. We then rely on this branch in `iteration()` to resume partitions that should not
be paused:
               } else if (!pausedForRedelivery) {
   Setting the flag to anything other than false means we do not resume partitions that we
own and that were not explicitly requested to be paused.
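
The concern can be illustrated with a deliberately simplified model of that branch. This is not the `WorkerSinkTask` implementation: the class `ResumeAfterRebalance`, the method `partitionsToResume`, and the use of plain strings for partitions are assumptions made only to show how the flag gates resumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model; partitions are plain strings here.
public class ResumeAfterRebalance {

    /**
     * Models the `else if (!pausedForRedelivery)` branch: owned partitions that the
     * task did not ask to pause are resumed only when the flag is false.
     */
    public static Set<String> partitionsToResume(Set<String> assigned,
                                                 Set<String> pausedByTask,
                                                 boolean pausedForRedelivery) {
        if (pausedForRedelivery) {
            // Everything stays paused while a redelivery is believed to be pending.
            return Collections.emptySet();
        }
        Set<String> resume = new TreeSet<>(assigned);
        resume.removeAll(pausedByTask);
        return resume;
    }

    public static void main(String[] args) {
        Set<String> assigned = new HashSet<>(Arrays.asList("t-0", "t-1"));
        Set<String> pausedByTask = Collections.singleton("t-1");

        // Flag reset to false after the rebalance: t-0 is resumed.
        System.out.println(partitionsToResume(assigned, pausedByTask, false));
        // Flag left true: nothing is resumed, even though the task never paused t-0.
        System.out.println(partitionsToResume(assigned, pausedByTask, true));
    }
}
```

In this model, leaving the flag true after a rebalance keeps `t-0` paused even though the task never requested it, which is the behaviour the comment above is worried about.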

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
