kafka-jira mailing list archives

From "Apurva Mehta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5621) The producer should retry expired batches when retries are enabled
Date Wed, 26 Jul 2017 21:27:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102313#comment-16102313
] 

Apurva Mehta commented on KAFKA-5621:
-------------------------------------

I think the core dichotomy is that we have mirror-maker-like use cases and application use
cases.
 
In the mirror maker use case, each partition is truly independent. If a subset of partitions
is down, we still want to process the rest. So we want to expire batches and raise errors
to the application (mirror maker in this case) as soon as possible. 

On the other hand, for an application, partitions are not really independent (and especially
so if you use transactions). If one partition is down, it makes sense to wait for it to be
ready before continuing. So we would want to handle as many errors internally as possible.
It would mean blocking sends once the queue is too large and not expiring batches in the queue.
This simplifies the application programming model. 

I think we should optimize the defaults for applications, while still enabling tools like mirror
maker to get the desired behavior by setting the right configs.
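
To make the two profiles concrete, here is a minimal sketch of how they might look as producer
properties. The keys are existing producer config names, but the values are placeholders and
the class and method names are hypothetical. Which setting actually governs batch expiry
depends on the client version, so this only approximates the behaviors described above.

```java
import java.util.Properties;

/** Illustrative only: ProducerProfiles and its methods are hypothetical helper names. */
final class ProducerProfiles {

    /** Application profile: absorb as many errors as possible inside the producer. */
    static Properties forApplication() {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        // Retry internally rather than surfacing transient errors (placeholder value).
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Let send() block for a long time when the buffer is full (placeholder value).
        props.setProperty("max.block.ms", String.valueOf(Long.MAX_VALUE));
        return props;
    }

    /** Mirror-maker-like profile: fail fast so healthy partitions keep flowing. */
    static Properties forMirroring() {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "false");
        props.setProperty("retries", "0");
        // Give up quickly instead of blocking the caller (placeholder value).
        props.setProperty("max.block.ms", "1000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println("application: " + forApplication());
        System.out.println("mirroring:   " + forMirroring());
    }
}
```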

Assuming that we complete [KAFKA-5494], we could apply retries to expired batches only
when the idempotent producer is enabled. This way the default behavior is the simplest one
for the application. 

KMM and other such tools could continue to use the producer without idempotence enabled and
keep the existing behavior. Of course, we get into the same quandary if KMM wants to enable
idempotence, but this is the best compromise without introducing an additional config. 

Another option is to introduce the 'queue.time.ms' config. The default would be infinite.
When it is specified, we would not retry expired batches regardless of whether idempotence
is enabled. So KMM-like tooling could specify a value and most application developers could
ignore it. 
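
The two options can be compared as decision rules. This is a sketch under my own reading of
the proposals, not producer code: the class, the method names, and the modelling of an unset
'queue.time.ms' as an empty OptionalLong are all hypothetical.

```java
import java.util.OptionalLong;

/** Hypothetical sketch; none of these names exist in the Kafka producer. */
final class ExpiredBatchPolicy {

    /**
     * Option 1: retry an expired batch only when the idempotent producer is
     * enabled and fewer than the configured number of retries have been used.
     */
    static boolean retryWithIdempotenceRule(boolean idempotenceEnabled,
                                            int attemptsSoFar, int retries) {
        return idempotenceEnabled && attemptsSoFar < retries;
    }

    /**
     * Option 2: an explicitly configured queue.time.ms turns off retries of
     * expired batches regardless of idempotence. When it is unset (the
     * infinite default), this sketch assumes retries apply as usual.
     */
    static boolean retryWithQueueTimeRule(OptionalLong queueTimeMs,
                                          int attemptsSoFar, int retries) {
        if (queueTimeMs.isPresent()) {
            return false; // KMM-like tooling: surface the expiry immediately
        }
        return attemptsSoFar < retries;
    }

    public static void main(String[] args) {
        System.out.println(retryWithIdempotenceRule(true, 0, 3));
        System.out.println(retryWithQueueTimeRule(OptionalLong.of(30_000L), 0, 3));
        System.out.println(retryWithQueueTimeRule(OptionalLong.empty(), 0, 3));
    }
}
```

Under option 1 the behavior is tied to an existing switch, which is why it needs no new
config; under option 2 the switch is independent of idempotence.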

I am not a fan of introducing new configs for a very narrow use case though, so I will continue
to think of more alternatives.

> The producer should retry expired batches when retries are enabled
> ------------------------------------------------------------------
>
>                 Key: KAFKA-5621
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5621
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>             Fix For: 1.0.0
>
>
> Today, when a batch is expired in the accumulator, a {{TimeoutException}} is raised to
> the user.
> It might be better for the producer to retry the expired batch up to the configured
> number of retries instead. This is more intuitive from the user's point of view. 
> Further, the proposed behavior makes it easier for applications like mirror maker to provide
> ordering guarantees even when batches expire. Today, they would resend the expired batch and
> it would get added to the back of the queue, causing the output ordering to be different from
> the input ordering.
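
The ordering problem in the issue description can be illustrated with a toy simulation. It
models the queue of pending batches as a plain deque of labels; the class and method names
are hypothetical and nothing here touches the real accumulator.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of batch ordering; not the producer's actual data structures. */
final class ExpiredBatchOrdering {

    /** Today: the expired head batch is failed to the caller, who resends it to the back. */
    static List<String> resendByApplication(List<String> queued) {
        Deque<String> queue = new ArrayDeque<>(queued);
        String expired = queue.pollFirst();
        if (expired != null) {
            queue.addLast(expired); // the resend is appended behind newer batches
        }
        return new ArrayList<>(queue);
    }

    /** Proposed: the producer retries the expired batch itself, keeping its position. */
    static List<String> retryInProducer(List<String> queued) {
        Deque<String> queue = new ArrayDeque<>(queued);
        String expired = queue.pollFirst();
        if (expired != null) {
            queue.addFirst(expired); // the retry stays at the head of the queue
        }
        return new ArrayList<>(queue);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("batch-1", "batch-2", "batch-3");
        System.out.println(resendByApplication(input)); // [batch-2, batch-3, batch-1]
        System.out.println(retryInProducer(input));     // [batch-1, batch-2, batch-3]
    }
}
```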



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
