kafka-jira mailing list archives

From "Ismael Juma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4237) Avoid long request timeout for the consumer
Date Fri, 07 Jul 2017 14:31:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078165#comment-16078165 ]

Ismael Juma commented on KAFKA-4237:
------------------------------------

Also see KAFKA-5570, which is about doing option 1 without changing the default consumer
request timeout.

> Avoid long request timeout for the consumer
> -------------------------------------------
>
>                 Key: KAFKA-4237
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4237
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>            Reporter: Jason Gustafson
>
> In the consumer rebalance protocol, the JoinGroup can stay in purgatory on the server
> for as long as the rebalance timeout. For the Java client, that means that the request timeout
> must be at least as large as the rebalance timeout (which is governed by {{max.poll.interval.ms}}
> since KIP-62 and {{session.timeout.ms}} before then). By default, since 0.10.1, this is 5
> minutes plus some change, which makes the clients slow to detect some hard failures.
> To fix this, two options come to mind:
> 1. Right now, all request APIs are limited by the same request timeout in {{NetworkClient}},
> but there's not really any reason why this must be so. We could use a separate timeout for
> the JoinGroup request (the implementation of this is straightforward: https://github.com/confluentinc/kafka/pull/108/files).
> 2. Alternatively, we could prevent the server from holding the JoinGroup in purgatory
> for such a long time. Instead, it could return early from the JoinGroup (say before the session
> timeout has expired) with an error code (e.g. REBALANCE_IN_PROGRESS), which tells the client
> that it should just resend the JoinGroup.
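
A minimal sketch of the idea behind option 1, assuming a client that looks up a timeout per request type instead of applying one global value. This is not the code from the linked pull request and not Kafka's actual {{NetworkClient}}; the class {{PerRequestTimeouts}}, the {{ApiKey}} enum, the 30-second default and the 5-second buffer are all hypothetical, chosen only for illustration. The 5-minute rebalance timeout matches the default mentioned in the description.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration only: not part of the Kafka client API.
class PerRequestTimeouts {
    // Hypothetical subset of request types, for illustration.
    enum ApiKey { FETCH, HEARTBEAT, JOIN_GROUP }

    private final int defaultRequestTimeoutMs;
    private final Map<ApiKey, Integer> overrides = new EnumMap<>(ApiKey.class);

    PerRequestTimeouts(int defaultRequestTimeoutMs) {
        this.defaultRequestTimeoutMs = defaultRequestTimeoutMs;
    }

    // Register a longer (or shorter) timeout for one request type.
    void override(ApiKey key, int timeoutMs) {
        overrides.put(key, timeoutMs);
    }

    // Return the override if one exists, otherwise the shared default.
    int timeoutFor(ApiKey key) {
        return overrides.getOrDefault(key, defaultRequestTimeoutMs);
    }

    public static void main(String[] args) {
        int rebalanceTimeoutMs = 300_000; // 5 minutes, as in the issue description
        int bufferMs = 5_000;             // assumed slack so the client outlasts the broker
        int shortTimeoutMs = 30_000;      // assumed short timeout for ordinary requests

        PerRequestTimeouts timeouts = new PerRequestTimeouts(shortTimeoutMs);
        // Only JoinGroup needs to wait as long as the rebalance can take.
        timeouts.override(ApiKey.JOIN_GROUP, rebalanceTimeoutMs + bufferMs);

        for (ApiKey key : ApiKey.values()) {
            System.out.println(key + " -> " + timeouts.timeoutFor(key) + " ms");
        }
    }
}
```

With a lookup like this, only the JoinGroup request has to tolerate the full rebalance timeout, while fetches and heartbeats can fail fast on a hard failure.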



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
