kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1286) Retry Can Block
Date Tue, 04 Mar 2014 19:05:23 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919775#comment-13919775 ]

Guozhang Wang commented on KAFKA-1286:
--------------------------------------

Updated reviewboard https://reviews.apache.org/r/18740/ against branch origin/trunk

> Retry Can Block 
> ----------------
>
>                 Key: KAFKA-1286
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1286
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: producer 
>            Reporter: Guozhang Wang
>         Attachments: KAFKA-1286.patch, KAFKA-1286_2014-03-04_11:04:32.patch
>
>
> Under the following scenario, the retry logic can block:
> 1. The last broker's socket is closed; sender.handleDisconnect() is triggered and marks the node as disconnected.
> 2. In the next sender.run(), since the node is disconnected, the partition is removed from the ready set and sender.initConnection() is called, which does not throw an exception.
> 3. So in this round of send, the only request it tries to send is the metadata request to the last broker, and the sender will first try to connect to that broker.
> 4. In selector.poll(), the finishConnect() call throws an exception, and in handleDisconnects() the in-flight request's batches will be null since it is a metadata request.
> 5. Now we go back to step 1 and loop forever. Note that this infinite loop can be triggered even without calling producer.close().
> Also, we need to introduce a retry backoff config; otherwise the retries are exhausted too quickly (in my tests, 10 retries were exhausted in about 600ms).
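The backoff point above can be illustrated with a small sketch. This is not Kafka's actual Sender code; the class, method name, and the ~60ms per-attempt figure (derived from 10 retries in ~600ms) are assumptions made for illustration. It shows why a retry.backoff.ms-style pause matters: without it, all retries burn through in well under a second, too fast for a broker to come back.

```java
// Minimal illustration (hypothetical names, not Kafka internals):
// estimate how long a retry sequence lasts with and without a backoff.
public class RetryBackoffSketch {

    // Total wall-clock window covered by `retries` attempts, where each
    // attempt takes roughly perAttemptMs and a fixed backoffMs pause is
    // inserted between consecutive attempts.
    static long totalRetryWindowMs(int retries, long perAttemptMs, long backoffMs) {
        return retries * perAttemptMs + (retries - 1) * backoffMs;
    }

    public static void main(String[] args) {
        // No backoff: 10 attempts at ~60ms each -> ~600ms, matching the
        // observation in the report above.
        System.out.println(totalRetryWindowMs(10, 60, 0));   // 600
        // A 100ms backoff stretches the same 10 retries to ~1500ms,
        // giving a briefly-unavailable broker time to recover.
        System.out.println(totalRetryWindowMs(10, 60, 100)); // 1500
    }
}
```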



--
This message was sent by Atlassian JIRA
(v6.2#6252)
