kafka-dev mailing list archives

From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3335) Kafka Connect hangs in shutdown hook
Date Tue, 31 May 2016 01:58:13 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307077#comment-15307077 ]

Ewen Cheslack-Postava commented on KAFKA-3335:

[~shikhar] The shutdown hook is added at the beginning to make sure we clean up even if something
happens during startup -- any services that did get started up should be properly cleaned up.

I think a relevant piece of info that was missing is that it looks like this was against the
0.9 releases (or a version of trunk after 0.9 and before 0.10) and the code has since been
cleaned up a bit. The startLatch countDown() wasn't previously in a finally block, which explains
why it was never triggered. Since that's fixed, it won't block the subsequent stop() call. I've
validated this by manually triggering an exception in both the 0.9 code and the trunk code, and
the issue is only reproduced in the old release.
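
To illustrate the pattern described above, here is a minimal sketch, assuming the latch is a java.util.concurrent.CountDownLatch. The Service class and its methods are hypothetical stand-ins for illustration, not the actual Connect code. Because the latch is released in a finally block, a stop() call made after a failed start() does not wait forever.

```java
import java.util.concurrent.CountDownLatch;

public class LatchInFinallyDemo {
    // Hypothetical stand-in for a service with a start latch; not the real Connect class.
    static class Service {
        private final CountDownLatch startLatch = new CountDownLatch(1);
        private final boolean failOnStart;
        volatile boolean stopped = false;

        Service(boolean failOnStart) {
            this.failOnStart = failOnStart;
        }

        void start() {
            try {
                if (failOnStart) {
                    throw new IllegalStateException("simulated startup failure");
                }
            } finally {
                // Released even when startup throws, so stop() cannot block indefinitely.
                startLatch.countDown();
            }
        }

        void stop() {
            try {
                // Wait until start() has finished, successfully or not.
                startLatch.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            stopped = true;
        }
    }

    // Mimics what a shutdown hook would do after a failed start; returns whether stop() completed.
    static boolean stopCompletesAfterFailedStart() {
        Service service = new Service(true);
        try {
            service.start();
        } catch (IllegalStateException e) {
            // Startup failed; the latch was still released by the finally block.
        }
        service.stop();
        return service.stopped;
    }

    public static void main(String[] args) {
        System.out.println("stop() completed: " + stopCompletesAfterFailedStart());
    }
}
```

If the countDown() call were moved out of the finally block and placed after the throwing code, stop() in this sketch would block on await(), which matches the hang reported against the old release.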

> Kafka Connect hangs in shutdown hook
> ------------------------------------
>                 Key: KAFKA-3335
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3335
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions:
>            Reporter: Ben Kirwin
> The `Connect` class can run into issues during start, such as:
> {noformat}
> Exception in thread "main" org.apache.kafka.connect.errors.ConnectException: Could not
> look up partition metadata for offset backing store topic in allotted period. This could indicate
> a connectivity issue, unavailable topic partitions, or if this is your first use of the topic
> it may have taken too long to create.
>         at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:130)
>         at org.apache.kafka.connect.storage.KafkaOffsetBackingStore.start(KafkaOffsetBackingStore.java:85)
>         at org.apache.kafka.connect.runtime.Worker.start(Worker.java:108)
>         at org.apache.kafka.connect.runtime.Connect.start(Connect.java:56)
>         at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:62)
> {noformat}
> This exception halts the startup process. It also triggers the shutdown hook... which
> blocks waiting for the service to start up before calling stop. This causes the process to
> hang forever.
> There are a few things that could be done here, but it would be nice to bound the amount
> of time the process spends trying to exit gracefully.
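
One way to bound the wait suggested in the description is the timed variant of CountDownLatch.await, which returns false if the timeout elapses first. The sketch below is only an illustration under that assumption: the awaitStartup helper, the 500 ms timeout, and the log messages are hypothetical and do not come from the Connect code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownDemo {
    // Hypothetical helper: waits for startup at most timeoutMs milliseconds.
    // Returns true if the latch was released in time, false on timeout or interruption.
    static boolean awaitStartup(CountDownLatch startLatch, long timeoutMs) {
        try {
            return startLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates a startup that never completes: the latch is never counted down.
        CountDownLatch neverReleased = new CountDownLatch(1);

        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            boolean started = awaitStartup(neverReleased, 500);
            System.err.println(started
                    ? "startup finished; stopping"
                    : "startup did not finish in time; stopping anyway");
        }));
        // main returns here, the JVM runs the hook, and the process exits after the bounded wait.
    }
}
```

With an untimed await() in the hook, the same program would never exit, since nothing releases the latch.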

This message was sent by Atlassian JIRA
