kafka-dev mailing list archives

From "Joseph Aliase (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak
Date Thu, 06 Apr 2017 17:49:41 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959426#comment-15959426
] 

Joseph Aliase commented on KAFKA-5007:
--------------------------------------

[~junrao] [~ijuma] I repeated the test on a 0.8 Kafka cluster and did not see the open socket
issue there.
In addition, I have reproduced the issue on 0.10.0.0 and 0.10.2.0.

Downgrading to 0.8 is not an option for us.



> Kafka Replica Fetcher Thread- Resource Leak
> -------------------------------------------
>
>                 Key: KAFKA-5007
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5007
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, network
>    Affects Versions: 0.10.1.1
>         Environment: CentOS 7
> Java 8
>            Reporter: Joseph Aliase
>            Priority: Critical
>              Labels: reliability
>
> Kafka runs out of open file descriptors when the system network interface is down.
> Issue description:
> We have a Kafka cluster of 5 nodes running version 0.10.1.1. The open file descriptor
> limit for the account running Kafka is set to 100000.
> During an upgrade, the network interface went down. The outage continued for 12 hours, and
> eventually all the brokers crashed with a java.io.IOException: Too many open files error.
> We repeated the test in a lower environment and observed that the open socket count keeps
> increasing while the NIC is down.
> We have around 13 topics with a max partition size of 120, and the number of replica fetcher
> threads is set to 8.
> Using an internal monitoring tool, we observed that the open socket descriptor count for the
> broker pid continued to increase although the NIC was down, leading to the open file
> descriptor error.
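
The internal monitoring tool mentioned above is not described in the report. As a rough stand-in, the following minimal sketch counts the open descriptors and open sockets of a process by reading /proc/<pid>/fd. It assumes a Linux host (such as the CentOS 7 environment listed above) and sufficient permissions to read the broker's /proc entry; the function name count_open_fds is hypothetical and is not part of Kafka.

```python
import os
import sys


def count_open_fds(pid):
    """Return (total, sockets) for pid by reading /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink; sockets resolve to "socket:[inode]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        total += 1
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    # Pass the broker pid as the first argument; defaults to this process.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    total, sockets = count_open_fds(pid)
    print(f"pid={pid} open_fds={total} open_sockets={sockets}")
```

Running this periodically against the broker pid while the NIC is down should show whether the socket count keeps climbing toward the configured descriptor limit.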



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
