kafka-jira mailing list archives

From "Arpan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
Date Tue, 22 Aug 2017 13:11:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136769#comment-16136769
] 

Arpan edited comment on KAFKA-5153 at 8/22/17 1:10 PM:
-------------------------------------------------------

Hi [~arthurk] - We have not found a solution yet. We are also stuck, and the behavior is
quite strange.

You may also want to have a look at KAFKA-2729 (https://issues.apache.org/jira/browse/KAFKA-2729).
It looks similar to the problem we are facing.

Regards,
Arpan Khagram
+91 8308993200


was (Author: arpan.khagram0212@gmail.com):
Hi [~arthurk] - Not sure yet what is the solution and we are also stuck and it is quite strange
as well.

You may also want to have a look at https://issues.apache.org/jira/browse/KAFKA-2729 once.
This looks to be similar to the problem we are facing.

Regards,
Arpan Khagram
+91 8308993200

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-5153
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5153
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.2.0, 0.11.0.0
>         Environment: RHEL 6
> Java Version  1.8.0_91-b14
>            Reporter: Arpan
>            Priority: Critical
>         Attachments: server_1_72server.log, server_2_73_server.log, server_3_74Server.log,
server.properties, ThreadDump_1493564142.dump, ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing is similar.
I also searched the release docs for a reference to it but did not find anything for 0.10.1.1
or 0.10.2.0. I am currently using 2.11_0.10.2.0.
> I have a 3-node Kafka cluster and a ZK cluster running in cluster mode on the same set of
servers. Around 240 GB of data is transferred through Kafka every day.
What we are observing is that a server disconnects from the cluster, the ISR shrinks, and
this starts impacting service.
> I have also observed the file descriptor count increasing a bit. Under normal circumstances
we have not observed an FD count above 500, but when the issue started it was in
the range of 650-700 on all 3 servers. I am attaching thread dumps from all 3 servers, taken
when we started facing the issue recently.
> The issue goes away once you bounce the nodes, but the setup does not run for more than
5 days without the issue recurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching server.properties
as well for one of the servers (it is similar on all 3 servers).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
