Date: Fri, 13 Jan 2017 09:26:26 +0000 (UTC)
From: "huxi (JIRA)"
To: dev@kafka.apache.org
Subject: [jira] [Commented] (KAFKA-4616) Message loss is seen when kafka-producer-perf-test.sh is running and any broker restarted in middle

    [ https://issues.apache.org/jira/browse/KAFKA-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821473#comment-15821473 ]

huxi commented on KAFKA-4616:
-----------------------------

They are
not missing; they were just never delivered to Kafka successfully.

bq. The guarantee that Kafka offers is that a committed message will not be lost, as long as there is at least one in sync replica alive, at all times.

To avoid this "data loss", try appending "retries=" to the command, although you might then see some duplicated messages.

> Message loss is seen when kafka-producer-perf-test.sh is running and any broker restarted in middle
> ---------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4616
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4616
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.0.0
>         Environment: Apache mesos
>            Reporter: sandeep kumar singh
>
> If any broker is restarted while the kafka-producer-perf-test.sh command is running, we see message loss.
> Commands I run:
> **perf command:
> $ bin/kafka-producer-perf-test.sh --num-records 100000 --record-size 4096 --throughput 1000 --topic test3R3P3 --producer-props bootstrap.servers=x.x.x.x:xxxx,x.x.x.x:xxxx,x.x.x.x:xxxx
> I am sending 100000 messages, each of size 4096.
> Error thrown by the perf command:
> 4944 records sent, 988.6 records/sec (3.86 MB/sec), 31.5 ms avg latency, 433.0 max latency.
> 5061 records sent, 1012.0 records/sec (3.95 MB/sec), 67.7 ms avg latency, 798.0 max latency.
> 5001 records sent, 1000.0 records/sec (3.91 MB/sec), 49.0 ms avg latency, 503.0 max latency.
> 5001 records sent, 1000.2 records/sec (3.91 MB/sec), 37.3 ms avg latency, 594.0 max latency.
> 5001 records sent, 1000.2 records/sec (3.91 MB/sec), 32.6 ms avg latency, 501.0 max latency.
> 5000 records sent, 999.8 records/sec (3.91 MB/sec), 49.4 ms avg latency, 516.0 max latency.
> org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
> org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
> org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
> ....truncated
> 5001 records sent, 1000.2 records/sec (3.91 MB/sec), 33.9 ms avg latency, 497.0 max latency.
> 4928 records sent, 985.6 records/sec (3.85 MB/sec), 42.1 ms avg latency, 521.0 max latency.
> 5073 records sent, 1014.4 records/sec (3.96 MB/sec), 39.4 ms avg latency, 418.0 max latency.
> 100000 records sent, 999.950002 records/sec (3.91 MB/sec), 37.65 ms avg latency, 798.00 ms max latency, 1 ms 50th, 260 ms 95th, 411 ms 99th, 571 ms 99.9th.
> **consumer command:
> $ bin/kafka-console-consumer.sh --zookeeper x.x.x.x:2181/dcos-service-kafka-framework --topic test3R3P3 1>~/kafka_output.log
> Messages stored:
> $ wc -l ~/kafka_output.log
> 99932 /home/montana/kafka_output.log
> I found that only 99932 messages are stored and 68 messages are lost.
> **topic describe command:
> $ bin/kafka-topics.sh --zookeeper x.x.x.x:2181/dcos-service-kafka-framework --describe |grep test3R3
> Topic:test3R3P3   PartitionCount:3   ReplicationFactor:3   Configs:
>     Topic: test3R3P3   Partition: 0   Leader: 2   Replicas: 1,2,0   Isr: 2,0,1
>     Topic: test3R3P3   Partition: 1   Leader: 2   Replicas: 2,0,1   Isr: 2,0,1
>     Topic: test3R3P3   Partition: 2   Leader: 0   Replicas: 0,1,2   Isr: 2,0,1
> **consumer group command:
> $ bin/kafka-consumer-groups.sh --zookeeper x.x.x.x:2181/dcos-service-kafka-framework --describe --group console-consumer-9926
> GROUP                  TOPIC      PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  OWNER
> console-consumer-9926  test3R3P3  0          33265           33265           0    console-consumer-9926_node-44a8422fe1a0-1484127474935-c795478e-0
> console-consumer-9926  test3R3P3  1          33334           33334           0    console-consumer-9926_node-44a8422fe1a0-1484127474935-c795478e-0
> console-consumer-9926  test3R3P3  2          33333           33333           0    console-consumer-9926_node-44a8422fe1a0-1484127474935-c795478e-0
> Could you please help me understand what this error means: "err - org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received."?
> Could you please provide a suggestion to fix this issue?
> We see this behavior every time we perform the above test scenario.
> My understanding is that there should not be any data loss as long as n-1 brokers are alive. Is message loss an expected behavior in the above case?
> thanks
> Sandeep

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
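
As a quick cross-check of the figures quoted in the report, the per-partition LOG-END-OFFSET values can be summed and compared with the consumer's line count. The sketch below only redoes that arithmetic; the variable names are illustrative, and it assumes every partition of test3R3P3 started at offset 0, so that each end offset equals the number of messages stored in that partition.

```shell
# Figures copied from the report above; variable names are illustrative only.
sent=100000                      # --num-records given to kafka-producer-perf-test.sh
consumed=99932                   # line count reported by `wc -l ~/kafka_output.log`
end_offsets="33265 33334 33333"  # LOG-END-OFFSET of partitions 0, 1 and 2

# Assumption: each partition began at offset 0, so the end offsets add up
# to the number of messages the brokers actually stored.
stored=0
for offset in $end_offsets; do
  stored=$((stored + offset))
done

echo "stored on brokers: $stored"
echo "read by consumer:  $consumed"
echo "never stored:      $((sent - stored))"
```

Under that assumption the stored total equals what the consumer read (LAG is 0 on every partition), so the gap of 68 sits between the producer and the brokers, which is consistent with the comment that those messages were never delivered successfully.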