From: Jason Gustafson
Date: Wed, 31 Aug 2016 19:26:22 -0700
Subject: Re: Kafka consumers unable to process message
To: users@kafka.apache.org
Cc: dev@kafka.apache.org

Hi Achintya,

Just to clarify, you did not take down either of the zookeepers in this test, right? Having only two zookeepers in the ensemble would mean that if either one of them failed, zookeeper wouldn't be able to reach quorum.

I'm not entirely sure why this would happen. One possibility is that the consumer is failing to find the new coordinator, which could happen if all the replicas for one of the __consumer_offsets partitions were located in the "failed" datacenter. Perhaps you can enable DEBUG logging and post some logs so we can see what it's actually doing during poll().

By the way, I noticed that your consumer configuration settings seem a little mixed up. The new consumer doesn't actually communicate with Zookeeper, so there's no need for those settings. You also don't need the "offsets.storage" option, since Kafka is the only choice, and I don't think "consumer.timeout.ms" is a valid option.
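For illustration, a trimmed-down version of the consumer settings quoted below might look something like this (same broker list, group id, and deserializers; the zookeeper.*, offsets.storage, and consumer.timeout.ms entries are simply dropped):

// A minimal sketch -- broker list and deserializer classes copied from the settings quoted below.
Properties props = new Properties();
props.put("bootstrap.servers",
    "psaq1-wc.sys.comcast.net:61616,psaq2-wc.sys.comcast.net:61616,psaq3-wc.sys.comcast.net:61616,"
  + "psaq1-ab.sys.comcast.net:61617,psaq2-ab.sys.comcast.net:61617,psaq3-ab.sys.comcast.net:61617");
props.put("group.id", "app-consumer");
props.put("enable.auto.commit", "false");   // auto.commit.interval.ms only matters when auto commit is enabled
props.put("auto.offset.reset", "earliest");
props.put("session.timeout.ms", "120000");
props.put("request.timeout.ms", "150000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "com.comcast.ps.kafka.object.CustomMessageDeSer");
// No zookeeper.* settings, no offsets.storage, no consumer.timeout.ms -- the new consumer ignores them.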
-Jason

On Wed, Aug 31, 2016 at 6:43 PM, Ghosh, Achintya (Contractor) <Achintya_Ghosh@comcast.com> wrote:

> Hi Jason,
>
> Thanks for your response.
>
> I know that is a known issue, and I resolved it by calling the wakeup method from another thread. But my problem here is different; let me explain, it's very basic.
>
> I created one cluster with 6 nodes (3 from one datacenter and 3 from another, remote datacenter), kept a replication factor of 6, and used 2 zookeeper servers, one from each datacenter. Then I brought down all 3 nodes of my local datacenter and produced a few messages, and the producer works fine even though my local datacenter nodes are down: it successfully writes the messages to the other datacenter's nodes. But when I try to consume the messages, the consumer.poll method gets stuck because my local datacenter is down, even though the other datacenter's nodes are up.
>
> My question is: since the data has been written successfully to the other datacenter, why is the consumer side not working?
>
> Here are my producer settings:
>
> props.put("bootstrap.servers", "psaq1-wc.sys.comcast.net:61616,psaq2-wc.sys.comcast.net:61616,psaq3-wc.sys.comcast.net:61616,psaq1-ab.sys.comcast.net:61617,psaq2-ab.sys.comcast.net:61617,psaq3-ab.sys.comcast.net:61617");
> props.put("acks", "1");
> props.put("max.block.ms", 1000);
> props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
> props.put("value.serializer", "com.comcast.ps.kafka.object.CustomMessageSer");
>
> And here are my consumer settings:
>
> props.put("group.id", "app-consumer");
> props.put("enable.auto.commit", "false");
> props.put("auto.offset.reset", "earliest");
> props.put("auto.commit.interval.ms", "500");
> props.put("session.timeout.ms", "120000");
> props.put("consumer.timeout.ms", "10000");
> props.put("zookeeper.session.timeout.ms", "120000");
> props.put("zookeeper.connection.timeout.ms", "60000");
> props.put("offsets.storage", "kafka");
> props.put("request.timeout.ms", "150000");
> props.put("bootstrap.servers", "psaq1-wc.sys.comcast.net:61616,psaq2-wc.sys.comcast.net:61616,psaq3-wc.sys.comcast.net:61616,psaq1-ab.sys.comcast.net:61617,psaq2-ab.sys.comcast.net:61617,psaq3-ab.sys.comcast.net:61617");
> props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
> props.put("value.deserializer", "com.comcast.ps.kafka.object.CustomMessageDeSer");
>
> Is it because the consumer is not able to get the broker metadata when it tries to connect to the other datacenter's zookeeper server? I tried increasing the zookeeper session timeout and connection timeout, but no luck.
>
> Please help with this.
> Thanks
> Achintya
>
>
> -----Original Message-----
> From: Jason Gustafson [mailto:jason@confluent.io]
> Sent: Wednesday, August 31, 2016 4:05 PM
> To: users@kafka.apache.org
> Cc: dev@kafka.apache.org
> Subject: Re: Kafka consumers unable to process message
>
> Hi Achintya,
>
> We have a JIRA for this problem: https://issues.apache.org/jira/browse/KAFKA-3834. Do you expect the client to raise an exception in this case, or do you just want to keep it from blocking indefinitely? If the latter, you could escape the poll from another thread using wakeup() [a sketch of that pattern follows at the end of this message].
>
> Thanks,
> Jason
>
> On Wed, Aug 31, 2016 at 12:11 PM, Ghosh, Achintya (Contractor) <Achintya_Ghosh@comcast.com> wrote:
>
> > Hi there,
> >
> > The Kafka consumer gets stuck at the consumer.poll() method if my current
> > datacenter is down and the replicated messages are in the remote datacenter.
> >
> > How can I solve that issue?
> >
> > Thanks
> > Achintya
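For reference, here is a minimal sketch of the wakeup() pattern referred to above: a watchdog thread calls consumer.wakeup() if poll() has not returned within some deadline, which makes the blocked poll() throw WakeupException instead of hanging forever. The topic name, the 30-second deadline, and the watchdog bookkeeping are illustrative choices, not taken from this thread; "props" stands for the consumer settings shown earlier.

import java.util.Collections;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

// 'props' is the Properties object built from the consumer settings above;
// "my-topic" is a placeholder topic name.
KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("my-topic"));

AtomicLong lastReturn = new AtomicLong(System.currentTimeMillis());
ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
watchdog.scheduleAtFixedRate(() -> {
    // wakeup() is the one KafkaConsumer method that is safe to call from another
    // thread; it makes a blocked poll() throw WakeupException.
    if (System.currentTimeMillis() - lastReturn.get() > 30_000) {
        consumer.wakeup();
    }
}, 10, 10, TimeUnit.SECONDS);

try {
    while (true) {
        ConsumerRecords<String, Object> records = consumer.poll(1000);
        lastReturn.set(System.currentTimeMillis());
        for (ConsumerRecord<String, Object> record : records) {
            // process the record here
        }
        consumer.commitSync();
    }
} catch (WakeupException e) {
    // poll() was stuck past the deadline -- log it, alert, or retry instead of
    // blocking indefinitely.
} finally {
    watchdog.shutdownNow();
    consumer.close();
}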