Date: Wed, 1 Oct 2014 07:34:49 -0700
Subject: Re: LeaderNotAvailableException, although leader elected
From: Neha Narkhede
To: "users@kafka.apache.org"
Cc: "dev@kafka.apache.org"

Andras,

Thanks for your feedback!

> In my opinion programmatic message sending must work out of the box on the
> first try, without any exceptions, warnings or the need for additional
> configuration.
>
> I'd be glad to support/contribute.

I agree that the behavior of the producer for the first message on a topic
is awkward and I'd encourage feedback from you. We certainly are interested
in improving user experience. Could you please file a JIRA so we can discuss
alternatives there?

Thanks,
Neha

On Sat, Sep 27, 2014 at 12:58 AM, Andras Hatvani <andras.hatvani@andrashatvani.com> wrote:

> AFAIK not topics, but only partitions of topics have leaders. What
> controllers do you mean? I haven't read about such.
>
> Thanks for the explanation regarding the metadata request; in the meantime
> I found out that this is an expected (!) failure (
> http://qnalist.com/questions/4787268/kafka-common-leadernotavailableexception
> ).
>
> For me this isn't an acceptable way to communicate that the leader
> election is in progress.
> There is not a single hint to this fact, but only an exception.
> If this is an expected behavior, then it not only mustn't be an exception,
> but it also has to be communicated that there is something in progress.
> Furthermore, suggestions regarding changing the values of the variables I
> mentioned in my solution should be mandatory.
>
> This was my case:
> - OK, let's use Kafka
> - Create an infrastructure
> - Create a programmatic producer
> - Send a message
> - Message sending fails.
> - Retry
> - Message sending works!
> - Look for answers on the internet and in the docs
> - Read the configuration
> - Play around with configuration values.
>
> This is bad user experience, especially for a newbie, and involves a lot of
> effort.
>
> In my opinion programmatic message sending must work out of the box on the
> first try, without any exceptions, warnings or the need for additional
> configuration.
>
> I'd be glad to support/contribute.
>
> Regards,
> Andras
>
>
> On 26 Sep 2014, at 19:53, Joel Koshy wrote:
>
> >>> kafka2_1 | [2014-09-26 12:35:07,289] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
> >>> kafka2_1 | [2014-09-26 12:35:07,394] INFO New leader is 2 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> >
> > The above logs are for controller election, not leader election for
> > the topic you are producing to.
> >
> > When your producer starts it will issue a metadata request and it
> > should auto-create the topic (provided auto-create is on, which is the
> > default). The first metadata request for a non-existent topic always
> > returns LeaderNotAvailable because the controller then has to elect a
> > leader for the new topic.
> >
> > Joel
> >
> > On Fri, Sep 26, 2014 at 04:07:58PM +0200, Andras Hatvani wrote:
> >> And the solution was:
> >> Increase retry.backoff.ms from the default 100 ms to 1000 ms, so the output is:
> >>
> >> 11891 [main] INFO kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:192.168.59.103,port:9092 with correlation id 0 for 1 topic(s) Set(inputTopic)
> >> 11893 [main] INFO kafka.producer.SyncProducer - Connected to 192.168.59.103:9092 for producing
> >> 12045 [main] INFO kafka.producer.SyncProducer - Disconnecting from 192.168.59.103:9092
> >> 12062 [main] WARN kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic inputTopic ->
> >> No partition metadata for topic inputTopic due to kafka.common.LeaderNotAvailableException}] for topic [inputTopic]: class kafka.common.LeaderNotAvailableException
> >> 12066 [main] INFO kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:192.168.59.103,port:9092 with correlation id 1 for 1 topic(s) Set(inputTopic)
> >> 12067 [main] INFO kafka.producer.SyncProducer - Connected to 192.168.59.103:9092 for producing
> >> 12097 [main] INFO kafka.producer.SyncProducer - Disconnecting from 192.168.59.103:9092
> >> 12098 [main] WARN kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic inputTopic ->
> >> No partition metadata for topic inputTopic due to kafka.common.LeaderNotAvailableException}] for topic [inputTopic]: class kafka.common.LeaderNotAvailableException
> >> 12098 [main] ERROR kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: inputTopic
> >> 12099 [main] INFO kafka.producer.async.DefaultEventHandler - Back off for 1000 ms before retrying send. Remaining retries = 3
> >> 13104 [main] INFO kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:192.168.59.103,port:9092 with correlation id 2 for 1 topic(s) Set(inputTopic)
> >> 13111 [main] INFO kafka.producer.SyncProducer - Connected to 192.168.59.103:9092 for producing
> >> 13137 [main] INFO kafka.producer.SyncProducer - Disconnecting from 192.168.59.103:9092
> >> 13161 [main] INFO kafka.producer.SyncProducer - Connected to 192.168.59.103:9092 for producing
> >>
> >> In case this is not enough for you, you can try to change the values of
> >> - message.send.max.retries from the default 5 to e.g. 10 and
> >> - topic.metadata.refresh.interval.ms to 0
> >>
> >>
> >> It is still unclear why it takes 2 tries, i.e. 2 seconds, to fetch the topic metadata…
> >>
> >>
> >> Cheers,
> >> Andras
> >>
> >>> On 26 Sep 2014, at 14:43, Andras Hatvani <andras.hatvani@andrashatvani.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> What does the exception mean in the following context?
> >>>
> >>> I've got two Kafka brokers, out of which the second one will be elected as leader:
> >>>
> >>> kafka2_1 | [2014-09-26 12:35:07,289] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
> >>> kafka2_1 | [2014-09-26 12:35:07,394] INFO New leader is 2 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> >>>
> >>> Then, when I try to produce a message, I only get the following error:
> >>>
> >>> 23182 [main] INFO kafka.producer.async.DefaultEventHandler - Back off for 100 ms before retrying send. Remaining retries = 0
> >>> 23286 [main] INFO kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:192.168.59.103,port:9092 with correlation id 8 for 1 topic(s) Set(inputTopic)
> >>> 23287 [main] INFO kafka.producer.SyncProducer - Connected to 192.168.59.103:9092 for producing
> >>> 23300 [main] INFO kafka.producer.SyncProducer - Disconnecting from 192.168.59.103:9092
> >>> 23301 [main] WARN kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic inputTopic ->
> >>> No partition metadata for topic inputTopic due to kafka.common.LeaderNotAvailableException}] for topic [inputTopic]: class kafka.common.LeaderNotAvailableException
> >>> 23303 [main] ERROR kafka.producer.async.DefaultEventHandler - Failed to send requests for topics inputTopic with correlation ids in [0,8]
> >>>
> >>> The property metadata.broker.list contains the host:port values of both brokers.
> >>>
> >>> What can be wrong/missing?
> >>> Thanks,
> >>> Andras
> >>
> >
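
The retry behavior discussed in this thread can be sketched with a small simulation. This is not Kafka code: FakeCluster, send_with_retries, and the 500 ms election time are made-up assumptions for illustration. Only the property names retry.backoff.ms and message.send.max.retries, the 100 ms default, and the 1000 ms value come from the messages above. The sketch assumes that the first metadata request for a new topic triggers leader election and that requests fail with LeaderNotAvailable until the election has finished.

```python
class LeaderNotAvailable(Exception):
    """Stand-in for kafka.common.LeaderNotAvailableException (illustration only)."""


class FakeCluster:
    """Hypothetical cluster: a topic gets its leader `election_ms` after the first request."""

    def __init__(self, election_ms):
        self.election_ms = election_ms  # assumed duration of leader election
        self.now_ms = 0                 # simulated clock, no real sleeping
        self.created_at = None          # time of the topic-creating request

    def fetch_metadata(self, topic):
        if self.created_at is None:
            # First request auto-creates the topic and starts the election.
            self.created_at = self.now_ms
        if self.now_ms - self.created_at < self.election_ms:
            raise LeaderNotAvailable(topic)
        return {"topic": topic, "leader": 0}

    def sleep(self, ms):
        # Advance the simulated clock instead of blocking.
        self.now_ms += ms


def send_with_retries(cluster, topic, max_retries, backoff_ms):
    """Return the number of attempts used, or None if every attempt failed.

    max_retries plays the role of message.send.max.retries and
    backoff_ms the role of retry.backoff.ms.
    """
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        try:
            cluster.fetch_metadata(topic)
            return attempt
        except LeaderNotAvailable:
            if attempt <= max_retries:
                cluster.sleep(backoff_ms)
    return None
```

Under the assumed 500 ms election, send_with_retries(FakeCluster(500), "inputTopic", 3, 100) exhausts all attempts within 300 ms and returns None, while the same call with a backoff of 1000 succeeds on the second attempt. That matches the shape of the two log excerpts, although the real election time in Andras's setup is not known from the thread.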