From: Ron Crocker
Subject: Re: Question about the behavior of TM when it lost the zookeeper client session in HA mode
Date: Wed, 18 Jul 2018 14:30:20 -0700
To: user@flink.apache.org, Tony Wei
Cc: Tony Wei
Message-Id: <2B0AF483-4A7B-4E55-80BC-E706C675714F@newrelic.com>

I just stumbled on this same problem without any associated ZK issues. We had a Kafka broker fail that caused this issue:

2018-07-18 02:48:13,497 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Sink: Produce: <output_topic_name> (2/4) (7e7d61b286d90c51bbd20a15796633f2) switched from RUNNING to FAILED.
java.lang.Exception: Failed to send data to Kafka: The server disconnected before a response was received.
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.checkErroneous(FlinkKafkaProducerBase.java:373)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010.invoke(FlinkKafkaProducer010.java:288)
	at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
	at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:207)
	at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.

This is the kind of error we should be robust to: the Kafka cluster will (reasonably quickly) recover and elect a new leader for the affected partition (in this case, partition #2).
Maybe retries should be the default configuration? I believe the client uses the Kafka defaults (acks=0, retries=0), but we typically run with acks=1 (or all) and retries=MAX_INT. Do I need to do anything more than that to get a more robust producer?

Ron

> On May 16, 2018, at 7:45 PM, Tony Wei wrote:
> 
> Hi Ufuk, Piotr
> 
> Thanks for all of your replies. I knew that jobs are cancelled if the JM loses the connection to ZK, but the JM didn't lose its connection in my case.
> My job failed because of the exception from the KafkaProducer. However, the TM lost its ZK connection both before and after that exception.
> So, as Piotr said, it looks like an error in the Kafka producer, and I will pay more attention to it to see if something unexpected happens again.
> 
> Best Regards,
> Tony Wei
> 
> 2018-05-15 19:56 GMT+08:00 Piotr Nowojski:
> Hi,
> 
> It looks like there was an error in the asynchronous job of sending the records to Kafka. Probably this is collateral damage from losing the connection to ZooKeeper.
> 
> Piotrek
> 
>> On 15 May 2018, at 13:33, Ufuk Celebi wrote:
>> 
>> Hey Tony,
>> 
>> thanks for the detailed report.
>> 
>> - In Flink 1.4, jobs are cancelled if the JM loses the connection to ZK, and recovered when the connection is re-established (and one JM becomes leader again).
>> 
>> - Regarding the KafkaProducer: I'm not sure from the log message whether Flink closes the KafkaProducer because the job is cancelled or because there is a connectivity issue to the Kafka cluster. Including Piotr (cc) in this thread, who has worked on the KafkaProducer in the past. If it is a connectivity issue, it might also explain why you lost the connection to ZK.
>> 
>> Glad to hear that everything is back to normal. Keep us updated if something unexpected happens again.
>> 
>> – Ufuk
>> 
>> 
>> On Tue, May 15, 2018 at 6:28 AM, Tony Wei wrote:
>> Hi all,
>> 
>> I restarted the cluster, changed the log level to DEBUG, and raised the parallelism of my streaming job from 32 to 40.
>> However, the problem just disappeared and I don't know why.
>> I will keep these settings for a while. If the error happens again, I will bring back more information for help. Thank you.
>> 
>> Best Regards,
>> Tony Wei
>> 
>> 2018-05-14 14:24 GMT+08:00 Tony Wei:
>> Hi all,
>> 
>> After I changed `high-availability.zookeeper.client.session-timeout` and `maxSessionTimeout` to 120000 ms, the exception still occurred.
>> 
>> Here is the log snippet. It seems this has nothing to do with the ZooKeeper client timeout, but I still don't know why the Kafka producer would be closed without any task state change.
>> 
>> ```
>> 2018-05-14 05:18:53,468 WARN  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Client session timed out, have not heard from server in 82828ms for sessionid 0x305f957eb8d000a
>> 2018-05-14 05:18:53,468 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Client session timed out, have not heard from server in 82828ms for sessionid 0x305f957eb8d000a, closing socket connection and attempting reconnect
>> 2018-05-14 05:18:53,571 INFO  org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager  - State change: SUSPENDED
>> 2018-05-14 05:18:53,574 WARN  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.
>> 2018-05-14 05:18:53,850 WARN  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/mnt/jaas-466390940757021791.conf'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
>> 2018-05-14 05:18:53,850 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Opening socket connection to server XXX.XXX.XXX.XXX:2181
>> 2018-05-14 05:18:53,852 ERROR org.apache.flink.shaded.curator.org.apache.curator.ConnectionState  - Authentication failed
>> 2018-05-14 05:18:53,853 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Socket connection established to XXX.XXX.XXX.XXX:2181, initiating session
>> 2018-05-14 05:18:53,859 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Session establishment complete on server XXX.XXX.XXX.XXX:2181, sessionid = 0x305f957eb8d000a, negotiated timeout = 120000
>> 2018-05-14 05:18:53,860 INFO  org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager  - State change: RECONNECTED
>> 2018-05-14 05:18:53,860 INFO  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Connection to ZooKeeper was reconnected. Leader retrieval can be restarted.
>> 2018-05-14 05:28:54,781 INFO  org.apache.kafka.clients.producer.KafkaProducer  - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
>> 2018-05-14 05:28:54,829 INFO  org.apache.kafka.clients.producer.KafkaProducer  - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
>> 2018-05-14 05:28:54,918 INFO  org.apache.flink.runtime.taskmanager.Task  - match-rule -> (get-ordinary -> Sink: kafka-sink, get-cd -> Sink: kafka-sink-cd) (1/32) (e3462ff8bb565bb0cf4de49ffc2595fb) switched from RUNNING to FAILED.
>> java.lang.Exception: Failed to send data to Kafka: The server disconnected before a response was received.
>> 	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.checkErroneous(FlinkKafkaProducerBase.java:373)
>> 	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010.invoke(FlinkKafkaProducer010.java:288)
>> 	at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:464)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:441)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:415)
>> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:831)
>> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:809)
>> 	at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:464)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:441)
>> 	at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:415)
>> 	at org.apache.flink.streaming.api.collector.selector.CopyingDirectedOutput.collect(CopyingDirectedOutput.java:62)
>> 	at org.apache.flink.streaming.api.collector.selector.CopyingDirectedOutput.collect(CopyingDirectedOutput.java:34)
>> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:831)
>> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:809)
>> 	at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>> 	at com.appier.rt.rt_match.flink.operator.MatchRuleOperator$$anonfun$flatMap1$4.apply(MatchRuleOperator.scala:39)
>> 	at com.appier.rt.rt_match.flink.operator.MatchRuleOperator$$anonfun$flatMap1$4.apply(MatchRuleOperator.scala:38)
>> 	at scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
>> 	at scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
>> 	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>> 	at scala.collection.immutable.Map$Map2.foreach(Map.scala:137)
>> 	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
>> 	at scala.collection.MapLike$MappedValues.foreach(MapLike.scala:245)
>> 	at com.appier.rt.rt_match.flink.operator.MatchRuleOperator.flatMap1(MatchRuleOperator.scala:38)
>> 	at com.appier.rt.rt_match.flink.operator.MatchRuleOperator.flatMap1(MatchRuleOperator.scala:14)
>> 	at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
>> 	at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:243)
>> 	at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:91)
>> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
>> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>> 	at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.
>> ```
>> 
>> Best Regards,
>> Tony Wei
>> 
>> 2018-05-14 11:36 GMT+08:00 Tony Wei:
>> Hi all,
>> 
>> Recently, my Flink job met a problem that caused the job to fail and restart.
>> 
>> The log is in this screen snapshot
>> 
>> <exception.png>
>> 
>> or this
>> 
>> ```
>> 2018-05-11 13:21:04,582 WARN  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Client session timed out, have not heard from server in 61054ms for sessionid 0x3054b165fe2006a
>> 2018-05-11 13:21:04,583 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - Client session timed out, have not heard from server in 61054ms for sessionid 0x3054b165fe2006a, closing socket connection and attempting reconnect
>> 2018-05-11 13:21:04,683 INFO  org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager  - State change: SUSPENDED
>> 2018-05-11 13:21:04,686 WARN  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.
>> 2018-05-11 13:21:04,689 INFO  org.apache.kafka.clients.producer.KafkaProducer  - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
>> 2018-05-11 13:21:04,694 INFO  org.apache.kafka.clients.producer.KafkaProducer  - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
>> 2018-05-11 13:21:04,698 INFO  org.apache.flink.runtime.taskmanager.Task  - match-rule -> (get-ordinary -> Sink: kafka-sink, get-cd -> Sink: kafka-sink-cd) (4/32) (65a4044ac963e083f2635fe24e7f2403) switched from RUNNING to FAILED.
>> java.lang.Exception: Failed to send data to Kafka: The server disconnected before a response was received.
>> ```
>> 
>> Logs showed `org.apache.kafka.clients.producer.KafkaProducer - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.` This timeout value is Long.MAX_VALUE; it appears when someone calls `producer.close()`.
>> 
>> I also saw the log lines `org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 61054ms for sessionid 0x3054b165fe2006a, closing socket connection and attempting reconnect`
>> and `org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.`
>> 
>> I have checked ZooKeeper and Kafka and there was no error during that period.
>> I was wondering whether the TM stops its tasks when it loses its ZooKeeper client session in HA mode. Since I didn't see any document or mailing thread discussing this, I'm not sure if this is the reason the Kafka producer was closed.
>> Could someone who knows HA well help? Or does someone know what happened in my job?
>> 
>> My Flink cluster version is 1.4.0, with 2 masters and 10 slaves. My ZooKeeper cluster version is 3.4.11, with 3 nodes.
>> `high-availability.zookeeper.client.session-timeout` is the default value: 60000 ms.
>> `maxSessionTimeout` in zoo.cfg is 40000 ms.
>> I already changed maxSessionTimeout to 120000 ms this morning.
>> 
>> This problem happened many times over the last weekend and made my Kafka log delay grow. Please help me. Thank you very much!
>> 
>> Best Regards,
>> Tony Wei
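[Editor's note] The acks/retries settings Ron asks about are passed to the Flink Kafka connector through ordinary producer `Properties`. A minimal sketch of the more robust settings he describes (acks=1 or all, retries=MAX_INT); the broker address, topic, and helper name here are illustrative, not from the thread:

```java
import java.util.Properties;

public class ProducerSettings {

    // Builds producer properties with the settings discussed above,
    // instead of relying on the Kafka defaults (acks and retries).
    static Properties robustProducerProperties(String brokers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", brokers);
        // Wait for acknowledgement from all in-sync replicas before
        // treating a send as successful ("1" would wait for the leader only).
        props.setProperty("acks", "all");
        // Retry transient failures such as the NetworkException in the
        // stack traces above, rather than failing the task immediately.
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Pause between retries so a recovering broker is not hammered.
        props.setProperty("retry.backoff.ms", "100");
        return props;
    }

    public static void main(String[] args) {
        Properties props = robustProducerProperties("broker1:9092");
        System.out.println(props);
    }
}
```

These `Properties` would then be handed to the `FlinkKafkaProducer010` constructor along with the topic name and serialization schema.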
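[Editor's note] The two timeout knobs Tony changed live on different sides of the connection: one in Flink's configuration, one in the ZooKeeper server's. A sketch of the change described in the thread (values from the thread; file layout assumed typical):

```
# flink-conf.yaml — session timeout Flink's ZooKeeper client requests (ms)
high-availability.zookeeper.client.session-timeout: 120000

# zoo.cfg — the ZooKeeper server caps negotiable session timeouts.
# The default maxSessionTimeout is 20 x tickTime (40000 ms with the
# default tickTime of 2000 ms, matching the 40000 ms Tony saw), so the
# server cap must also be raised or the client's request is negotiated
# down; the "negotiated timeout = 120000" log line above shows both
# sides agreeing after the change.
maxSessionTimeout=120000
```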
