Subject: Re: Flink Checkpoint runs slow for low load stream
From: Chakravarthy varaga <chakravarthyvp@gmail.com>
Date: Tue, 4 Oct 2016 18:20:12 +0100
To: user@flink.apache.org

Thanks for your prompt response, Stephan.
I'd wait for Flink 1.1.3!!!

Best Regards
Varaga

On Tue, Oct 4, 2016 at 5:36 PM, Stephan Ewen <sewen@apache.org> wrote:
The plan is to release 1.1.3 asap ;-)

Waiting for the last backported patches to get in, then release testing and release.

If you want to test it today, you would need to manually build the release-1.1 branch.

Best,
= Stephan

On Tue, Oct 4, 2016 at 5:46 PM, Chakravarthy varaga <chakravarthyvp@gmail.com> wrote:
Hi Gordon,

Do I need to clone and build the release-1.1 branch to test this?
I currently use the Flink 1.1.2 runtime. When is the plan to release it in 1.1.3?

Best Regards
Varaga

On Tue, Oct 4, 2016 at 9:25 AM, Tzu-Li (Gordon) Tai <tzulitai@apache.org> wrote:
Hi,

Helping out here: this is the PR for async Kafka offset committing - https://github.com/apache/flink/pull/2574.
It has already been merged into the master and release-1.1 branches, so you can try out the changes now if you'd like.
The change should also be included in the 1.1.3 release, which the Flink community is discussing releasing soon.
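
If you do build the branch yourself, a hedged sketch of how the job dependency might then look (the exact snapshot version is whatever the release-1.1 pom declares - the version shown here is hypothetical):

<!-- Hypothetical sketch: after "mvn clean install" on the release-1.1 branch,
     point the job at the locally installed snapshot artifacts. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.9_2.10</artifactId>
    <version>1.1-SNAPSHOT</version>
</dependency>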

It will definitely be helpful if you can provide feedback afterwards!

Best Regards,
Gordon


On October 3, 2016 at 9:40:14 PM, Chakravarthy varaga (chakravarthyvp@gmail.com) wrote:

Hi Stephan,

Is the async Kafka offset commit released in 1.1.3?

Varaga

On Wed, Sep 28, 2016 at 9:49 AM, Chakravarthy varaga <chakravarthyvp@gmail.com> wrote:
Hi Stephan,

That should be great. Let me know once the fix is done and the snapshot version to use; I'll check and report back then.
Can you also share the JIRA that tracks the issue?

With regard to the offset commit issue, I'm not sure how to proceed here. I'll probably use your fix first and see if the problem reoccurs.

Thanks much
Varaga

On Tue, Sep 27, 2016 at 7:46 PM, Stephan Ewen <sewen@apache.org> wrote:
@CVP

In your case, Flink stores in checkpoints only the Kafka offsets (a few bytes) and the custom state (e).

Here is an illustration of the checkpoint and what is stored (from the Flink docs).
https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html

I am quite puzzled why the offset committing problem occurs only for one input, and not for the other.
I am preparing a fix for 1.2, possibly going into 1.1.3 as well.
Could you try out a snapshot version to see if that fixes your problem?

Greetings,
Stephan



On Tue, Sep 27, 2016 at 2:24 PM, Chakravarthy varaga <chakravarthyvp@gmail.com> wrote:
Hi Stephan,

Thanks a million for your detailed explanation. I appreciate it.

- The ZooKeeper bundled with Kafka 0.9.0.1 was used to start ZooKeeper. There is only 1 instance (standalone) of ZooKeeper running on my localhost (Ubuntu 14.04).
- There is only 1 Kafka broker (version: 0.9.0.1).

With regard to the Flink cluster, there's only 1 JM & 2 TMs started with no HA. I presume this does not use ZooKeeper anyway, as it runs as a standalone cluster.


BTW, the Kafka connector version that I use is as suggested in the Flink connectors page:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.9_2.10</artifactId>
    <version>1.1.1</version>
</dependency>


Do you see any issues with the versions?

1) Do you have benchmarks with regard to checkpointing in Flink?

2) There isn't a detailed explanation of what state is stored as part of the checkpointing process. For example, if I have a pipeline like source -> map -> keyBy -> map -> sink, my assumption on what's stored is:
        a) The source stream's custom watermarked records
        b) Intermediate states of each of the transformations in the pipeline
        c) Delta of records stored since the previous sink
        d) Custom states (say ValueState, as in my case) - essentially this is what I care about storing.
        e) All of my operators

Is my understanding right?

3) Is there a way in Flink to checkpoint only d) as stated above?

4) Can you apply checkpointing to only certain streams and operators (say I wish to store aggregated values as part of the transformation)?

Best Regards
CVP


On Mon, Sep 26, 2016 at 6:18 PM, Stephan Ewen <sewen@apache.org> wrote:
Thanks, the logs were very helpful!

TL;DR - The offset committing to ZooKeeper is very slow and prevents checkpoints from starting properly.

Here is what is happening in detail:

- Between the point when the TaskManager receives the "trigger checkpoint" message and the point when the KafkaSource actually starts the checkpoint, a long time passes (many seconds) - for one of the Kafka inputs (the other is fine).
- The only way this delay can be introduced is if another checkpoint-related operation (such as trigger() or notifyComplete()) is still in progress when the checkpoint is started. Flink does not perform concurrent checkpoint operations on a single operator, to ease the concurrency model for users.
- The operation that is still in progress must be the committing of the offsets (to ZooKeeper or Kafka). That also explains why this only happens once one side receives the first record. Before that, there is nothing to commit.


What Flink should fix:
- The KafkaConsumer should run the commit operations asynchronously, so as not to block the "notifyCheckpointComplete()" method (see the sketch just below).
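
For illustration, a minimal sketch of that pattern (hypothetical names; a single-threaded executor keeps commits ordered - this is NOT the actual patch in PR 2574):

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncOffsetCommitter {
    // One background thread preserves commit order without blocking checkpoints.
    private final ExecutorService committer = Executors.newSingleThreadExecutor();

    // Called when a checkpoint completes; returns immediately.
    void commitAsync(final Map<Integer, Long> partitionOffsets) {
        committer.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    commitToZooKeeper(partitionOffsets); // the slow, blocking write
                } catch (Exception e) {
                    // Log and continue: Flink's checkpoints do not depend on these offsets.
                }
            }
        });
    }

    private void commitToZooKeeper(Map<Integer, Long> offsets) throws Exception {
        // Placeholder for the actual ZooKeeper/Kafka commit call.
    }
}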

What you can fix:
- Have a look at your Kafka/ZooKeeper setup. One Kafka input works well, the other does not. Do they go against different sets of brokers, or different ZooKeepers? Is the metadata for one input bad?
- In the next Flink version, you may opt out of committing offsets to Kafka/ZooKeeper altogether. It is not important for Flink's checkpoints anyway (sketch below).
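
For reference, a sketch of what that opt-out might look like, assuming the setCommitOffsetsOnCheckpoints(boolean) switch on the Kafka consumer that arrived in later releases (not available in 1.1; topic and group names here are hypothetical):

import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

Properties kafkaProps = new Properties();
kafkaProps.setProperty("bootstrap.servers", "localhost:9092");
kafkaProps.setProperty("group.id", "cvp-test"); // hypothetical group id

FlinkKafkaConsumer09<String> consumer =
        new FlinkKafkaConsumer09<>("ks1-topic", new SimpleStringSchema(), kafkaProps); // hypothetical topic
// Offsets are then no longer written back to Kafka/ZooKeeper on checkpoints;
// Flink's own checkpointed offsets remain the source of truth on recovery.
consumer.setCommitOffsetsOnCheckpoints(false);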

Greetings,
Stephan


On Mon, Sep 26, 2016 at 5:13 PM, Chakravarthy varaga <chakravarthyvp@gmail.com> wrote:
Hi Stephan,

Please find my responses below.

- What source are you using for the slow input?
[CVP] - Both streams, as pointed out in my first mail, are Kafka streams.
- How large is the state that you are checkpointing?
[CVP] - I have enabled checkpointing on the StreamExecutionEnvironment as below.
final StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
// State snapshots go to the local filesystem; a checkpoint is triggered every 10 seconds.
streamEnv.setStateBackend(new FsStateBackend("file:///tmp/flink/checkpoints"));
streamEnv.enableCheckpointing(10000);
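
(Side note, not from the original setup: if checkpoints hang, the checkpoint config exposes knobs to bound them - a minimal sketch assuming the 1.x CheckpointConfig API:)

streamEnv.getCheckpointConfig().setCheckpointTimeout(60000);         // fail checkpoints stuck longer than 60s
streamEnv.getCheckpointConfig().setMinPauseBetweenCheckpoints(5000); // enforce a pause between checkpoints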

In terms of the state stored, the KS1 stream has a payload of 100K events/second, while KS2 has about 1 event every 10 minutes... basically the operators perform flatMaps on a tuple of 8 fields (all fields are primitives). If you look at the states' sizes in the dashboard, they are in KB...

- Can you try to see in the log whether the state snapshot actually takes that long, or whether it simply takes long for the checkpoint barriers to travel through the stream due to a lot of backpressure?
[CVP] - There is no backpressure, at least from the sample computation in the Flink dashboard. 100K/second is a low load by Flink's benchmarks. I could not quite get the barriers vs. snapshot state distinction. I have attached the TaskManager log (DEBUG) info if that will interest you.

I have attached the checkpoint times as a .png from the dashboard. Basically, if you look at checkpoint IDs 28, 29 & 30, you'd see that the checkpoints take more than a minute in each case. Before these checkpoints, the KS2 stream did not have any events. As soon as an event (should be in bytes) was generated, the checkpoints went slow, and subsequently took a minute more for every checkpoint thereafter.

This log was collected from the standalone Flink cluster with 1 job manager & 2 TMs. 1 TM was running this application with checkpointing (parallelism=1).

Please let me know if you need further info.

On Fri, Sep 23, 2016 at 6:26 PM, Stephan Ewen <sewen@apache.org> wrote:
Hi!

Let's try to figure that one out. Can you give us a bit more information?

- What source are you using for the slow input?
- How large is the state that you are checkpointing?
- Can you try to see in the log whether the state snapshot actually takes that long, or whether it simply takes long for the checkpoint barriers to travel through the stream due to a lot of backpressure?

Greetings,
Stephan



On Fri, Sep 23, 2016 at 3:35 PM, Fabian Hueske <fhueske@gmail.com> wrote:
Hi CVP,

I'm not that familiar with the internals of the checkpointing system, but maybe Stephan (in CC) has an idea of what's going on here.

Best, Fabian

2016-09-23 11:33 GMT+02:00 Chakravarthy varaga <chakravarthyvp@gmail.com>:
Hi Aljoscha & Fabian,

I have a stream application that has 2 stream sources, as below.

KeyedStream<String, String> ks1 = ds1.keyBy("*");
KeyedStream<Tuple2<String, V>, String> ks2 = ds2.flatMap(split T into k-v pairs).keyBy(0);

ks1.connect(ks2).flatMap(X);
// X is a CoFlatMapFunction that inserts and removes elements from ks2 into a
// key-value state member. Elements from ks1 are matched against that state.
// The CoFlatMapFunction operator maintains ValueState<Tuple2<Long, Long>>.

// ks1 is streaming about 100K events/sec from a Kafka topic.
// ks2 is streaming about 1 event every 10 minutes... Precisely when the 1st
// event is consumed from this stream, the checkpoint takes 2 minutes straight away.
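
For concreteness, a minimal sketch of what X might look like (hypothetical names and types, assuming the Flink 1.1 keyed-state API - not the actual job code):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

public class MatchFunction extends RichCoFlatMapFunction<String, Tuple2<String, Long>, String> {

    private transient ValueState<Tuple2<Long, Long>> state;

    @Override
    public void open(Configuration parameters) {
        // This per-key value is the only custom data the checkpoint has to store.
        ValueStateDescriptor<Tuple2<Long, Long>> descriptor = new ValueStateDescriptor<>(
                "ks2-state",
                TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}),
                null);
        state = getRuntimeContext().getState(descriptor);
    }

    @Override
    public void flatMap1(String ks1Event, Collector<String> out) throws Exception {
        // High-volume side: match each event against the state built from ks2.
        if (state.value() != null) {
            out.collect(ks1Event);
        }
    }

    @Override
    public void flatMap2(Tuple2<String, Long> ks2Event, Collector<String> out) throws Exception {
        // Low-volume side: insert/refresh the state for this key.
        state.update(new Tuple2<>(ks2Event.f1, System.currentTimeMillis()));
    }
}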

The version of Flink is 1.1.2.

I tried to checkpoint every 10 secs using an FsStateBackend... What I notice is that the checkpoint duration is almost 2 minutes in many cases, while in other cases it varies frequently from 100 ms to 1.5 minutes. I'm attaching a snapshot of the dashboard for your reference.

Is this an issue with Flink checkpointing?

Best Regards
CVP
