From: Stephan Ewen <sewen@apache.org>
Date: Tue, 14 Mar 2017 17:39:16 +0100
Subject: Re: Checkpointing with RocksDB as statebackend
To: user@flink.apache.org

The issue in Flink is https://issues.apache.org/jira/browse/FLINK-5756

On Tue, Mar 14, 2017 at 3:40 PM, Stefan Richter <s.richter@data-artisans.com> wrote:

> Hi Vinay,
>
> I think the issue is tracked here: https://github.com/facebook/rocksdb/issues/1988.
>
> Best,
> Stefan
>
> On 14.03.2017 at 15:31, Vishnu Viswanath <vishnu.viswanath25@gmail.com> wrote:
>
> Hi Stephan,
>
> Is there a ticket number/link to track this? My job has all the conditions you mentioned.
>
> Thanks,
> Vishnu
>
> On Tue, Mar 14, 2017 at 7:13 AM, Stephan Ewen wrote:
>
>> Hi Vinay!
>>
>> We just discovered a bug in RocksDB. The bug affects windows without reduce() or fold(), windows with evictors, and ListState.
>>
>> A certain access pattern in RocksDB becomes so slow beyond a certain size per key that it basically brings down the streaming program and the snapshots.
>>
>> We are reaching out to the RocksDB folks and looking for workarounds in Flink.
>>
>> Greetings,
>> Stephan
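For reference, a minimal sketch of the pattern that avoids the problematic ListState growth: a window with an incremental reduce() keeps a single value per key instead of buffering every element. The stream contents and window size below are placeholder assumptions, not taken from the job discussed here.

// Minimal sketch: a window with an incremental reduce() keeps per-key state
// to one value, instead of buffering the whole window contents in ListState
// (the access pattern hit by the RocksDB slowdown described above).
// The input elements and window size are made-up placeholders.
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class IncrementalWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(Tuple2.of("a", 1L), Tuple2.of("b", 2L), Tuple2.of("a", 3L))
            .keyBy(0)
            .timeWindow(Time.minutes(10))
            // elements are folded into a single running value as they arrive
            .reduce(new ReduceFunction<Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
                    return Tuple2.of(a.f0, a.f1 + b.f1);
                }
            })
            .print();

        env.execute("incremental window sketch");
    }
}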
>>
>> On Wed, Mar 1, 2017 at 12:10 PM, Stephan Ewen <sewen@apache.org> wrote:
>>
>>> @vinay Can you try to not set the buffer timeout at all? I am actually not sure what the effect of setting it to a negative value would be; that can be a cause of problems...
>>>
>>> On Mon, Feb 27, 2017 at 7:44 PM, Seth Wiesman <swiesman@mediamath.com> wrote:
>>>
>>>> Vinay,
>>>>
>>>> The bucketing sink performs rename operations during the checkpoint, and if it tries to rename a file that is not yet consistent, that causes a FileNotFound exception, which fails the checkpoint.
>>>>
>>>> Stephan,
>>>>
>>>> Currently my AWS fork contains some very specific assumptions about the pipeline that will in general only hold for my pipeline. This is because there were still some open questions I had about how to solve consistency issues in the general case. I will comment on the Jira issue with more specifics.
>>>>
>>>> Seth Wiesman
>>>>
>>>> From: vinay patil <vinay18.patil@gmail.com>
>>>> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
>>>> Date: Monday, February 27, 2017 at 1:05 PM
>>>> To: "user@flink.apache.org" <user@flink.apache.org>
>>>> Subject: Re: Checkpointing with RocksDB as statebackend
>>>>
>>>> Hi Seth,
>>>>
>>>> Thank you for your suggestion.
>>>>
>>>> But if the issue is only related to S3, then why does this also happen when I replace the S3 sink with HDFS (for checkpointing I am using HDFS only)?
>>>>
>>>> Stephan,
>>>>
>>>> Another issue I see is when I set env.setBufferTimeout(-1) and keep the checkpoint interval at 10 minutes: I have observed that nothing gets written to the sink (tried with S3 as well as HDFS); at least I was expecting pending files here.
>>>>
>>>> This issue gets worse when checkpointing is disabled, as nothing is written.
>>>>
>>>> Regards,
>>>> Vinay Patil
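A minimal sketch of the configuration being discussed, assuming checkpoints go to HDFS and the buffer timeout is simply left at its default rather than set to -1; the HDFS URI and the intervals are placeholders.

// Sketch only: RocksDB state backend with checkpoints on HDFS, a 10-minute
// checkpoint interval, and the network buffer timeout left at its default
// (per the suggestion above, rather than env.setBufferTimeout(-1)).
// The HDFS URI is a placeholder.
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // keyed state lives in RocksDB; snapshots go to HDFS, not S3
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

        // checkpoint every 10 minutes
        env.enableCheckpointing(10 * 60 * 1000L);

        // note: no env.setBufferTimeout(-1) here; the default is kept

        // a trivial placeholder pipeline so the sketch actually runs
        env.fromElements(1, 2, 3).print();

        env.execute("checkpoint config sketch");
    }
}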
>>>>
>>>> On Mon, Feb 27, 2017 at 10:55 PM, Stephan Ewen [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:
>>>>
>>>> Hi Seth!
>>>>
>>>> Wow, that is an awesome approach.
>>>>
>>>> We have actually seen these issues as well, and we are looking to eventually implement our own S3 file system (and circumvent Hadoop's S3 connector that Flink currently relies on): https://issues.apache.org/jira/browse/FLINK-5706
>>>>
>>>> Do you think your patch would be a good starting point for that, and would you be willing to share it?
>>>>
>>>> The Amazon AWS SDK for Java is Apache 2 licensed, so it is possible to fork officially, if necessary...
>>>>
>>>> Greetings,
>>>> Stephan
>>>>
>>>> On Mon, Feb 27, 2017 at 5:15 PM, Seth Wiesman <[hidden email]> wrote:
>>>>
>>>> Just wanted to throw in my 2 cents.
>>>>
>>>> I've been running pipelines with a similar state size, using RocksDB, externalizing checkpoints to S3 and bucketing to S3. I was getting stalls like this and ended up tracing the problem to S3 and the bucketing sink. The solution was two-fold:
>>>>
>>>> 1) I forked hadoop-aws and have it treat Flink as a source of truth. EMR uses a DynamoDB table to determine if S3 is inconsistent. Instead, I say that if Flink believes a file exists on S3 and we don't see it, then I am going to trust that Flink is in a consistent state and S3 is not. In this case, various operations will perform a back-off and retry up to a certain number of times.
>>>>
>>>> 2) The bucketing sink performs multiple renames over the lifetime of a file, occurring when a checkpoint starts and then again on notification after it completes. Due to S3's consistency guarantees, the second rename of a file can never be assured to work and will eventually fail, either during or after a checkpoint. Because there is no upper bound on the time it takes for a file on S3 to become consistent, retries cannot solve this specific problem, as a rename could take upwards of many minutes and would stall the entire pipeline. The only viable solution I could find was to write a custom sink which understands S3. Each writer writes its file locally and then copies it to S3 on checkpoint. By only interacting with S3 once per file, it circumvents the consistency issues altogether.
>>>>
>>>> Hope this helps,
>>>>
>>>> Seth Wiesman
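A heavily simplified sketch of that write-locally-then-upload-on-checkpoint idea; it is not the actual fork or sink described above, and it omits recovery, exactly-once handling, and proper part-file naming. The bucket name and key prefix are placeholders.

// Heavily simplified sketch of a "write locally, upload to S3 on checkpoint"
// sink, in the spirit of the approach described above. Not production code:
// it ignores recovery, exactly-once semantics, and part-file naming, and the
// bucket/prefix values are placeholders.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.io.BufferedWriter;
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class LocalThenS3Sink extends RichSinkFunction<String> implements CheckpointedFunction {

    private transient AmazonS3 s3;
    private transient File currentFile;
    private transient BufferedWriter writer;
    private long partCounter;

    @Override
    public void open(Configuration parameters) throws Exception {
        s3 = AmazonS3ClientBuilder.defaultClient();
        startNewFile();
    }

    @Override
    public void invoke(String value) throws Exception {
        // records are buffered in a local temp file between checkpoints
        writer.write(value);
        writer.newLine();
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // on checkpoint, close the local file and upload it to S3 once,
        // then start a fresh local file for the next interval
        writer.close();
        String key = "output/part-" + getRuntimeContext().getIndexOfThisSubtask() + "-" + (partCounter++);
        s3.putObject("my-placeholder-bucket", key, currentFile);
        currentFile.delete();
        startNewFile();
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // restore logic (re-uploading or discarding in-progress files) is omitted
    }

    private void startNewFile() throws Exception {
        currentFile = Files.createTempFile("flink-sink-", ".txt").toFile();
        writer = Files.newBufferedWriter(currentFile.toPath(), StandardCharsets.UTF_8);
    }
}

The point of this shape is that S3 sees each file exactly once (a single putObject per part), so rename and list-after-write consistency never come into play.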
>>>>
>>>> From: vinay patil <[hidden email]>
>>>> Reply-To: "[hidden email]" <[hidden email]>
>>>> Date: Saturday, February 25, 2017 at 10:50 AM
>>>> To: "[hidden email]" <[hidden email]>
>>>> Subject: Re: Checkpointing with RocksDB as statebackend
>>>>
>>>> Hi Stephan,
>>>>
>>>> Just to avoid confusion here: I am using an S3 sink for writing the data, and using HDFS for storing checkpoints.
>>>>
>>>> There are two core nodes (HDFS) and two task nodes on EMR.
>>>>
>>>> I replaced the S3 sink with HDFS for writing data in my last test.
>>>>
>>>> Let's say the checkpoint interval is 5 minutes. Within 5 minutes of running, the state size grows to 30 GB. After checkpointing, the 30 GB state that is maintained in RocksDB has to be copied to HDFS, right? Is this causing the pipeline to stall?
>>>>
>>>> Regards,
>>>> Vinay Patil
>>>>
>>>> On Sat, Feb 25, 2017 at 12:22 AM, Vinay Patil <[hidden email]> wrote:
>>>>
>>>> Hi Stephan,
>>>>
>>>> To verify whether S3 is making the pipeline stall, I have replaced the S3 sink with HDFS and set the minimum pause between checkpoints to 5 minutes; I still see the same issue with checkpoints failing.
>>>>
>>>> If I set the pause time to 20 seconds, all checkpoints complete; however, there is a hit to overall throughput.
>>>>
>>>> Regards,
>>>> Vinay Patil
>>>>
>>>> On Fri, Feb 24, 2017 at 10:09 PM, Stephan Ewen [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:
>>>>
>>>> Flink's state backends currently do a good number of "make sure this exists" operations on the file systems. Through Hadoop's S3 filesystem, that translates to S3 bucket list operations, where there is a limit on how many operations may happen per time interval. After that, S3 blocks.
>>>>
>>>> It seems that operations that are totally cheap on HDFS are hellishly expensive (and limited) on S3. It may be that you are affected by that.
>>>>
>>>> We are gradually trying to improve the behavior there and be more S3 aware.
>>>>
>>>> Both 1.3-SNAPSHOT and 1.2-SNAPSHOT already contain improvements there.
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>> On Fri, Feb 24, 2017 at 4:42 PM, vinay patil <[hidden email]> wrote:
>>>>
>>>> Hi Stephan,
>>>>
>>>> So do you mean that S3 is causing the stall? As I mentioned in my previous mail, I could not see any progress for 16 minutes because checkpoints kept failing continuously.
>>>>
>>>> On Feb 24, 2017 8:30 PM, "Stephan Ewen [via Apache Flink User Mailing List archive.]" <[hidden email]> wrote:
>>>>
>>>> Hi Vinay!
>>>>
>>>> True, the operator state (like Kafka) is currently not asynchronously checkpointed.
>>>>
>>>> While it is rather small state, we have seen before that on S3 it can cause trouble, because S3 frequently stalls uploads of data amounts even as low as kilobytes due to its throttling policies.
>>>>
>>>> That would be a super important fix to add!
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>> On Fri, Feb 24, 2017 at 2:58 PM, vinay patil <[hidden email]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have attached a snapshot for reference. As you can see, all 3 checkpoints failed; for checkpoint IDs 2 and 3 it is stuck at the Kafka source after 50%. (The data sent so far by Kafka source 1 is 65 GB and by source 2 is 15 GB.)
>>>>
>>>> Within 10 minutes, 15M records were processed, and for the next 16 minutes the pipeline was stuck; I don't see any progress beyond 15M because checkpoints keep failing consistently.
>>>>
>>>> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11882/Checkpointing_Failed.png>
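For reference, a minimal sketch of the checkpoint-pacing knobs that come up in this thread (interval, minimum pause between checkpoints, checkpoint timeout, concurrency); the values are illustrative placeholders, not tuning recommendations.

// Sketch of the checkpoint-pacing settings discussed above (interval,
// minimum pause between checkpoints, checkpoint timeout, concurrency).
// Values are illustrative placeholders, not tuning advice.
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointPacingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(5 * 60 * 1000L);          // checkpoint every 5 minutes

        CheckpointConfig cc = env.getCheckpointConfig();
        cc.setMinPauseBetweenCheckpoints(20 * 1000L);     // e.g. the 20-second pause mentioned above
        cc.setCheckpointTimeout(10 * 60 * 1000L);         // fail (rather than hang) a checkpoint after 10 minutes
        cc.setMaxConcurrentCheckpoints(1);                // never overlap checkpoints

        // a trivial placeholder pipeline so the sketch actually runs
        env.fromElements("a", "b", "c").print();

        env.execute("checkpoint pacing sketch");
    }
}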