From: vipul singh <neoeahit@gmail.com>
Date: Mon, 23 Oct 2017 23:22:58 -0700
Subject: Re: Questions about checkpoints/savepoints
To: Tony Wei <tony19920430@gmail.com>
Cc: Aljoscha Krettek <aljoscha@apache.org>, Stefan Richter <s.richter@data-artisans.com>, user <user@flink.apache.org>

Thanks Tony, that was the issue. I was thinking that when we use RocksDB
and provide an S3 path, it would use externalized checkpoints by default.
Thanks so much!

I have one follow-up question. Say, in the above case, I terminate the
cluster. Since the metadata is on S3 and not on local storage, does Flink
run into the read-after-write consistency limitations of S3? Is that a
valid concern, or do externalized checkpoints handle that case as well and
avoid file system listing operations when retrieving externalized
checkpoints from S3?

Thanks,
Vipul
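For reference, explicitly enabling externalized checkpoints with the Flink 1.3
Scala API looks roughly like the sketch below; the checkpoint interval, cleanup
mode, and S3 path are placeholders rather than values taken from this thread:

    import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup
    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Regular checkpointing only covers automatic failure recovery;
    // the interval here is a placeholder.
    env.enableCheckpointing(60000L)

    // Externalized checkpoints are not on by default; they must be enabled explicitly.
    // RETAIN_ON_CANCELLATION keeps the checkpoint (including its _metadata file) after
    // a manual cancel, DELETE_ON_CANCELLATION discards it.
    env.getCheckpointConfig.enableExternalizedCheckpoints(
      ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)

    // The metadata target directory is taken from state.checkpoints.dir in
    // flink-conf.yaml, e.g. (placeholder path):
    //   state.checkpoints.dir: s3://<bucket>/<base-path>/checkpoints/metadata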
On Mon, Oct 23, 2017 at 11:00 PM, Tony Wei <tony19920430@gmail.com> wrote:

> Hi,
>
> Did you enable externalized checkpoints? [1]
>
> Best,
> Tony Wei
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html#externalized-checkpoints
>
> 2017-10-24 13:07 GMT+08:00 vipul singh <neoeahit@gmail.com>:
>
>> Thanks Aljoscha for the answer above.
>>
>> I am experimenting with savepoints and checkpoints on my end, so that we
>> can build a fault-tolerant application with exactly-once semantics.
>>
>> I have been able to test various scenarios, but have doubts about one use
>> case.
>>
>> My app is running on an EMR cluster, and I am trying to test the case
>> when an EMR cluster is terminated. I have read that
>> *state.checkpoints.dir* is responsible for storing the metadata
>> information and links to the data files in *state.backend.fs.checkpointdir*.
>>
>> For my application I have configured both
>> *state.backend.fs.checkpointdir* and *state.checkpoints.dir*.
>>
>> I also have the following in my main app:
>>
>> env.enableCheckpointing(CHECKPOINT_TIME_MS)
>>
>> val CHECKPOINT_LOCATION = s"s3://${config.s3Bucket}/${config.s3BasePath}/${config.s3ExtensionPath}/checkpoints/rocksdb"
>>
>> val backend: RocksDBStateBackend =
>>   new RocksDBStateBackend(CHECKPOINT_LOCATION)
>>
>> env.setStateBackend(backend)
>> env.getCheckpointConfig.setMinPauseBetweenCheckpoints(CHECKPOINT_MIN_PAUSE)
>> env.getCheckpointConfig.setCheckpointTimeout(CHECKPOINT_TIMEOUT_MS)
>> env.getCheckpointConfig.setMaxConcurrentCheckpoints(CHECKPOINT_MAX_CONCURRENT)
>>
>> In the application startup logs I can see the
>> *state.backend.fs.checkpointdir* and *state.checkpoints.dir* values
>> being loaded. However, when a checkpoint happens I don't see any content in
>> the metadata dir. Is there something I am missing? Please let me know. I am
>> using Flink version 1.3.
>>
>> Thanks,
>> Vipul
>>
>> On Tue, Oct 10, 2017 at 7:55 AM, Aljoscha Krettek <aljoscha@apache.org>
>> wrote:
>>
>>> Hi,
>>>
>>> Flink does not rely on file system operations to list contents; all
>>> necessary file paths are stored in the metadata file, as you guessed. This
>>> is the reason savepoints also work with file systems that "only" have
>>> read-after-write consistency.
>>>
>>> Best,
>>> Aljoscha
>>>
>>>
>>> On 10. Oct 2017, at 03:01, vipul singh <neoeahit@gmail.com> wrote:
>>>
>>> Thanks Stefan for the answers above. These are really helpful.
>>>
>>> I have a few follow-up questions:
>>>
>>> 1. I see my savepoints are created in a folder which has a _metadata
>>>    file and another file. Looking at the code, it seems like the
>>>    _metadata file contains the task states, operator states and master
>>>    states. What is the purpose of the other file in the savepoint
>>>    folder? My guess is it should be a checkpoint file?
>>> 2. I am planning to use S3 as my state backend, so I want to ensure
>>>    that application restarts are not affected by the read-after-write
>>>    consistency of S3 (if I use S3 as a savepoint backend). I am curious
>>>    how Flink restores data from the _metadata file and the other file.
>>>    Does the _metadata file contain paths to these other files, or would
>>>    it do a listing on the S3 folder?
>>>
>>> Please let me know,
>>>
>>> Thanks,
>>> Vipul
>>>
>>> On Tue, Sep 26, 2017 at 2:36 AM, Stefan Richter <
>>> s.richter@data-artisans.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have answered your questions inline:
>>>>
>>>> 1. It seems to me that checkpoints can be treated as Flink's internal
>>>>    recovery mechanism, and savepoints act more as user-defined recovery
>>>>    points. Would that be a correct assumption?
>>>>
>>>> You could see it that way, but I would describe savepoints more as
>>>> user-defined *restart* points than *recovery* points. Please take a look at
>>>> my answers in this thread, because they cover most of your question:
>>>>
>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/difference-between-checkpoints-amp-savepoints-td14787.html
>>>>
>>>> 2. While cancelling an application with the -s option, it specifies the
>>>>    savepoint location. Is there a way during application startup to identify
>>>>    the last known savepoint from a folder by itself, and restart from there?
>>>>    Since I am saving my savepoints on S3, I want to avoid issues arising from
>>>>    an *ls* command on S3 due to the read-after-write consistency of S3.
>>>>
>>>> I don't think that this feature exists; you have to specify the
>>>> savepoint.
>>>>
>>>> 3. Suppose my application has a checkpoint at point t1, and say I
>>>>    cancel this application sometime in the future before the next available
>>>>    checkpoint (say t1+x). If I start the application without specifying the
>>>>    savepoint, it will start from the last known checkpoint (at t1), which won't
>>>>    have the application state saved, since I had cancelled the application.
>>>>    Would this be a correct assumption?
>>>>
>>>> If you restart a cancelled application it will not consider checkpoints.
>>>> They are only considered in recovery on failure. You need to specify a
>>>> savepoint or externalized checkpoint for restarts to make explicit that you
>>>> intend to restart a job, and not to run a new instance of the job.
>>>>
>>>> 4. Would using ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION
>>>>    be the same as manually saving regular savepoints?
>>>>
>>>> Not the same, because checkpoints and savepoints are different in
>>>> certain aspects, but both methods leave you with something that survives
>>>> job cancellation and can be used to restart from a certain state.
>>>>
>>>> Best,
>>>> Stefan
>>>
>>>
>>> --
>>> Thanks,
>>> Vipul
>>
>>
>> --
>> Thanks,
>> Vipul
>
>

--
Thanks,
Vipul
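For the cancel-with-savepoint and restart flow discussed above, the Flink 1.3
command line usage is roughly the following sketch; the job ID, jar name, and
S3 paths are placeholders:

    # Trigger a savepoint and cancel the running job in one step.
    bin/flink cancel -s s3://<bucket>/<base-path>/savepoints <jobId>

    # Or take a savepoint without cancelling the job.
    bin/flink savepoint <jobId> s3://<bucket>/<base-path>/savepoints

    # Resume a new run of the job from a specific savepoint
    # (or from a retained externalized checkpoint's metadata path).
    bin/flink run -s s3://<bucket>/<base-path>/savepoints/savepoint-<id> my-job.jar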