From: Juho Autio
Date: Fri, 3 May 2019 11:37:32 +0300
Subject: Re: Data loss when restoring from savepoint
To: Konstantin Knauf
Cc: Stefan Richter, user

Konstantin, thanks for providing the new code. Here are the latest results for jobs run with extended DEBUG logging.

20190427 (killed & restored), missing_rows.count(): 3470
20190428 (no kill / restore), missing_rows.count(): 0

I have shared the logs from the 27th (after restore) in private with Konstantin.

On Fri, Apr 26, 2019 at 5:05 PM Konstantin Knauf wrote:

Hi Juho,

sorry for not being more responsive the last two weeks, I was on vacation for a good part of it. The fact that this also happens with timers on RocksDB is again confusing. The code that we mainly had a look at so far is not used by the RocksDB configuration, so the inconsistencies that we saw in the logs don't apply to the RocksDB configuration.

Anyway, I agree to further track down the issue for the heap timers first, and then to move on to RocksDB. I have added more fine-grained logging to the branch [1]. The two additional classes, which you need to set the logging level to DEBUG for, are

org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl
org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy

Please run through the usual procedure of doing a savepoint and provide the logs during recovery.

Thank you for your perseverance,

Konstantin

[1] https://github.com/knaufk/flink/tree/logging-timers
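For reference, the full set of DEBUG loggers requested over the course of this thread can be enabled together. A log4j.properties sketch (the logger names are quoted verbatim from the messages in this thread; the grouping into one fragment and the comments are ours):

    # timer counts per window, logged on snapshot and restore (logging-timers branch)
    log4j.logger.org.apache.flink.streaming.api.operators.InternalTimerServiceImpl=DEBUG
    # state hand-off to the operators during recovery
    log4j.logger.org.apache.flink.runtime.state.TaskStateManagerImpl=DEBUG
    # finer-grained restore logging added later in the thread
    log4j.logger.org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl=DEBUG
    log4j.logger.org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy=DEBUG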
On Thu, Apr 18, 2019 at 4:06 PM Oytun Tez wrote:

Thanks for the update, Juho, and please do keep updating :) I've been watching the thread silently; I am sure your findings help many others who watch the thread.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oytun@motaword.com – www.motaword.com

On Thu, Apr 18, 2019 at 8:26 AM Juho Autio wrote:

In the meanwhile, some additional results, continued with the ROCKSDB timer service:

20190416 (no cancellation), missing_rows.count(): 0
20190417 (cancel with savepoint & restore), missing_rows.count(): 54

On Tue, Apr 16, 2019 at 2:35 PM Juho Autio wrote:

Ouch, we have a data loss case now also with the ROCKSDB timer service factory. This time the job had failed for some reason & restored a checkpoint by itself (I mean I didn't cancel with savepoint this time; the previous restore from savepoint was at 14-04-2019 06:21:45 UTC).

In this case the number of lost ids was quite high:

20190415, missing_rows.count(): 706605

I don't know if the ROCKSDB timer service is a factor towards higher instability, but indeed I'd like to go back to testing with InternalTimerServiceImpl as well. Will switch back to that when the updated branch is available. Also I'm not sure if the cause of data loss is similar now with the ROCKSDB timer service factory (lost timers or maybe something else), because we didn't have corresponding DEBUG logging for this implementation.

On Mon, Apr 15, 2019 at 11:27 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

this is good news indeed! I have had a look at the _metadata files and logs on Friday, and it looks like a) the timer state is contained in the savepoint files and b) the timer state is also initially read by the TaskStateManagerImpl, but then it is somehow lost before it reaches the InternalTimerServiceImpl. I will provide an updated version of my branch with more logging output to find the reason for this today or tomorrow. It would be great if you could test this again then.

Best,

Konstantin

On Mon, Apr 15, 2019 at 9:49 AM Juho Autio wrote:

Hi,

Great news: there's no data loss (for the 3 days so far that were run) with state.backend.rocksdb.timer-service.factory: ROCKSDB.

Each day the job was once cancelled with savepoint & restored.

20190412, missing_rows.count(): 0
20190413, missing_rows.count(): 0
20190414, missing_rows.count(): 0

Btw, now we don't get the DEBUG logs of org.apache.flink.streaming.api.operators.InternalTimerServiceImpl any more, so I didn't know how to check from the logs how many timers are restored. But based on the results I'm assuming that all were successfully restored.

We'll keep testing this a bit more, but it seems really promising indeed. I thought of at least letting it run for some days without cancellations, and on the other hand cancelling many times within the same day, etc.

Can I provide some additional debug logs or such to help find the bug when 'heap' is used for timers? Did you already analyze the _metadata files that I sent?

On Thu, Apr 11, 2019 at 4:21 PM Juho Autio wrote:

Shared the _metadata files also, in private.

The job is now running with state.backend.rocksdb.timer-service.factory: ROCKSDB. I started it from empty state because I wasn't sure whether this change would be migrated automatically(?). I guess a clean setup like this is a good idea anyway. The first day that is fully processed with this conf will be tomorrow (Friday), and results can be compared on the next day.. I'll report back on that on Monday. I verified from the Flink UI that the property is found in Configuration, but I still feel a bit unsure about whether it's actually being used. I wonder if there's some INFO level logging that could be checked to confirm that?

Thanks.
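As a flink-conf.yaml sketch, the configuration under test looks roughly like this (the timer-service key is the one quoted in this thread; the two surrounding keys are assumptions based on the job description given further down, i.e. RocksDB state backend with incremental checkpointing):

    state.backend: rocksdb
    state.backend.incremental: true
    # the setting under test; the default in Flink 1.6 is the heap-based timer service
    state.backend.rocksdb.timer-service.factory: ROCKSDB

On the INFO-logging question: Flink logs each key/value pair it reads from flink-conf.yaml at INFO level during startup ("Loading configuration property: ..."), which is one place to confirm the setting was picked up.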
On Thu, Apr 11, 2019 at 4:01 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

thank you. I will have a look at your logs later today or tomorrow. Could you also provide the metadata file of the savepoints in question? It is located in the parent directory of that savepoint and should follow this naming pattern: "savepoints_.*_savepoint_.*__metadata".

Best,

Konstantin

On Thu, Apr 11, 2019 at 2:39 PM Stefan Richter <s.richter@ververica.com> wrote:

No, it also matters for savepoints. I think the doc here is misleading; it is currently synchronous for all cases of RocksDB keyed state and heap timers.

Best,
Stefan

On 11. Apr 2019, at 14:30, Juho Autio wrote:

Thanks Till. Anyway, that's irrelevant in case of a savepoint, right?

On Thu, Apr 11, 2019 at 2:54 PM Till Rohrmann <trohrmann@apache.org> wrote:

Hi Juho,

yes, it means that the snapshotting of the timer state does not happen asynchronously but synchronously within the Task executor thread. During this operation, your operator won't make any progress, potentially causing backpressure for upstream operators.

If you want to use fully asynchronous snapshots while also using timer state, you should use the RocksDB backed timers.

Cheers,
Till

On Thu, Apr 11, 2019 at 10:32 AM Juho Autio wrote:

Ok, I'm testing state.backend.rocksdb.timer-service.factory: ROCKSDB in the meanwhile.

Btw, what does this actually mean (from https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html):

> The combination RocksDB state backend / with incremental checkpoint / with heap-based timers currently does NOT support asynchronous snapshots for the timers state. Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with FLINK-10026.

Is it just that snapshots are not asynchronous, so they cause some pauses? Does "not supported" here mean just some performance impact, or also correctness?

Our job at hand is using the RocksDB state backend and incremental checkpointing. However, at least the restores that we've been testing here have been from a *savepoint*, not an incremental checkpoint.

On Wed, Apr 10, 2019 at 4:46 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

one more thing we could try in a separate experiment is to change the timer state backend to RocksDB as well by setting

state.backend.rocksdb.timer-service.factory: ROCKSDB

in the flink-conf.yaml and see if this also leads to the loss of records. That would narrow it down quite a bit.

Best,

Konstantin

On Wed, Apr 10, 2019 at 1:02 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

sorry for the late reply. Please continue to use the custom Flink build and add additional logging for TaskStateManagerImpl by adding the following line to your log4j configuration:

log4j.logger.org.apache.flink.runtime.state.TaskStateManagerImpl=DEBUG

Afterwards, do a couple of savepoint & restore cycles until you see a number of restores < 80 as before, and share the logs with me (at least for TaskStateManagerImpl & InternalTimerServiceImpl).

Best,

Konstantin
On Thu, Apr 4, 2019 at 9:03 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

Thanks for the follow-up.

> There are only 76 lines for restore in Job 3 instead of 80. It would be very useful to know if these lines were lost by the log aggregation or really did not exist.

I fetched the actual taskmanager.log files to verify (we store the original files on s3), then did a grep for "InternalTimerServiceImpl - Restored".

This is for "job 1. (start - end) first restore with debug logging":
Around 2019-03-26 09:08:43,352 - 78 hits

This is for "job 3. (start-middle) 3rd restore with debug logging (following day)":
Around 2019-03-27 07:39:06,414 - 76 hits

So yeah, we can rely on our log delivery to Kibana.

Note that as a new piece of information I found that the same job also did an automatic restore from a checkpoint around 2019-03-30 20:36, and there were 79 hits instead of 80. So it doesn't seem to be a problem only in the case of savepoints; it can happen with a checkpoint restore as well.

> Were there any missing records in the output for the day of the Job 1 -> Job 2 transition (26th of March)?

20190326: missing 2592
20190327: missing 4270

This even matches with the fact that on the 26th 2 timers were missed in restore, but on the 27th it was 4.

What's next? :)
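The restore-count check described above is easy to script; a sketch, assuming the taskmanager.log files have been fetched into the current directory and the log format is the one shown elsewhere in this thread:

    # one "Restored:" line is expected per parallel subtask (here: parallelism 80)
    grep -h 'InternalTimerServiceImpl' taskmanager*.log | grep -c 'Restored:'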
On Thu, Apr 4, 2019 at 12:32 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

one thing that makes the log output a little bit hard to analyze is the fact that the "Snapshot" lines include savepoints as well as checkpoints. To identify the savepoints, I looked at the last 80 lines per job, which seems plausible given the timestamps of the lines.

So, let's compare the number of timers before and after restore:

Job 1 -> Job 2

23.091.002 event time timers for both. All timers for the same window. So this looks good.

Job 2 -> Job 3

18.565.234 timers during snapshotting. All timers for the same window.
17.636.774 timers during restore. All timers for the same window.

There are only 76 lines for restore in Job 3 instead of 80. It would be very useful to know if these lines were lost by the log aggregation or really did not exist.

Were there any missing records in the output for the day of the Job 1 -> Job 2 transition (26th of March)?

Best,

Konstantin

On Fri, Mar 29, 2019 at 2:21 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks,

I created a zip with these files:

job 1. (start - end) first restore with debug logging
job 2. (start-middle) second restore with debug logging (same day)
job 2. (middle - end) before savepoint & cancel (following day)
job 3. (start-middle) 3rd restore with debug logging (following day)

It can be downloaded here:
https://www.dropbox.com/s/33z0jbolueokao6/flink_debug_logs.zip?dl=0

On Thu, Mar 28, 2019 at 7:08 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

Yes, the number is the last number in the line. Feel free to share all lines.

Best,

Konstantin

On Thu, Mar 28, 2019 at 5:00 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin!

> I would be interested in any changes in the number of timers, not only the number of logged messages.

Sorry for the delay. I see – the count of timers is the last number on the log line. For example, for this row it's 270409:

> March 26th 2019, 11:08:39.822 DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl Restored: TimeWindow{start=1553558400000, end=1553644800000} -> 270409

The log lines don't contain a task id – how should they be compared across different snapshots? Or should I share all of these logs (at least a couple of snapshots around the point of restore) and you'll compare them?

Thanks.

On Tue, Mar 26, 2019 at 9:55 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I based the branch on top of the current 1.6.4 branch. I can rebase on 1.6.2 for any future iterations. I would be interested in any changes in the number of timers, not only the number of logged messages. The sum of all counts should be the same during snapshotting and restore. While a window is open, this number should always increase (when comparing multiple snapshots).

Best,

Konstantin
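That invariant – the summed timer counts should match between snapshot and restore – can be checked mechanically. A sketch against the log format shown elsewhere in this thread, where the count is the last field of each line:

    awk '/InternalTimerServiceImpl/ && /Snapshot:/ { snap += $NF }
         /InternalTimerServiceImpl/ && /Restored:/ { rest += $NF }
         END { print "snapshot total:", snap, "| restore total:", rest }' taskmanager*.log

(As pointed out above, the "Snapshot" lines mix savepoints and checkpoints, so the logs may first need to be narrowed down to a single snapshot/restore pair by timestamp.)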
On Tue, Mar 26, 2019 at 11:01 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

I got that debug logging working.

> You would now need to take a savepoint and restore sometime in the middle of the day and should be able to check
> a) if there are any timers for the very old windows, for which there is still some content lingering around

No timers for old windows were logged.

All timers are for the same time window, for example:

> March 26th 2019, 11:08:39.822 DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl Restored: TimeWindow{start=1553558400000, end=1553644800000} -> 270409

Those milliseconds correspond to Tue Mar 26 00:00:00 UTC 2019 – Wed Mar 27 00:00:00 UTC 2019, so this seems normal.

> b) if there are fewer timers after restore for the current window. The missing timers would be recreated as soon as any additional records for the same key arrive within the window. This means the number of missing records might be less than the number of missing timers.

Grepping for "Restored" gives 78 hits. That's suspicious, because this job's parallelism is 80. A corresponding grep for "Snapshot" gives the full 80 hits. Ok, actually that would match with what you wrote: "missing timers would be recreated, as soon as any additional records for the same key arrive within the window".

I tried killing & restoring once more. This time grepping for "Restored" gives 80 hits. Note that it's possible that some logs had been lost around the time of restoration, because I'm browsing the logs through Kibana (ELK stack).

I will try kill & restore again tomorrow around noon & collect the same info. Is there anything else that you'd like me to share?

By the way, it seems that your branch* is not based on the 1.6.2 release – why so? It probably doesn't matter, but in general it would be good to minimize the scope of changes. But let's roll with this for now; I don't want to build another package, because it seems like we're able to replicate the issue with this version :)

Thanks,
Juho

*) https://github.com/apache/flink/compare/release-1.6.2...knaufk:logging-timers
On Wed, Mar 20, 2019 at 2:20 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I created a branch [1] which logs the number of event time timers per namespace during snapshot and restore. Please refer to [2] to build Flink from sources.

You need to set the logging level to DEBUG for org.apache.flink.streaming.api.operators.InternalTimerServiceImpl. If you use log4j this is a one-liner in your log4j.properties:

log4j.logger.org.apache.flink.streaming.api.operators.InternalTimerServiceImpl=DEBUG

The only additional logs will be the lines added in the branch. The lines are of the following format (<namespace> -> <number of timers>), e.g.

DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 2

You would now need to take a savepoint and restore sometime in the middle of the day and should be able to check

a) if there are any timers for the very old windows, for which there is still some content lingering around
b) if there are fewer timers after restore for the current window. The missing timers would be recreated as soon as any additional records for the same key arrive within the window. This means the number of missing records might be less than the number of missing timers.

Looking forward to the results!

Cheers,

Konstantin

[1] https://github.com/knaufk/flink/tree/logging-timers
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.6/start/building.html#build-flink

On Tue, Mar 19, 2019 at 2:06 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks, answers below.

> * Which Flink version do you need this for?

1.6.2

> * You use RocksDBStatebackend, correct? If so, which value do you set for "state.backend.rocksdb.timer-service.factory" in the flink-conf.yaml?

Yes, RocksDBStatebackend. We don't set state.backend.rocksdb.timer-service.factory at all, so whatever is the default in Flink 1.6.2? Based on the docs it seems that it would be "heap":
https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/large_state_tuning.html

On Mon, Mar 18, 2019 at 6:26 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I will prepare a Flink branch for you which logs the number of event time timers per window before snapshot and after restore. With this we should be able to check if timers are lost during savepoints.

Two questions:

* Which Flink version do you need this for? 1.6?
* You use RocksDBStatebackend, correct? If so, which value do you set for "state.backend.rocksdb.timer-service.factory" in the flink-conf.yaml?

Cheers,

Konstantin

On Thu, Mar 14, 2019 at 12:20 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

Reading timers from a snapshot doesn't seem straightforward. I wrote in private with Gyula; he gave more suggestions (thanks!), but still it seems that it may be a rather big effort for me to figure it out. Would you be able to help with that? If yes, there's this existing unit test that can be extended to test reading timers:
https://github.com/king/bravo/blob/master/bravo/src/test/java/com/king/bravo/ReducerStateReadingTest.java#L37-L38
The test already has a state with some values in reducer window state, so I'm assuming that it must also contain some window timers.

This is what Gyula wrote to me:

> Maybe I was wrong when I said the createOperatorStateBackendsFromSnapshot is the way to do it.
>
> On a second thought Timers are probably stored as raw keyed state in the operator. I don't remember building any utility to read that.
>
> At the moment I am quite busy with other work so won't have time to build it for you, so you might have to figure it out yourself.
>
> I would try to look at how keyed states are read:
>
> Look at the implementation of: createOperatorStateBackendsFromSnapshot()
>
> Instead of getManagedOperatorState you want to try getRawKeyedState and also look at how Flink restores it internally for Timers.
>
> I would start looking around here I guess:
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/AbstractStreamOperator.java#L238
> https://github.com/apache/flink/blob/e8daa49a593edc401cd44761b25b1324b11be4a6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/StreamTaskStateInitializerImpl.java#L199
On Tue, Mar 12, 2019 at 5:41 PM Gyula Fóra <gyula.fora@gmail.com> wrote:

Should be possible to read timer states by:

OperatorStateReader#createOperatorStateBackendFromSnapshot

Then you have to get the timer state out of the OperatorStateBackend, but keep in mind that this will restore the operator states in memory.

Gyula

On Tue, Mar 12, 2019 at 4:29 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

okay, so it seems that although the watermark passed the end time of the event time windows, the window was not triggered for some of the keys.

The timers, which would trigger the firing of the window, are also part of the keyed state and are snapshotted/restored. I would like to check if timers (as opposed to the window content itself) are maybe lost during the savepoint & restore procedure. Using Bravo, are you also able to inspect the timer state of the savepoints? In particular, I would be interested in whether for two subsequent savepoints all timers (i.e. one timer per window and key, including the missing keys) are present in the savepoint.

@Gyula Fóra: Does Bravo support reading timer state as well?

Cheers,

Konstantin

On Thu, Mar 7, 2019 at 6:41 PM Juho Autio <juho.autio@rovio.com> wrote:

Right, the window operator is the one by the name "DistinctFunction".

http http://10.1.59.75:20888/proxy/application_1551956351667_0001/jobs/3e4ffaadbd84af3488286863f00d4f23/vertices/19ede2f818524a7f310857e537fa6808/metrics\?get\=0.currentInputWatermark,1.currentInputWatermark,2.currentInputWatermark,3.currentInputWatermark,4.currentInputWatermark,5.currentInputWatermark,6.currentInputWatermark,7.currentInputWatermark,8.currentInputWatermark,9.currentInputWatermark,10.currentInputWatermark,11.currentInputWatermark,12.currentInputWatermark,13.currentInputWatermark,14.currentInputWatermark,15.currentInputWatermark,16.currentInputWatermark,17.currentInputWatermark,18.currentInputWatermark,19.currentInputWatermark,20.currentInputWatermark,21.currentInputWatermark,22.currentInputWatermark,23.currentInputWatermark,24.currentInputWatermark,25.currentInputWatermark,26.currentInputWatermark,27.currentInputWatermark,28.currentInputWatermark,29.currentInputWatermark,30.currentInputWatermark,31.currentInputWatermark,32.currentInputWatermark,33.currentInputWatermark,34.currentInputWatermark,35.currentInputWatermark,36.currentInputWatermark,37.currentInputWatermark,38.currentInputWatermark,39.currentInputWatermark,40.currentInputWatermark,41.currentInputWatermark,42.currentInputWatermark,43.currentInputWatermark,44.currentInputWatermark,45.currentInputWatermark,46.currentInputWatermark,47.currentInputWatermark,48.currentInputWatermark,49.currentInputWatermark,50.currentInputWatermark,51.currentInputWatermark,52.currentInputWatermark,53.currentInputWatermark,54.currentInputWatermark,55.currentInputWatermark,56.currentInputWatermark,57.currentInputWatermark,58.currentInputWatermark,59.currentInputWatermark,60.currentInputWatermark,61.currentInputWatermark,62.currentInputWatermark,63.currentInputWatermark,64.currentInputWatermark,65.currentInputWatermark,66.currentInputWatermark,67.currentInputWatermark,68.currentInputWatermark,69.currentInputWatermark,70.currentInputWatermark,71.currentInputWatermark,72.currentInputWatermark,73.currentInputWatermark,74.currentInputWatermark,75.currentInputWatermark,76.currentInputWatermark,77.currentInputWatermark,78.currentInputWatermark,79.currentInputWatermark | jq '.[].value' --raw-output | uniq -c
  80 1551980102743

date -r "$((1551980102743/1000))"
Thu Mar 7 19:35:02 EET 2019

To me that makes sense – how would the window be triggered at all, if not all sub-tasks have a high enough watermark, so that the operator-level watermark can be advanced.
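The 80-entry parameter list above doesn't need to be typed out by hand; a sketch that builds the same query string in plain shell (the WM_PARAMS variable name is ours, not from the thread):

    WM_PARAMS=$(seq 0 79 | sed 's/$/.currentInputWatermark/' | paste -sd, -)
    http "http://10.1.59.75:20888/proxy/application_1551956351667_0001/jobs/3e4ffaadbd84af3488286863f00d4f23/vertices/19ede2f818524a7f310857e537fa6808/metrics?get=$WM_PARAMS" \
      | jq -r '.[].value' | uniq -c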
On Thu, Mar 7, 2019 at 5:33 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

great, we are getting closer :) Could you please check the "Watermarks" tab in the Flink UI of this job and check if the current watermark for all parallel subtasks of the WindowOperator is close to the current date/time?

Best,

Konstantin

On Thu, Mar 7, 2019 at 3:01 PM Juho Autio <juho.autio@rovio.com> wrote:

Wow, indeed the missing data from the previous date is still found in the savepoint!

Actually, what I now found is that there is still data from even older dates in the state:

%%spark
state_json_next_day.groupBy(state_json_next_day.ts.substr(1, 10).alias('day')).count().orderBy('day').show(n=1000)

+----------+--------+
|       day|   count|
+----------+--------+
|2018-08-22|    4206|
..
(manually truncated)
..
|2019-02-03|       4|
|2019-02-14|   12881|
|2019-02-15|    1393|
|2019-02-25|    8774|
|2019-03-06|    9293|
|2019-03-07|28113105|
+----------+--------+

Of course that's the expected situation, after we have learned that some window contents are left untriggered.

I don't have the logs any more, but I think on 2018-08-22 I reset the state, and since then it's always been kept/restored from a savepoint. I can also see some dates there on which I didn't cancel the stream.
But I can't be sure if it has gone through some automatic restart by Flink, so we can't rule out that some window contents wouldn't sometimes also be missed during normal operation. However, savepoint restoration at least makes the problem more prominent. I have previously mentioned that I would suspect this to be some kind of race condition that is affected by load on the cluster. The reason for my suspicion is that during savepoint restoration the cluster is also catching up kafka offsets at full speed, so it is considerably more loaded than usual. Otherwise this problem might not have much to do with savepoints, of course.

Are you able to investigate the problem in Flink code based on this information?

Many thanks,
Juho

On Wed, Mar 6, 2019 at 1:41 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks for the investigation & summary.

As you suggested, I will next take savepoints on two subsequent days & check the reducer state for both days.
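A sketch of that check in the same %%spark style as above. The names are assumptions: state_json_day1/state_json_day2 for the reducer state extracted from the two savepoints (via Bravo, as elsewhere in this thread), missing_ids for the IDs absent from the streaming output, and a hypothetical id column to join on:

    %%spark
    # 0 would mean every missing id is still sitting in the savepoint's window state
    missing_ids.join(state_json_day1, 'id', 'left_anti').count()
    missing_ids.join(state_json_day2, 'id', 'left_anti').count()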
On Wed, Mar 6, 2019 at 1:18 PM Konstantin Knauf wrote:

(Moving the discussion back to the ML)

Hi Juho,

after looking into your code, we are still pretty much in the dark with respect to what is going wrong.

Let me try to summarize what we know given your experiments so far:

1) the lost records were processed and put into state *before* the restart of the job, not afterwards
2) the lost records are part of the state after the restore (because they are contained in subsequent savepoints)
3) the sinks are not the problem (because the metrics of the WindowOperator showed that the missing records have not been sent to the sinks)
4) it is not the batch job used for reference which is wrong, because of 1)
5) records are only lost when restarting from a savepoint (not during normal operations)

One explanation would be that one of the WindowOperators did not fire (for whatever reason) and the missing records are still in the window's state when you run your test. Could you please check whether this is the case, by taking a savepoint on the next day and checking if the missing records are contained in it?

Best,

Konstantin

On Mon, Feb 18, 2019 at 8:32 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin, thanks.

I gathered the additional info as discussed. No surprises there.

> * do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.

Indeed this is the case. I saved the list of all missing IDs, analyzed the savepoint with Bravo, and the savepoint state (already) contained all IDs that were eventually missed in the output.

> * could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> Task Metrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?

The number matches with the output rows. The sum of numRecordsOut metrics was 45755630, and count(*) of the output on s3 resulted in the same number. Batch output has a bit more IDs of course (this time the difference was 1194). You wrote "Is the count reported there correct (no missing data)?" but I have a slightly different viewpoint: I agree that the reported count is correct (in Flink's scope, because the number is the same as what's in the output file). But I think "no missing data" doesn't belong here. Data is missing, but it's consistently missing from both the output files and the numRecordsOut metrics.

Next thing I'll work on is preparing the code to be shared..
Btw, I used this script to count the sum of numRecordsOut (I'm going to look into enabling the Slf4jReporter eventually):

JOB_URL=http://10.1.56.245:20888/proxy/application_1550217512987_0001/jobs/068813ab8e6cbebaf7d306a0f41993c2

DistinctFunctionID=`http $JOB_URL \
  | jq '.vertices[] | select(.name == "DistinctFunction") | .id' --raw-output`
echo "DistinctFunctionID=$DistinctFunctionID"

http $JOB_URL/vertices/$DistinctFunctionID/metrics | jq '.[] | .id' --raw-output \
  | grep "[0-9][0-9]*\.numRecordsOut$" \
  | xargs -I@ sh -c "http GET $JOB_URL/vertices/$DistinctFunctionID/metrics?get=@ | jq '.[0].value' --raw-output" > numRecordsOut.txt

# i.e. eval_math( '+'.join( file.readlines ) )
paste -sd+ numRecordsOut.txt | bc
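The per-metric requests can also be collapsed into a single call, since the REST API accepts a comma-separated list in ?get= (the same pattern as the watermark query earlier in this thread); a sketch reusing the variables above:

    IDS=$(http $JOB_URL/vertices/$DistinctFunctionID/metrics | jq -r '.[].id' | grep '\.numRecordsOut$' | paste -sd, -)
    http GET "$JOB_URL/vertices/$DistinctFunctionID/metrics?get=$IDS" | jq '[.[].value | tonumber] | add'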
On Thu, Feb 14, 2019 at 2:44 PM Konstantin Knauf wrote:

Hi Juho,

>> * does the output of the streaming job contain any data, which is not contained in the batch output?
>
> No.
>
>> * do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.
>
> I haven't built the tooling required to check all IDs like that, but yes, that's my understanding currently. To check that I would need to:
> - kill the stream only once on a given day (so that there's only one savepoint creation & restore)
> - next day or later: save all missing ids from batch output comparison
> - next day or later: read the savepoint with bravo & check that it contains all of those missing IDs
>
> However I haven't built the tooling for that yet. Do you think it's necessary to verify that this assumption holds?

It would be another data point and might help us to track down the problem. Whether it is worth doing depends on the result, i.e. whether the current assumption would be falsified or not, but we only know that in retrospect ;)

>> * could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> Task Metrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?
>
> Is that metric the result of the window trigger? If yes, you must mean that I check the value of that metric on the next day after restore, so that it only contains the count for the output of the previous day's window? The counter is reset to 0 when the job starts (even when state is restored), right?

Yes, this metric would be incremented when the window is triggered. Yes, please check this metric after the window, during which the restore happened, is fired.

If you don't have a MetricsReporter configured so far, I recommend to quickly register a Slf4jReporter to log out all metrics every X seconds (maybe even minutes for your use case): https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/metrics.html#slf4j-orgapacheflinkmetricsslf4jslf4jreporter. Then you don't need to go through the WebUI and can keep a history of the metrics.

> Otherwise, do you have any suggestions for how to instrument the code to narrow down further where the data gets lost? To me it would make sense to proceed with this, because the problem seems hard to reproduce outside of our environment.

Let's focus on checking this metric above – making sure that the WindowOperator is actually emitting fewer records than the overall number of keys in the state, as your experiments suggest – and on sharing the code.
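The reporter registration is a flink-conf.yaml change; a sketch based on the docs page linked above (the reporter name "slf4j" and the 60-second interval are our choices, not from the thread):

    metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter
    metrics.reporter.slf4j.interval: 60 SECONDS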
On Thu, Feb 14, 2019 at 10:57 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

you are right, the problem has actually been narrowed down quite a bit over time. Nevertheless, sharing the code (incl. flink-conf.yaml) might be a good idea. Maybe something strikes the eye that we have not thought about so far. If you don't feel comfortable sharing the code on the ML, feel free to send me a PM.

Besides that, three more questions:

* does the output of the streaming job contain any data, which is not contained in the batch output?
* do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.
* could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> TaskMetrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?

Cheers,

Konstantin

On Wed, Feb 13, 2019 at 3:19 PM Gyula Fóra wrote:

Sorry, not posting on the mail list was my mistake :/

On Wed, 13 Feb 2019 at 15:01, Juho Autio wrote:

Thanks for stepping in, did you post outside of the mailing list on purpose btw?

This I did a long time ago:

> To rule out for good any questions about sink behaviour, the job was killed and started with an additional Kafka sink.
> The same number of ids were missed in both outputs: KafkaSink & BucketingSink.
(I wrote about that on Oct 1, 2018 in this email thread.)

After that I did the savepoint analysis with Bravo.

Currently I'm indeed trying to get suggestions how to debug further, for example, where to add additional kafka output, to catch where the data gets lost. That would probably be somewhere in Flink's internals.

I could try to share the full code also, but IMHO the problem has been quite well narrowed down, considering that the data can be found in the savepoint, the savepoint is successfully restored, and after restoring the data doesn't go to "user code" (like the reducer) any more.

On Wed, Feb 13, 2019 at 3:47 PM Gyula Fóra wrote:

Hi Juho!

I think the reason you are not getting many answers here is because it is very hard to debug this problem remotely. Seemingly you do very normal operations, the state contains all the required data, and nobody else has hit a similar problem for ages.

My best guess would be some bug with the deduplication or output writing logic, but without a complete code example it's very hard to say anything useful. Did you try writing it to Kafka to see if the output is there? (That way we could rule out the dedup problem.)

Cheers,
Gyula

On Wed, Feb 13, 2019 at 2:37 PM Juho Autio wrote:

Stefan (or anyone!), please, could I have some feedback on the findings that I reported on Dec 21, 2018? This is still a major blocker..

On Thu, Jan 31, 2019 at 11:46 AM Juho Autio wrote:

Hello, is there anyone that could help with this?
On Fri, Jan 11, 2019 at 8:14 AM Juho Autio wrote:

Stefan, would you have time to comment?

On Wednesday, January 2, 2019, Juho Autio wrote:

Bump – does anyone know if Stefan will be available to comment the latest findings? Thanks.

On Fri, Dec 21, 2018 at 2:33 PM Juho Autio wrote:

Stefan, I managed to analyze the savepoint with bravo. It seems that the data that's missing from the output *is* found in the savepoint.

I simplified my test case to the following:

- job 1 has been running for ~10 days
- savepoint X created & job 1 cancelled
- job 2 started with restore from savepoint X

Then I waited until the next day so that job 2 has triggered the 24 hour window.

Then I analyzed the output & savepoint:

- compare job 2 output with the output of a batch pyspark script => find 4223 missing rows
- pick one of the missing rows (say, id Z)
- read savepoint X with bravo, filter for id Z => Z was found in the savepoint!

How can it be possible that the value is in state but doesn't end up in the output after the state has been restored & the window is eventually triggered?
I also did a similar analysis on the previous case where I savepointed & restored the job multiple times (5) within the same 24-hour window. A missing id that I drilled down to was found in all of those savepoints, yet missing from the output that gets written at the end of the day. This is even more surprising: the missing ID was written to the new savepoints also after restoring. Is the reducer state somehow decoupled from the window contents?

Big thanks to bravo developer Gyula for guiding me through to be able to read the reducer state! https://github.com/king/bravo/pull/11

Gyula also had an idea for how to troubleshoot the missing data in a scalable way: I could add some "side effect kafka output" on individual operators. This should allow tracking more closely at which point the data gets lost. However, maybe this would have to be in some of Flink's internal components, and I'm not sure which those would be.

Cheers,
Juho

On Mon, Nov 19, 2018 at 11:52 AM Juho Autio wrote:

Hi Stefan,

Bravo doesn't currently support reading a reducer state. I gave it a try but couldn't get to a working implementation yet. If anyone can provide some insight on how to make this work, please share at github: https://github.com/king/bravo/pull/11

Thanks.

On Tue, Oct 23, 2018 at 3:32 PM Juho Autio <juho.autio@rovio.com> wrote:

I was glad to find that bravo had now been updated to support installing bravo to a local maven repo.
I was able to load a checkpoint created by my job, thanks to the example provided in the bravo README, but I'm still missing the essential piece.

My code was:

    OperatorStateReader reader = new OperatorStateReader(env2, savepoint, "DistinctFunction");
    DontKnowWhatTypeThisIs reducingState = reader.readKeyedStates(what should I put here?);

I don't know how to read the values collected from reduce() calls in the state. Is there a way to access the reducing state of the window with bravo? I'm a bit confused how this works, because when I check with a debugger, flink internally uses a ReducingStateDescriptor with name=window-contents, but still, reading operator state for "DistinctFunction" didn't at least throw an exception ("window-contents" threw – obviously there's no operator by that name).

Cheers,
Juho

On Mon, Oct 15, 2018 at 2:25 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Stefan,

Sorry, but it doesn't seem immediately clear to me what's a good way to use https://github.com/king/bravo .

How are people using it? Would you for example modify build.gradle somehow to publish bravo as a library locally/internally? Or add code directly in the bravo project (locally) and run it from there (using an IDE, for example)? Also it doesn't seem like the bravo gradle project supports building a flink job jar, but if it does, how do I do it?

Thanks.
On Thu, Oct 4, 2018 at 9:30 PM Juho Autio <juho.autio@rovio.com> wrote:

Good then, I'll try to analyze the savepoints with Bravo. Thanks!

> How would you assume that backpressure would influence your updates? Updates to each local state still happen event-by-event, in a single reader/writing thread.

Sure, just an ignorant guess by me. I'm not familiar with most of Flink's internals. Anyway, high backpressure is not seen on this job after it has caught up the lag, so I thought it would be worth mentioning.

On Thu, Oct 4, 2018 at 6:24 PM Stefan Richter <s.richter@data-artisans.com> wrote:

Hi,

On 04.10.2018 at 16:08, Juho Autio <juho.autio@rovio.com> wrote:

> > you could take a look at Bravo [1] to query your savepoints and to check if the state in the savepoint is complete w.r.t your expectations
>
> Thanks. I'm not 100% sure if this is the case, but to me it seemed like the missed ids were being logged by the reducer soon after the job had started (after restoring a savepoint). But on the other hand, after that I also made another savepoint & restored that, so what I could check is: does that next savepoint have the missed ids that were logged (a couple of minutes before the savepoint was created, so there should've been more than enough time to add them to the state before the savepoint was triggered) or not.
> Anyway, if I would be able to verify with Bravo that the ids are missing from the savepoint (even though the reducer logged that it saw them), would that help in figuring out where they are lost? Is there some major difference compared to just looking at the final output after the window has been triggered?

I think that makes a difference. For example, you can investigate if there is a state loss or a problem with the windowing. In the savepoint you could see which keys exist and to which windows they are assigned. Also, just to make sure there is no misunderstanding: only elements that are in the state at the start of a savepoint are expected to be part of the savepoint; all elements between start and completion of the savepoint are not expected to be part of the savepoint.

> > I also doubt that the problem is about backpressure after restore, because the job will only continue running after the state restore is already completed.
>
> Yes, I'm not suspecting that the state restoring would be the problem either. My concern was about backpressure possibly messing with the updates of reducing state? I would tend to suspect that updating the state consistently is what fails, where heavy load / backpressure might be a factor.

How would you assume that backpressure would influence your updates? Updates to each local state still happen event-by-event, in a single reader/writing thread.
On Thu, Oct 4, 2018 at 4:18 PM Stefan Richter <s.richter@data-artisans.com> wrote:

Hi,

you could take a look at Bravo [1] to query your savepoints and to check if the state in the savepoint is complete w.r.t your expectations. I somewhat doubt that there is a general problem with the state/savepoints, because many users are successfully running it on a large state and I am not aware of any data loss problems, but nothing is impossible. What the savepoint does is also straightforward: iterate a db snapshot and write all key/value pairs to disk, so all data that was in the db at the time of the savepoint should show up. I also doubt that the problem is about backpressure after restore, because the job will only continue running after the state restore is already completed. Did you check if you are using exactly-once semantics or at-least-once semantics? Also, did you check that the kafka consumer start position is configured properly [2]? Are watermarks generated as expected after restore?

One more unrelated high-level comment that I have: for a granularity of 24h windows, I wonder if it would not make sense to use a batch job instead?

Best,
Stefan

[1] https://github.com/king/bravo
[2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-start-position-configuration

On 04.10.2018 at 14:53, Juho Autio <juho.autio@rovio.com> wrote:

Thanks for the suggestions!
> In general, it would be tremendously helpful to have a minimal working example which allows to reproduce the problem.

Definitely. The problem with reproducing has been that this only seems to happen with the bigger production data volumes.

That's why I'm hoping to find a way to debug this with the production data. With that it seems to consistently cause some misses every time the job is killed/restored.

> check if it happens for shorter windows, like 1h etc

What would be the benefit of that compared to the 24h window?

> simplify the job to not use a reduce window but simply a time window which outputs the window events. Then counting the input and output events should allow you to verify the results. If you are not seeing missing events, then it could have something to do with the reducing state used in the reduce function.

Hm, maybe, but I'm not sure how useful that would be, because it wouldn't yet prove that it's related to reducing: not having a reduce function could also mean a smaller load on the job, which might alone be enough to make the problem not manifest.

Is there a way to debug what goes into the reducing state (including what gets removed or overwritten and what gets restored), if that makes sense..? Maybe some suitable logging could be used to prove that the lost data is written to the reducing state (or at least asked to be written), but not found any more when the window closes and the state is flushed?
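For example, something like this wrapper could make every state update visible (just a sketch on my part, assuming the job's Map<String, String> records; none of this is from the actual job code):

    import java.util.Map;
    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Logs every value that the window asks to be written back into its
    // reducing state. Note: reduce() is only called from the second element
    // per key onwards, so the very first insert per key is not seen here.
    public class LoggingReduceWrapper implements ReduceFunction<Map<String, String>> {
        private static final Logger LOG = LoggerFactory.getLogger(LoggingReduceWrapper.class);
        private final ReduceFunction<Map<String, String>> inner;

        public LoggingReduceWrapper(ReduceFunction<Map<String, String>> inner) {
            this.inner = inner;
        }

        @Override
        public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) throws Exception {
            Map<String, String> result = inner.reduce(value1, value2);
            LOG.debug("reducing state update: in1={} in2={} out={}",
                    value1.get("id"), value2.get("id"), result.get("id"));
            return result;
        }
    }

That still wouldn't show what gets removed or overwritten inside the state backend itself, though, only what the window operator feeds into it.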
On configuration once more: we're using the RocksDB state backend with asynchronous incremental checkpointing. The state is restored from savepoints though, we haven't been using those checkpoints in these tests (although they could be used in case of crashes – but we haven't had those now).

On Thu, Oct 4, 2018 at 3:25 PM Till Rohrmann <trohrmann@apache.org> wrote:

Hi Juho,

another idea to further narrow down the problem could be to simplify the job to not use a reduce window but simply a time window which outputs the window events. Then counting the input and output events should allow you to verify the results. If you are not seeing missing events, then it could have something to do with the reducing state used in the reduce function.

In general, it would be tremendously helpful to have a minimal working example which allows to reproduce the problem.

Cheers,
Till

On Thu, Oct 4, 2018 at 2:02 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

can you try to reduce the job to a minimal reproducible example and share the job and input?

For example:
- some simple records as input, e.g. tuples of primitive types saved as CSV
- minimal deduplication job which processes them and misses records
- check if it happens for shorter windows, like 1h etc
- setup which you use for the job, ideally locally reproducible or cloud
(a skeleton along these lines is sketched below)

Best,
Andrey

On 4 Oct 2018, at 11:13, Juho Autio <juho.autio@rovio.com> wrote:

Sorry to insist, but we seem to be blocked for any serious usage of state in Flink if we can't rely on it to not miss data in case of restore.

Would anyone have suggestions for how to troubleshoot this? So far I have verified with DEBUG logs that our reduce function gets to process also the data that is missing from the window output.

On Mon, Oct 1, 2018 at 11:56 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Andrey,

To rule out for good any questions about sink behaviour, the job was killed and started with an additional Kafka sink.

The same number of ids were missed in both outputs: KafkaSink & BucketingSink.

I wonder what would be the next steps in debugging?
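Picking up Andrey's minimal-repro checklist above, a skeleton along those lines could look like this (entirely illustrative: the file paths, the CSV shape and the 1h window are assumptions, not the actual job):

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class DedupRepro {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

            env.readTextFile("file:///tmp/input.csv")  // lines like "1538640000000,AN12345"
                .map(line -> {
                    String[] parts = line.split(",");
                    return Tuple2.of(Long.parseLong(parts[0]), parts[1]); // (eventTime, id)
                })
                .returns(Types.TUPLE(Types.LONG, Types.STRING))
                .assignTimestampsAndWatermarks(
                    new BoundedOutOfOrdernessTimestampExtractor<Tuple2<Long, String>>(Time.minutes(1)) {
                        @Override
                        public long extractTimestamp(Tuple2<Long, String> element) {
                            return element.f0;
                        }
                    })
                .keyBy(1)                                            // key by the id field
                .window(TumblingEventTimeWindows.of(Time.hours(1)))  // shorter window, per the suggestion
                .reduce((a, b) -> a)                                 // "distinct": keep the first occurrence
                .writeAsText("file:///tmp/output");

            env.execute("dedup-repro");
        }
    }

If ids went missing from the output across a cancel-with-savepoint/restore cycle with something this small, it would be a self-contained reproduction; if not, that points back at something specific to the production setup.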
On Fri, Sep 21, 2018 at 3:49 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks, Andrey.

> so it means that the savepoint does not lose at least some dropped records.

I'm not sure what you mean by that? I mean, it was known from the beginning that not everything is lost before/after restoring a savepoint, just some records around the time of restoration. It's not 100% clear whether records are lost before making a savepoint or after restoring it. Although, based on the new DEBUG logs it seems more like losing some records that are seen ~soon after restoring. It seems like Flink would be somehow confused either about the restored state vs. new inserts to state. This could also be somehow linked to the high back pressure on the kafka source while the stream is catching up.

> If it is feasible for your setup, I suggest to insert one more map function after reduce and before sink.
> etc.

Isn't that the same thing that we discussed before? Nothing is sent to BucketingSink before the window closes, so I don't see how it would make any difference if we replace the BucketingSink with a map function or another sink type. We don't create or restore savepoints during the time when BucketingSink gets input or has open buckets – that happens at a much later time of day. I would focus on figuring out why the records are lost while the window is open. But I don't know how to do that. Would you have any additional suggestions?
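For reference, the probe that Andrey suggests below would amount to roughly this (a sketch; DistinctFunction and the Map<String, String> records are from this thread, everything else is assumed):

    import java.util.Map;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Pass-through map between the window reduce and the sink: its only job is
    // to log each record that the window actually emits when it fires.
    public class WindowOutputProbe implements MapFunction<Map<String, String>, Map<String, String>> {
        private static final Logger LOG = LoggerFactory.getLogger(WindowOutputProbe.class);

        @Override
        public Map<String, String> map(Map<String, String> record) {
            LOG.info("window emitted: {}={}", record.get("field"), record.get("id"));
            return record;
        }
    }

    // wiring: ... .window(...).reduce(new DistinctFunction()).map(new WindowOutputProbe()).addSink(sink);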
On Fri, Sep 21, 2018 at 3:30 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

so it means that the savepoint does not lose at least some dropped records.

If it is feasible for your setup, I suggest to insert one more map function after reduce and before sink. The map function would be called right after the window is triggered but before flushing to s3. The result of reduce (the deduped record) could be logged there. This should allow checking whether the processed distinct records were buffered in the state after the restoration from the savepoint or not. If they were buffered, we should see that there was an attempt to write them to the sink from the state.

Another suggestion is to try to write records to some other sink, or to both. E.g. if you can access the file system of the workers, maybe just into local files, and check whether the records are also dropped there.

Best,
Andrey

On 20 Sep 2018, at 15:37, Juho Autio <juho.autio@rovio.com> wrote:

Hi Andrey!

I was finally able to gather the DEBUG logs that you suggested. In short, the reducer logged that it processed at least some of the ids that were missing from the output.
"At least some", because I didn't have the job running with DEBUG logs for the full 24-hour window period. So I was only able to look up if I can find *some* of the missing ids in the DEBUG logs. Which I did indeed.

I changed DistinctFunction.java to do this:

    @Override
    public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) {
        LOG.debug("DistinctFunction.reduce returns: {}={}", value1.get("field"), value1.get("id"));
        return value1;
    }

Then:

    vi flink-1.6.0/conf/log4j.properties
    log4j.logger.org.apache.flink.streaming.runtime.tasks.StreamTask=DEBUG
    log4j.logger.com.rovio.ds.flink.uniqueid.DistinctFunction=DEBUG

Then I ran the following kind of test:

- Cancelled the on-going job with a savepoint created at ~Sep 18 08:35 UTC 2018
- Started a new cluster & job with DEBUG enabled at ~09:13, restored from that previous cluster's savepoint
- Ran until the offsets had caught up
- Cancelled the job with a new savepoint
- Started a new job _without_ DEBUG, which restored the new savepoint, and let it keep running so that it will eventually write the output
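In CLI terms, each kill/restore cycle above is roughly the following (the savepoint directory, job id, and jar name are placeholders):

    # cancel the job, writing a savepoint first (Flink 1.6 CLI)
    flink cancel -s s3://bucket/savepoints <job-id>

    # start the next job from that savepoint
    flink run -d -s s3://bucket/savepoints/savepoint-XXXX uniqueid-job.jar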
Then on the next day, after results had been flushed when the 24-hour window closed, I compared the results again with a batch version's output. And found some missing ids as usual.

I drilled down to one specific missing id (I'm replacing the actual value with AN12345 below), which was not found in the stream output, but was found in the batch output & flink DEBUG logs.

Related to that id, I gathered the following information:

2018-09-18 ~09:13:21,000 job started & savepoint is restored

2018-09-18 09:14:29,085 missing id is processed for the first time, proved by this log line:
2018-09-18 09:14:29,085 DEBUG com.rovio.ds.flink.uniqueid.DistinctFunction - DistinctFunction.reduce returns: s.aid1=AN12345

2018-09-18 09:15:14,264 first synchronous part of checkpoint
2018-09-18 09:15:16,544 first asynchronous part of checkpoint

(more occurrences of checkpoints (~1 min checkpointing time + ~1 min delay before the next) / more occurrences of DistinctFunction.reduce)

2018-09-18 09:23:45,053 missing id is processed for the last time

2018-09-18 ~10:20:00,000 savepoint created & job cancelled
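The drill-down above boils down to greps along these lines over the taskmanager logs (the id is the placeholder from above):

    # how many times did the reducer see this id after the restore?
    grep -c "DistinctFunction.reduce returns: s.aid1=AN12345" taskmanager.log

    # and when, first and last?
    grep "DistinctFunction.reduce returns: s.aid1=AN12345" taskmanager.log | head -n 1
    grep "DistinctFunction.reduce returns: s.aid1=AN12345" taskmanager.log | tail -n 1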
To be noted, there was high backpressure after restoring from the savepoint until the stream caught up with the kafka offsets. Although, our job assigns timestamps & watermarks on the flink kafka consumer itself, so the event time of all partitions is synchronized. As expected, we don't get any late data in the late data side output.

From this we can see that the missing ids are processed by the reducer, but they must get lost somewhere before the 24-hour window is triggered.

I think it's worth mentioning once more that the stream doesn't miss any ids if we let it run without interruptions / state restoring.

What's next?

On Wed, Aug 29, 2018 at 3:49 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

> only when the 24-hour window triggers, BucketingSink gets a burst of input

This is of course totally true, my understanding is the same. We cannot exclude a problem there for sure, it's just that savepoints are used a lot w/o problem reports and BucketingSink is known to be problematic with s3. That is why I asked you:

> You also wrote that the timestamps of lost events are 'probably' around the time of the savepoint, if it is not yet for sure I would also check it.

Although, the bucketing sink might lose any data at the end of the day (also from the middle).
The fact that it is always around the time of taking a savepoint and not random is surely suspicious, and possible savepoint failures need to be investigated.

Regarding the s3 problem, the s3 doc says:

> The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides 'eventual consistency' for read-after-write.

The algorithm you suggest is how it is roughly implemented now (BucketingSink.openNewPartFile). My understanding is that 'eventual consistency' means that even if you just created a file (its name is the key), it can be that you do not get it in the list, or exists (HEAD) returns false, and you risk rewriting the previous part.

The BucketingSink was designed for a standard file system. s3 is used over a file system wrapper atm but does not always provide normal file system guarantees. See also the last example in [1].

Cheers,
Andrey

[1] https://codeburst.io/quick-explanation-of-the-s3-consistency-model-6c9f325e3f82

On 29 Aug 2018, at 12:11, Juho Autio <juho.autio@rovio.com> wrote:

Andrey, thank you very much for the debugging suggestions, I'll try them.
In the meanwhile, two more questions, please:

> Just to keep in mind this problem with s3 and exclude it for sure, I would also check whether the size of missing events is around the batch size of BucketingSink or not.

Fair enough, but I also want to focus on debugging the most probable subject first. So what do you think about this – true or false: only when the 24-hour window triggers, BucketingSink gets a burst of input. Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either. Isn't this true, or have I totally missed how Flink works in triggering window results? I would not expect there to be any optimization that speculatively triggers early results of a regular time window to the downstream operators.

> The old BucketingSink has in general a problem with s3. Internally BucketingSink queries s3 as a file system to list already written file parts (batches) and determine the index of the next part to start. Due to eventual consistency of checking file existence in s3 [1], the BucketingSink can rewrite a previously written part and basically lose it.

I was wondering what S3's "read-after-write consistency" (mentioned on the page you linked) actually means. It seems that this might be possible:
- LIST keys, find current max index
- choose next index = max + 1
- HEAD next index: if it exists, keep adding +1 until the key doesn't exist on S3

But it definitely sounds easier if a sink keeps track of files in a way that's guaranteed to be consistent.
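To illustrate, here is a rough sketch in Java of the probing scheme being discussed. The class, method and parameter names are invented for the example; this is not the actual BucketingSink code:

    import java.util.function.IntPredicate;

    public final class PartIndexProbe {

        /**
         * Pick the next free part index by probing HEAD until a key is absent.
         * `exists` stands in for an S3 HEAD request, `maxListedIndex` for the
         * max index found by a LIST over the already written parts.
         */
        static int nextPartIndex(int maxListedIndex, IntPredicate exists) {
            int candidate = maxListedIndex + 1;
            while (exists.test(candidate)) { // one HEAD request per candidate key
                candidate++;
            }
            // Hazard under eventual consistency: a part that was just written may
            // not show up in the LIST yet, and HEAD on its key may still report
            // "absent", so the chosen index can collide with an existing part
            // and overwrite it.
            return candidate;
        }
    }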
Cheers,
Juho

On Mon, Aug 27, 2018 at 2:04 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi,

true, StreamingFileSink does not support s3 in 1.6.0; it is planned for the next 1.7 release, sorry for the confusion.

The old BucketingSink has in general a problem with s3. Internally BucketingSink queries s3 as a file system to list already written file parts (batches) and determine the index of the next part to start. Due to eventual consistency of checking file existence in s3 [1], the BucketingSink can rewrite a previously written part and basically lose it. It should be fixed for StreamingFileSink in 1.7, where Flink keeps its own track of written parts and does not rely on s3 as a file system. I also include Kostas, he might add more details.

Just to keep in mind this problem with s3 and exclude it for sure, I would also check whether the size of missing events is around the batch size of BucketingSink or not. You also wrote that the timestamps of lost events are 'probably' around the time of the savepoint; if it is not yet known for sure, I would also check it.

Have you already checked the log files of the job manager and task managers for the job running before and after the restore from the checkpoint? Is everything successful there: no errors, relevant warnings or exceptions?

As the next step, I would suggest logging all encountered events in DistinctFunction.reduce, if possible for production data, and checking whether the missed events are eventually processed before or after the savepoint.
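For example, roughly like this (a sketch based on the DistinctFunction shown further down in this thread, with just a debug log added; whether logging every event is feasible depends on the production volume):

    import java.util.Map;
    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class DistinctFunction implements ReduceFunction<Map<String, String>> {

        private static final Logger LOG = LoggerFactory.getLogger(DistinctFunction.class);

        @Override
        public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) {
            // Log both inputs so missed ids can later be correlated with the
            // savepoint timestamps found in the jobmanager/taskmanager logs.
            LOG.debug("reduce: {} / {}", value1, value2);
            return value1;
        }
    }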
The following log message indicates a border between the events that should be included into the savepoint (logged before) or not:

"{} ({}, synchronous part) in thread {} took {} ms" (template)

Also check if the savepoint has been overall completed:

"{} ({}, asynchronous part) in thread {} took {} ms."

Best,
Andrey

[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html

On 24 Aug 2018, at 20:41, Juho Autio <juho.autio@rovio.com> wrote:

Hi,

Using StreamingFileSink is not a convenient option for production use for us as it doesn't support s3*. I could use StreamingFileSink just to verify, but I don't see much point in doing so. Please consider my previous comment:

> I realized that BucketingSink must not play any role in this problem. This is because only when the 24-hour window triggers, BucketingSink gets a burst of input. Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either (right?).

I could also use a kafka sink instead, but I can't imagine how there could be any difference. It's very real that the sink doesn't get any input for a long time until the 24-hour window closes, and then it quickly writes out everything, because it's not that much data eventually for the distinct values.

Any ideas for debugging what's happening around the savepoint & restoration time?

*) I actually implemented StreamingFileSink as an alternative sink. This was before I came to realize that most likely the sink component has nothing to do with the data loss problem. I tried it with an s3n:// path just to see an exception being thrown. In the source code I indeed then found an explicit check for the target path scheme to be "hdfs://".
On Fri, Aug 24, 2018 at 7:49 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Ok, I think before further debugging the window reduced state, could you try the new 'StreamingFileSink' [1] introduced in Flink 1.6.0 instead of the previous 'BucketingSink'?

Cheers,
Andrey

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html

On 24 Aug 2018, at 18:03, Juho Autio <juho.autio@rovio.com> wrote:

Yes, sorry for my confusing comment. I just meant that it seems like there's a bug somewhere now that the output is missing some data.

> I would wait and check the actual output in s3 because it is the main result of the job

Yes, and that's what I have already done. There seems to be always some data loss with the production data volumes, if the job has been restarted on that day.

Would you have any suggestions for how to debug this further?

Many thanks for stepping in.

On Fri, Aug 24, 2018 at 6:37 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

So it is a per-key deduplication job.

Yes, I would wait and check the actual output in s3 because it is the main result of the job, and

> The late data around the time of taking savepoint might be not included into the savepoint but it should be behind the snapshotted offset in Kafka.

is not a bug, it is a possible behaviour.

The savepoint is a snapshot of the data in transit which has already been consumed from Kafka. Basically the full contents of the window result are split between the savepoint and what can come after the savepoint'ed offset in Kafka but before the window result is written into s3. Allowed lateness should not affect it, I am just saying that the final result in s3 should include all records after it.
This is what should be guaranteed, but not the contents of the intermediate savepoint.

Cheers,
Andrey

On 24 Aug 2018, at 16:52, Juho Autio <juho.autio@rovio.com> wrote:

Thanks for your answer!

I check for the missed data from the final output on s3. So I wait until the next day, then run the same thing re-implemented in batch, and compare the output.

> The late data around the time of taking savepoint might be not included into the savepoint but it should be behind the snapshotted offset in Kafka.

Yes, I would definitely expect that. It seems like there's a bug somewhere.

> Then it should just come later after the restore and should be reduced within the allowed lateness into the final result which is saved into s3.

Well, as far as I know, allowed lateness doesn't play any role here, because I started running the job with allowedLateness=0 and still get the data loss, while my late data output doesn't receive anything.

> Also, is this `DistinctFunction.reduce` just an example or the actual implementation, basically saving just one of the records inside the 24h window in s3? Then what is missing there?

Yes, it's the actual implementation. Note that there's a keyBy before the DistinctFunction. So there's one record for each key (which is the combination of a couple of fields). In practice I've seen that we're missing ~2000-4000 elements on each restore, and the total output is obviously much more than that.

Here's the full code for the key selector:
public class MapKeySelector implements KeySelector<Map<String, String>, Object> {

    private final String[] fields;

    public MapKeySelector(String... fields) {
        this.fields = fields;
    }

    @Override
    public Object getKey(Map<String, String> event) throws Exception {
        Tuple key = Tuple.getTupleClass(fields.length).newInstance();
        for (int i = 0; i < fields.length; i++) {
            key.setField(event.getOrDefault(fields[i], ""), i);
        }
        return key;
    }
}

And a more exact example on how it's used:

.keyBy(new MapKeySelector("ID", "PLAYER_ID", "FIELD", "KEY_NAME", "KEY_VALUE"))
.timeWindow(Time.days(1))
.reduce(new DistinctFunction())

On Fri, Aug 24, 2018 at 5:26 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

Where exactly does the data miss? When do you notice that? Do you check it:
- debugging `DistinctFunction.reduce` right after resume in the middle of the day, or
- do some distinct records miss in the final output of BucketingSink in s3 after the window result is actually triggered and saved into s3 at the end of the day? Is this the main output?

The late data around the time of taking the savepoint might not be included into the savepoint, but it should be behind the snapshotted offset in Kafka. Then it should just come later after the restore and should be reduced within the allowed lateness into the final result which is saved into s3.

Also, is this `DistinctFunction.reduce` just an example or the actual implementation, basically saving just one of the records inside the 24h window in s3? Then what is missing there?

Cheers,
Andrey

On 23 Aug 2018, at 15:42, Juho Autio <juho.autio@rovio.com> wrote:

I changed to allowedLateness=0; no change, still missing data when restoring from savepoint.

On Tue, Aug 21, 2018 at 10:43 AM Juho Autio <juho.autio@rovio.com> wrote:

I realized that BucketingSink must not play any role in this problem. This is because only when the 24-hour window triggers, BucketingSink gets a burst of input.
Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either (right?).

I will next try removing the allowedLateness entirely from the equation.

In the meanwhile, please let me know if you have any suggestions for debugging the lost data, for example what logs to enable.

We use FlinkKafkaConsumer010 btw. Are there any known issues with that which could contribute to lost data when restoring a savepoint?

On Fri, Aug 17, 2018 at 4:23 PM Juho Autio <juho.autio@rovio.com> wrote:

Some data is silently lost on my Flink stream job when state is restored from a savepoint.

Do you have any debugging hints to find out where exactly the data gets dropped?

My job gathers distinct values using a 24-hour window. It doesn't have any custom state management.

When I cancel the job with savepoint and restore from that savepoint, some data is missed. It seems to be losing just a small amount of data. The event time of lost data is probably around the time of the savepoint. In other words, the rest of the time window is not entirely missed – collection works correctly also for (most of the) events that come in after restoring.

When the job processes a full 24-hour window without interruptions it doesn't miss anything.

Usually the problem doesn't happen in test environments that have smaller parallelism and smaller data volumes. But in production volumes the job seems to be consistently missing at least something on every restore.

This issue has consistently happened since the job was initially created. It was at first run on an older version of Flink 1.5-SNAPSHOT and it still happens on both Flink 1.5.2 & 1.6.0.

I'm wondering if this could be for example some synchronization issue between the kafka consumer offsets vs. what's been written by BucketingSink?
1. Job content, simplified

kafkaStream
        .flatMap(new ExtractFieldsFunction())
        .keyBy(new MapKeySelector("ID", "PLAYER_ID", "FIELD", "KEY_NAME", "KEY_VALUE"))
        .timeWindow(Time.days(1))
        .allowedLateness(allowedLateness)
        .sideOutputLateData(lateDataTag)
        .reduce(new DistinctFunction())
        .addSink(sink)
        // use a fixed number of output partitions
        .setParallelism(8);

/**
 * Usage: .keyBy("the", "distinct", "fields").reduce(new DistinctFunction())
 */
public class DistinctFunction implements ReduceFunction<Map<String, String>> {
    @Override
    public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) {
        return value1;
    }
}

2. State configuration

boolean enableIncrementalCheckpointing = true;
String statePath = "s3n://bucket/savepoints";
new RocksDBStateBackend(statePath, enableIncrementalCheckpointing);

Checkpointing Mode: Exactly Once
Interval: 1m 0s
Timeout: 10m 0s
Minimum Pause Between Checkpoints: 1m 0s
Maximum Concurrent Checkpoints: 1
Persist Checkpoints Externally: Enabled (retain on cancellation)

3. BucketingSink configuration

We use BucketingSink, I don't think there's anything special here, if not the fact that we're writing to S3.

String outputPath = "s3://bucket/output";
BucketingSink<Map<String, String>> sink = new BucketingSink<Map<String, String>>(outputPath)
        .setBucketer(new ProcessdateBucketer())
        .setBatchSize(batchSize)
        .setInactiveBucketThreshold(inactiveBucketThreshold)
        .setInactiveBucketCheckInterval(inactiveBucketCheckInterval);
sink.setWriter(new IdJsonWriter());

4. Kafka & event time

My flink job reads the data from Kafka, using a BoundedOutOfOrdernessTimestampExtractor on the kafka consumer to synchronize watermarks across all kafka partitions. We also write late data to a side output, but nothing is written there – if it were, it could explain missed data in the main output (I'm also sure that our late data writing works, because we previously had some actual late data which ended up there).
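For reference, assigning the extractor on the consumer itself looks roughly like this (a sketch only: the topic name, deserialization schema, timestamp field and max out-of-orderness are placeholders, not taken from the actual job):

    import java.util.Map;
    import java.util.Properties;
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    Properties kafkaProperties = new Properties(); // broker/group config omitted

    // "events" and MapDeserializationSchema are placeholders for the real topic
    // and schema.
    FlinkKafkaConsumer010<Map<String, String>> consumer =
            new FlinkKafkaConsumer010<>("events", new MapDeserializationSchema(), kafkaProperties);

    // When the assigner is set on the consumer itself, watermarks are generated
    // per Kafka partition, which keeps event time synchronized across partitions.
    consumer.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<Map<String, String>>(Time.minutes(1)) {
                @Override
                public long extractTimestamp(Map<String, String> event) {
                    return Long.parseLong(event.get("timestamp")); // placeholder field
                }
            });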
5. allowedLateness

It may or may not be relevant that I have also enabled allowedLateness with 1 minute lateness on the 24-hour window. If that makes sense, I could try removing allowedLateness entirely? That would be just to rule out that Flink has a bug related to restoring state in combination with the allowedLateness feature. After all, all of our data should be in a good enough order to not be late, given the max out of orderness used on the kafka consumer timestamp extractor.

Thank you in advance!

--
Juho Autio
Senior Data Engineer

Data Engineering, Games
Rovio Entertainment Corporation
Mobile: + 358 (0)45 313 0122
juho.autio@rovio.com
www.rovio.com
On Tue, Apr 16, 2019 at 2:35 PM Juho Autio <juho.autio@rovio.com> wrote:

Ouch, we have a data loss case now also with ROCKSDB timer service factory. This time the job had failed for some reason & restored checkpoint by itself (I mean I didn't cancel with savepoint this time. Previous restore from savepoint was at 14-04-2019 06:21:45 UTC).

In this case the number of lost ids was quite high:

20190415, missing_rows.count(): 706605

I don't know if the ROCKSDB timer service is a factor towards higher instability, but indeed I'd like to go back to testing with InternalTimerServiceImpl as well. Will switch back to that when the updated branch is available. Also I'm not sure if the cause of data loss is similar now with the ROCKSDB timer service factory (lost timers or maybe something else), because we didn't have corresponding DEBUG logging for this implementation.
On Mon, Apr 15, 2019 at 11:27 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

this is good news indeed! I have had a look at the _metadata files and logs on Friday, and it looks like a) the timer state is contained in the savepoint files and b) the timer state is also initially read by the TaskStateManagerImpl, but it is somehow lost before it reaches the InternalTimerServiceImpl. I will provide an updated version of my branch with more logging output to find the reason for this today or tomorrow. It would be great if you could test this again then.

Best,

Konstantin

On Mon, Apr 15, 2019 at 9:49 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi,

Great news: there's no data loss (for the 3 days so far that were run) with state.backend.rocksdb.timer-service.factory: ROCKSDB.

Each day the job was once cancelled with savepoint & restored.

20190412, missing_rows.count(): 0
20190413, missing_rows.count(): 0
20190414, missing_rows.count(): 0

Btw, now we don't get the DEBUG logs of org.apache.flink.streaming.api.operators.InternalTimerServiceImpl any more, so I didn't know how to check from the logs how many timers are restored. But based on the results I'm assuming that all were successfully restored.

We'll keep testing this a bit more, but it seems really promising indeed. I'm thinking of at least letting it run for some days without cancellations, and on the other hand cancelling many times within the same day, etc.

Can I provide some additional debug logs or such to help find the bug when 'heap' is used for timers? Did you already analyze the _metadata files that I sent?

On Thu, Apr 11, 2019 at 4:21 PM Juho Autio <juho.autio@rovio.com> wrote:

Shared the _metadata files also, in private.

The job is now running with state.backend.rocksdb.timer-service.factory: ROCKSDB. I started it from empty state because I wasn't sure whether this change would be migrated automatically(?). I guess a clean setup like this is a good idea anyway. The first day that is fully processed with this conf will be tomorrow (Friday), and results can be compared on the following day. I'll report back on that on Monday. I verified from the Flink UI that the property is found in Configuration, but I still feel a bit unsure about whether it's actually being used. I wonder if there's some INFO level logging that could be checked to confirm that?

Thanks.


No, it also matters for savepoints. I think the doc here is misleading; it is currently synchronous for all cases of RocksDB keyed state and heap timers.

Best,
Stefan

On 11 Apr 2019, at 14:30, Juho Autio <juho.autio@rovio.com> wrote:

Thanks Till. Anyway, that's irrelevant in case of a savepoint, right?

On Thu, Apr 11, 2019 at 2:54 PM Till Rohrmann <trohrmann@apache.org> wrote:

Hi Juho,

yes, it means that the snapshotting of the timer state does not happen asynchronously but synchronously within the Task executor thread. During this operation, your operator won't make any progress, potentially causing backpressure for upstream operators.

If you want to use fully asynchronous snapshots while also using timer state, you should use the RocksDB backed timers.

Cheers,
Till

On Thu, Apr 11, 2019 at 10:32 AM Juho Autio <juho.autio@rovio.com> wrote:
Ok, I'm testing that state.backend.rocksdb.timer-service.factory: ROCKSDB in the meanwhile.

Btw, what does this actually mean (from https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html):

> The combination RocksDB state backend / with incremental checkpoint / with heap-based timers currently does NOT support asynchronous snapshots for the timers state. Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with FLINK-10026.

Is it just that snapshots are not asynchronous, so they cause some pauses? Does "not supported" here mean just some performance impact, or also correctness?

Our job at hand is using the RocksDB state backend and incremental checkpointing. However, at least the restores that we've been testing here have been from a savepoint, not an incremental checkpoint.

On Wed, Apr 10, 2019 at 4:46 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

one more thing we could try in a separate experiment is to change the timer state backend to RocksDB as well by setting

state.backend.rocksdb.timer-service.factory: ROCKSDB

in the flink-conf.yaml and see if this also leads to the loss of records. That would narrow it down quite a bit.

Best,

Konstantin



On Wed, Apr 10, 2019 at 1:02 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

sorry for the late reply. Please continue to use the custom Flink build and add additional logging for TaskStateManagerImpl by adding the following line to your log4j configuration:

log4j.logger.org.apache.flink.runtime.state.TaskStateManagerImpl=DEBUG

Afterwards, do a couple of savepoints & restores until you see a number of restores < 80 as before, and share the logs with me (at least for TaskStateManagerImpl & InternalTimerServiceImpl).

Best,

Konstantin

On Thu, Apr 4, 2019 at 9:03 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

Thanks for the follow-up.

> There are only 76 lines for restore in Job 3 instead of 80. It would be very useful to know if these lines were lost by the log aggregation or really did not exist.

I fetched the actual taskmanager.log files to verify (we store the original files on s3). Then did a grep for "InternalTimerServiceImpl  - Restored".

This is for "job 1. (start - end) first restore with debug logging":
Around 2019-03-26 09:08:43,352 - 78 hits

This is for "job 3. (start-middle) 3rd restore with debug logging (following day)":
Around 2019-03-27 07:39:06,414 - 76 hits

So yeah, we can rely on our log delivery to Kibana.

Note that, as a new piece of information, I found that the same job also did an automatic restore from checkpoint around 2019-03-30 20:36, and there were 79 hits instead of 80. So it doesn't seem to be only a problem in case of savepoints; it can happen with a checkpoint restore as well.

> Were there any missing records in the output for the day of the Job 1 -> Job 2 transition (26th of March)?

20190326: missing 2592
20190327: missing 4270

This even matches with the fact that on the 26th 2 timers were missed in restore, but on the 27th it was 4.

What's next? :)

On Thu, Apr 4, 2019 at 12:32 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

one thing that makes the log output a little bit hard to analyze is the fact that the "Snapshot" lines include Savepoints as well as Checkpoints. To identify the savepoints, I looked at the last 80 lines per job, which seems plausible given the timestamps of the lines.

So, let's compare the number of timers before and after restore:

Job 1 -> Job 2

23.091.002 event time timers for both. All timers for the same window. So this looks good.

Job 2 -> Job 3

18.565.234 timers during snapshotting. All timers for the same window.
17.636.774 timers during restore. All timers for the same window.

There are only 76 lines for restore in Job 3 instead of 80. It would be very useful to know if these lines were lost by the log aggregation or really did not exist.

Were there any missing records in the output for the day of the Job 1 -> Job 2 transition (26th of March)?

Best,

Konstantin



On Fri, Mar 29, 2019 at 2:21 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks,

I created a zip with these files:

job 1. (start - end) first restore with debug logging
job 2. (start - middle) second restore with debug logging (same day)
job 2. (middle - end) before savepoint & cancel (following day)
job 3. (start - middle) 3rd restore with debug logging (following day)

It can be downloaded here:

On Thu, Mar 28, 2019 at 7:08 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

Yes, the number is the last number in the line. Feel free to share all lines.

Best,

Konstantin
On Thu, Mar 28, 2019 at 5:00 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin!

> I would be interested in any changes in the number of timers, not only the number of logged messages.

Sorry for the delay. I see, the count is the number of timers; it's that last number on the log line. For example for this row it's 270409:

> March 26th 2019, 11:08:39.822 DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl Restored: TimeWindow{start=1553558400000, end=1553644800000} -> 270409

The log lines don't contain a task id – how should they be compared across different snapshots? Or should I share all of these logs (at least a couple of snapshots around the point of restore) and you'll compare them?
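If it helps, the totals could be aggregated without task ids: since the sum of the counts over all subtasks should match between snapshot and restore, a rough sketch like the following could do the comparison (assuming the log format shown above, with the lines of one snapshot or restore concatenated into a single file given as the argument):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.stream.Stream;

    public class TimerCounts {

        private static final Pattern LINE =
                Pattern.compile("(Snapshot|Restored): (TimeWindow\\{[^}]*\\}) -> (\\d+)");

        public static void main(String[] args) throws IOException {
            // Total timer count per (direction, window); task ids are not needed
            // because only the sums are compared.
            Map<String, Long> totals = new TreeMap<>();
            try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
                lines.forEach(line -> {
                    Matcher m = LINE.matcher(line);
                    if (m.find()) {
                        totals.merge(m.group(1) + " " + m.group(2),
                                Long.parseLong(m.group(3)), Long::sum);
                    }
                });
            }
            totals.forEach((k, v) -> System.out.println(k + " -> " + v));
        }
    }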

Thanks.

On Tue, Mar 26, 2019 at 9:55 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I based the branch on top of the current 1.6.4 branch. I can rebase on 1.6.2 for any future iterations. I would be interested in any changes in the number of timers, not only the number of logged messages. The sum of all counts should be the same during snapshotting and restore. While a window is open, this number should always increase (when comparing multiple snapshots).

Best,

Konstantin






On Tue, Mar 26, 2019 at 11:01 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

I got that debug logging working.

> You would now need to take a savepoint and restore sometime in the middle of the day and should be able to check
> a) if there are any timers for the very old windows, for which there is still some content lingering around

No timers for old windows were logged.

All timers are for the same time window, for example:

> March 26th 2019, 11:08:39.822 DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl Restored: TimeWindow{start=1553558400000, end=1553644800000} -> 270409

Those milliseconds correspond to:
Tue Mar 26 00:00:00 UTC 2019 – Wed Mar 27 00:00:00 UTC 2019.
- So this seems normal
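For reference, such window bounds can be double-checked directly from the epoch millis, e.g.:

    import java.time.Instant;

    public class WindowBounds {
        public static void main(String[] args) {
            // Window bounds from the log line above, printed as UTC instants.
            System.out.println(Instant.ofEpochMilli(1553558400000L)); // 2019-03-26T00:00:00Z
            System.out.println(Instant.ofEpochMilli(1553644800000L)); // 2019-03-27T00:00:00Z
        }
    }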
> b) if there are less timers after restore for the current window. The missing timers would be recreated, as soon as any additional records for the same key arrive within the window. This means the number of missing records might be less than the number of missing timers.

Grepping for "Restored" gives 78 hits. That's suspicious, because this job's parallelism is 80. The corresponding grep for "Snapshot" already gives 80 hits. Ok, actually that would match with what you wrote: "missing timers would be recreated, as soon as any additional records for the same key arrive within the window".

I tried killing & restoring once more. This time grepping for "Restored" gives 80 hits. Note that it's possible that some logs had been lost around the time of restoration, because I'm browsing the logs through Kibana (ELK stack).

I will try kill & restore again tomorrow around noon & collect the same info. Is there anything else that you'd like me to share?

By the way, it seems that your branch* is not based on the 1.6.2 release, why so? It probably doesn't matter, but in general it would be good to minimize the scope of changes. But let's roll with this for now, I don't want to build another package, because it seems like we're able to replicate the issue with this version :)

Thanks,
Juho

On Wed, Mar 20, 2019 at 2:20 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I created a branch [1] which logs the number of event time timers per namespace during snapshot and restore. Please refer to [2] to build Flink from sources.

You need to set the logging level to DEBUG for org.apache.flink.streaming.api.operators.InternalTimerServiceImpl. If you use log4j this is a one-liner in your log4j.properties:

log4j.logger.org.apache.flink.streaming.api.operators.InternalTimerServiceImpl=DEBUG

The only additional logs will be the lines added in the branch. The lines are of the following format (<Window> -> <Number of Timers>), e.g.

DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Snapshot: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 1
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589256, end=1553083589258} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589456, end=1553083589458} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589356, end=1553083589358} -> 2
DEBUG org.apache.flink.streaming.api.operators.InternalTimerServiceImpl - Restored: TimeWindow{start=1553083589482, end=1553083589484} -> 2

You would now need to take a savepoint and restore sometime in the middle of the day and should be able to check

a) if there are any timers for the very old windows, for which there is still some content lingering around
b) if there are fewer timers after restore for the current window. The missing timers would be recreated, as soon as any additional records for the same key arrive within the window. This means the number of missing records might be less than the number of missing timers.

Looking forward to the results!

On Tue, Mar 19, 2019 at 2:06 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks, answers below.

> * Which Flink version do you need this for?

1.6.2

> * You use RocksDBStateBackend, correct? If so, which value do you set for "state.backend.rocksdb.timer-service.factory" in the flink-conf.yaml?

Yes, RocksDBStateBackend. We don't set state.backend.rocksdb.timer-service.factory at all, so whatever is the default in Flink 1.6.2? Based on the docs it seems that it would be "heap".
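(For reference, pinning the timer service explicitly would be a one-liner in flink-conf.yaml; a sketch, with the key spelled as quoted above and "heap" being the 1.6.x default:)

# flink-conf.yaml: keep timers in RocksDB instead of the default heap-based timer service
state.backend.rocksdb.timer-service.factory: ROCKSDB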

On Mon, Mar 18, 2019 at 6:26 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

I will prepare a Flink branch for you, which logs the number of event time timers per window before snapshot and after restore. With this we should be able to check if timers are lost during savepoints.

Two questions:

* Which Flink version do you need this for? 1.6?
* You use RocksDBStateBackend, correct? If so, which value do you set for "state.backend.rocksdb.timer-service.factory" in the flink-conf.yaml?

Cheers,

Konstantin



On Thu, Mar 14, 2019 at 12:20 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin,

Reading timers from a snapshot doesn't seem straightforward. I wrote in private with Gyula, and he gave more suggestions (thanks!), but it still seems that it may be a rather big effort for me to figure it out. Would you be able to help with that? If yes, there's this existing unit test that can be extended to test reading timers: https://github.com/king/bravo/blob/master/bravo/src/test/java/com/king/bravo/ReducerStateReadingTest.java#L37-L38 . The test already has a state with some values in reducer window state, so I'm assuming that it must also contain some window timers.

This is what Gyula wrote to me:

> Maybe I was wrong when I said the createOperatorStateBackendsFromSnapshot is the way to do it.
>
> On second thought, timers are probably stored as raw keyed state in the operator. I don't remember building any utility to read that.
>
> At the moment I am quite busy with other work so won't have time to build it for you, so you might have to figure it out yourself.
>
> I would try to look at how keyed states are read:
>
> Look at the implementation of: createOperatorStateBackendsFromSnapshot()
>
> Instead of getManagedOperatorState you want to try getRawKeyedState, and also look at how Flink restores it internally for timers.
>
> I would start looking around here I guess: https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/AbstractStreamOperator.java#L238
>
> https://github.com/apache/flink/blob/e8daa49a593edc401cd44761b25b1324b11be4a6/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/StreamTaskStateInitializerImpl.java#L199


On Tue, Mar 12, 2019 at 5:41 PM Gyula Fóra <gyula.fora@gmail.com> wrote:

It should be possible to read timer states by:
OperatorStateReader#createOperatorStateBackendFromSnapshot

Then you have to get the timer state out of the OperatorStateBackend, but keep in mind that this will restore the operator states in memory.

Gyula


On Tue, Mar 12, 2019 at 4:29 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

okay, so it seems that although the watermark passed the end time of the event time windows, the window was not triggered for some of the keys.

The timers, which would trigger the firing of the window, are also part of the keyed state and are snapshotted/restored. I would like to check if timers (as opposed to the window content itself) are maybe lost during the savepoint & restore procedure. Using Bravo, are you also able to inspect the timer state of the savepoints? In particular, I would be interested in whether, for two subsequent savepoints, all timers (i.e. one timer per window and key, including the missing keys) are present in the savepoint.

@Gyula Fóra: Does Bravo support reading timer state as well?

Cheers,

Konstantin


On Thu, Mar 7, 2019 at 6:41 PM Juho Autio <juho.autio@rovio.com> wrote:

Right, the window operator is the one by the name "DistinctFunction".

http "http://10.1.59.75:20888/proxy/application_1551956351667_0001/jobs/3e4ffaadbd84af3488286863f00d4f23/vertices/19ede2f818524a7f310857e537fa6808/metrics?get=$(seq 0 79 | sed 's/$/.currentInputWatermark/' | paste -sd, -)" \
    | jq '.[].value' --raw-output | uniq -c
  80 1551980102743

date -r "$((1551980102743/1000))"
Thu Mar  7 19:35:02 EET 2019

To me that makes sense – how would the window be triggered at all, if not all sub-tasks have a high enough watermark, so that the operator-level watermark can be advanced.

On Thu, Mar 7, 2019 at 5:33 PM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

great, we are getting closer :) Could you please check the "Watermarks" tab in the Flink UI of this job, and check if the current watermark for all parallel subtasks of the WindowOperator is close to the current date/time?

Best,

Konstantin


On Thu, Mar 7, 2019 at 3:01 PM Juho Autio <juho.autio@rovio.com> wrote:

Wow, indeed the missing data from the previous date is still found in the savepoint!

Actually, what I now found is that there is still data from even older dates in the state:

%%spark
state_json_next_day.groupBy(state_json_next_day.ts.substr(1, 10).alias('day')).count().orderBy('day').show(n=1000)

+----------+--------+
|       day|   count|
+----------+--------+
|2018-08-22|    4206|
..
(manually truncated)
..
|2019-02-03|       4|
|2019-02-14|   12881|
|2019-02-15|    1393|
|2019-02-25|    8774|
|2019-03-06|    9293|
|2019-03-07|28113105|
+----------+--------+

Of course that's the expected situation after we have learned that some window contents are left untriggered.

I don't have the logs any more, but I think on 2018-08-22 I reset the state, and since then it has always been kept/restored from a savepoint. I can also see some dates there on which I didn't cancel the stream, but I can't be sure whether the job went through some automatic restart by Flink. So we can't rule out that some window contents might sometimes also be missed during normal operation; however, savepoint restoration at least makes the problem more prominent. I have previously mentioned that I suspect this to be some kind of race condition that is affected by load on the cluster. The reason for my suspicion is that during savepoint restoration the cluster is also catching up Kafka offsets at full speed, so it is considerably more loaded than usual. Otherwise this problem might not have much to do with savepoints, of course.

Are you able to investigate the problem in Flink code based on this information?

Many thanks,
Juho

On Wed, Mar 6, 2019 at 1:41 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks for the investigation & summary.

As you suggested, I will next take savepoints on two subsequent days & check the reducer state for both days.

On Wed, Mar 6, 2019 at 1:18 PM Konstantin Knauf <konstantin@ververica.com> wrote:

(Moving the discussion back to the ML)

Hi Juho,

after looking into your code, we are still pretty much in the dark with respect to what is going wrong.

Let me try to summarize what we know, given your experiments so far:

1) the lost records were processed and put into state *before* the restart of the job, not afterwards
2) the lost records are part of the state after the restore (because they are contained in subsequent savepoints)
3) the sinks are not the problem (because the metrics of the WindowOperator showed that the missing records have not been sent to the sinks)
4) it is not the batch job used for reference which is wrong, because of 1)
5) records are only lost when restarting from a savepoint (not during normal operations)

One explanation would be that one of the WindowOperators did not fire (for whatever reason) and the missing records are still in the window's state when you run your test. Could you please check whether this is the case, by taking a savepoint on the next day and checking if the missing records are contained in it?

Best,

Konstantin

On Mon, Feb 18, 2019 at 8:32 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Konstantin, thanks.

I gathered the additional info as discussed. No surprises there.

> * do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.

Indeed this is the case. I saved the list of all missing IDs, analyzed the savepoint with Bravo, and the savepoint state (already) contained all IDs that were eventually missing from the output.

> * could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> TaskMetrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?

The number matches the output rows. The sum of the numRecordsOut metrics was 45755630, and count(*) of the output on s3 resulted in the same number. Batch output has a bit more IDs, of course (this time it was 1194). You wrote "Is the count reported there correct (no missing data)?" but I have a slightly different viewpoint: I agree that the reported count is correct (in Flink's scope, because the number is the same as what's in the output file). But I think "no missing data" doesn't belong here. Data is missing, but it's consistently missing from both the output files and the numRecordsOut metrics.

The next thing I'll work on is preparing the code to be shared.

Btw, I used this script to count the sum of numRecordsOut (I'm going to look into enabling the Slf4jReporter eventually):


DistinctFunctionID=`http $JOB_URL | jq '.vertices[] | select(.name == "DistinctFunction") | .id' --raw-output`
echo "DistinctFunctionID=$DistinctFunctionID"

http $JOB_URL/vertices/19ede2f818524a7f310857e537fa6808/metrics | jq '.[] | .id' --raw-output | grep "[0-9][0-9]*\\.numRecordsOut$" \
    | xargs -I@ sh -c "http GET $JOB_URL/vertices/19ede2f818524a7f310857e537fa6808/metrics?get=@ | jq '.[0].value' --raw-output" > numRecordsOut.txt

# i.e. "eval_math('+'.join(file.readlines))" – sum all the per-subtask counts
paste -sd+ numRecordsOut.txt | bc

Hi Juho,

>> * does the output of the streaming job contain any data, which is not contained in the batch output?
>
> No.
>
>> * do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.
>
> I haven't built the tooling required to check all IDs like that, but yes, that's my understanding currently. To check that I would need to:
> - kill the stream only once on a given day (so that there's only one savepoint creation & restore)
> - next day or later: save all missing ids from the batch output comparison
> - next day or later: read the savepoint with bravo & check that it contains all of those missing IDs
>
> However I haven't built the tooling for that yet. Do you think it's necessary to verify that this assumption holds?

It would be another data point and might help us to track down the problem. Whether it is worth doing depends on the result, i.e. whether the current assumption would be falsified or not, but we only know that in retrospect ;)

>> * could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> TaskMetrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?
>
> Is that metric the result of the window trigger? If yes, you must mean that I should check the value of that metric on the next day after the restore, so that it only contains the count for the output of the previous day's window? The counter is reset to 0 when the job starts (even when state is restored), right?

Yes, this metric would be incremented when the window is triggered. And yes, please check this metric after the window, during which the restore happened, has fired.

If you don't have a MetricsReporter configured so far, I recommend quickly registering a Slf4jReporter to log out all metrics every X seconds (maybe even minutes for your use case): https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/metrics.html#slf4j-orgapacheflinkmetricsslf4jslf4jreporter. Then you don't need to go through the WebUI and can keep a history of the metrics.
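(A minimal sketch of that reporter setup in flink-conf.yaml, with names per the linked docs; the 60-second interval is just an example value:)

metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter
metrics.reporter.slf4j.interval: 60 SECONDS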

> Otherwise, do you have any suggestions for how to instrument the code to narrow down further where the data gets lost? To me it would make sense to proceed with this, because the problem seems hard to reproduce outside of our environment.

Let's focus on checking this metric above, to make sure that the WindowOperator is actually emitting fewer records than the overall number of keys in the state, as your experiments suggest, and on sharing the code.

On Thu, Feb 14, 2019 at 10:57 AM Konstantin Knauf <konstantin@ververica.com> wrote:

Hi Juho,

you are right, the problem has actually been narrowed down quite a bit over time. Nevertheless, sharing the code (incl. flink-conf.yaml) might be a good idea. Maybe something strikes the eye that we have not thought about so far. If you don't feel comfortable sharing the code on the ML, feel free to send me a PM.

Besides that, three more questions:

* does the output of the streaming job contain any data which is not contained in the batch output?
* do you know if all lost records are contained in the last savepoint you took before the window fired? This would mean that no records are lost after the last restore.
* could you please check the numRecordsOut metric for the WindowOperator (FlinkUI -> TaskMetrics -> Select TaskChain containing WindowOperator -> find metric)? Is the count reported there correct (no missing data)?

Cheers,
Konstantin




On Wed, Feb 13, 2019 at 3:19 PM Gyula Fóra <gyula.fora@gmail.com> wrote:

Sorry, not posting on the mailing list was my mistake :/

On Wed, 13 Feb 2019 at 15:01, Juho Autio <juho.autio@rovio.com> wrote:

Thanks for stepping in. Did you post outside of the mailing list on purpose, btw?

This I did a long time ago:

> To rule out for good any questions about sink behaviour, the job was killed and started with an additional Kafka sink.
> The same number of ids were missed in both outputs: KafkaSink & BucketingSink.

(I wrote about that on Oct 1, 2018 in this email thread.)

After that I did the savepoint analysis with Bravo.

Currently I'm indeed trying to get suggestions on how to debug further, for example where to add additional Kafka output to catch where the data gets lost. That would probably be somewhere in Flink's internals.

I could try to share the full code too, but IMHO the problem has been quite well narrowed down, considering that the data can be found in the savepoint, the savepoint is successfully restored, and after restoring the data doesn't go to "user code" (like the reducer) any more.

On Wed, Feb 13, 2019 at 3:47 PM Gyula Fóra <gyula.fora@gmail.com> wrote:

Hi Juho!

I think the reason you are not getting many answers here is that it is very hard to debug this problem remotely.
Seemingly you do very normal operations, the state contains all the required data, and nobody else has hit a similar problem for ages.

My best guess would be some bug in the deduplication or output writing logic, but without a complete code example it's very hard to say anything useful.
Did you try writing the output to Kafka to see if it is there? (That way we could rule out the dedup problem.)

Cheers,
Gyula

On Wed, Feb 13, 2019 at 2:37 PM Juho Autio <juho.autio@rovio.com> wrote:

Stefan (or anyone!), please, could I have some feedback on the findings that I reported on Dec 21, 2018? This is still a major blocker.

On Thu, Jan 31, 2019 at 11:46 AM Juho Autio <juho.autio@rovio.com> wrote:

Hello, is there anyone that could help with this?

On Fri, Jan 11, 2019 at 8:14 AM Juho Autio <juho.autio@rovio.com> wrote:

Stefan, would you have time to comment?

On Wednesday, January 2, 2019, Juho Autio <juho.autio@rovio.com> wrote:

Bump – does anyone know if Stefan will be available to comment on the latest findings? Thanks.

On Fri, Dec 21, 2018 at 2:33 PM Juho Autio <juho.autio@rovio.com> wrote:

Stefan, I managed to analyze the savepoint with bravo. It seems that the data that's missing from the output is found in the savepoint.

I simplified my test case to the following:

- job 1 had been running for ~10 days
- savepoint X created & job 1 cancelled
- job 2 started with restore from savepoint X

Then I waited until the next day so that job 2 had triggered the 24-hour window.

Then I analyzed the output & savepoint:

- compare job 2 output with the output of a batch pyspark script => find 4223 missing rows
- pick one of the missing rows (say, id Z)
- read savepoint X with bravo, filter for id Z => Z was found in the savepoint!

How can it be possible that the value is in state but doesn't end up in the output after the state has been restored & the window is eventually triggered?

I also did a similar analysis on the previous case, where I savepointed & restored the job multiple times (5) within the same 24-hour window. A missing id that I drilled down to was found in all of those savepoints, yet missing from the output that gets written at the end of the day. This is even more surprising: the missing ID was written to the new savepoints also after restoring. Is the reducer state somehow decoupled from the window contents?

Big thanks to bravo developer Gyula for guiding me through to be able to read the reducer state! https://github.com/king/bravo/pull/11

Gyula also had an idea for how to troubleshoot the missing data in a scalable way: I could add some "side effect kafka output" on individual operators. This should allow tracking more closely at which point the data gets lost. However, maybe this would have to be in some of Flink's internal components, and I'm not sure which those would be.
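(A rough sketch of what such a side-effect output could look like at the user-code level; hypothetical names throughout, and FlinkKafkaProducer010 only because the job already uses the 0.10 Kafka connector. Tapping Flink's internals would of course look different.)

import java.util.Map;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;

public class DebugTap {
    /** Mirrors the id of every record passing this point in the pipeline to a debug Kafka topic. */
    public static void tap(DataStream<Map<String, String>> stream, String stage, Properties kafkaProps) {
        stream
            // "id" is the record field the thread tracks; adjust to the actual schema
            .map(record -> stage + ": " + record.get("id"))
            .addSink(new FlinkKafkaProducer010<>("debug-trace", new SimpleStringSchema(), kafkaProps));
    }
}

(Calling DebugTap.tap(...) before and after the window operator would then show at which stage a given id disappears.)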

Cheers,
Juho

On Mon, Nov 19, 2018 at 11:52 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Stefan,

Bravo doesn't currently support reading a reducer state. I gave it a try but couldn't get to a working implementation yet. If anyone can provide some insight on how to make this work, please share at github:

Thanks.

On Tue, Oct 23, 2018 at 3:32 PM Juho Autio <juho.autio@rovio.com> wrote:

I was glad to find that bravo had now been updated to support installing bravo to a local maven repo.

I was able to load a checkpoint created by my job, thanks to the example provided in the bravo README, but I'm still missing the essential piece.

My code was:

        OperatorStateReader reader = new OperatorStateReader(env2, savepoint, "DistinctFunction");
        DontKnowWhatTypeThisIs reducingState = reader.readKeyedStates(what should I put here?);

I don't know how to read the values collected from reduce() calls in the state. Is there a way to access the reducing state of the window with bravo? I'm a bit confused about how this works, because when I check with a debugger, Flink internally uses a ReducingStateDescriptor with name=window-contents, but still, reading operator state for "DistinctFunction" didn't at least throw an exception ("window-contents" threw – obviously there's no operator by that name).

Cheers,
Juho

On Mon, Oct 15, 2018 at 2:25 PM Juho Autio <juho.autio@rovio.com> wrote:

Hi Stefan,

Sorry, but it doesn't seem immediately clear to me what's a good way to use https://github.com/king/bravo.

How are people using it? Would you for example modify build.gradle somehow to publish bravo as a library locally/internally? Or add code directly to the bravo project (locally) and run it from there (using an IDE, for example)? Also, it doesn't seem like the bravo gradle project supports building a Flink job jar, but if it does, how do I do it?

Thanks.

On Thu, Oct 4, 2018 at 9:30 PM Juho Autio <juho.autio@rovio.com> wrote:

Good then, I'll try to analyze the savepoints with Bravo. Thanks!

> How would you assume that backpressure would influence your updates? Updates to each local state still happen event-by-event, in a single reader/writing thread.

Sure, just an ignorant guess by me; I'm not familiar with most of Flink's internals. Anyway, high backpressure is not seen on this job after it has caught up the lag, so that's why I thought it would be worth mentioning.

On Thu, Oct 4, 2018 at 6:24 PM Stefan Richter <s.richter@data-artisans.com> wrote:

Hi,

On 04.10.2018, at 16:08, Juho Autio <juho.autio@rovio.com> wrote:

>> you could take a look at Bravo [1] to query your savepoints and to check if the state in the savepoint is complete w.r.t your expectations
>
> Thanks. I'm not 100% sure if this is the case, but to me it seemed like the missed ids were being logged by the reducer soon after the job had started (after restoring a savepoint). But on the other hand, after that I also made another savepoint & restored that, so what I could check is: does that next savepoint have the missed ids that were logged (a couple of minutes before the savepoint was created, so there should've been more than enough time to add them to the state before the savepoint was triggered) or not. Anyway, if I were able to verify with Bravo that the ids are missing from the savepoint (even though the reducer logged that it saw them), would that help in figuring out where they are lost? Is there some major difference compared to just looking at the final output after the window has been triggered?

I think that makes a difference. For example, you can investigate if there is a state loss or a problem with the windowing. In the savepoint you could see which keys exist and to which windows they are assigned. Also, just to make sure there is no misunderstanding: only elements that are in the state at the start of a savepoint are expected to be part of the savepoint; all elements between start and completion of the savepoint are not expected to be part of the savepoint.

>> I also doubt that the problem is about backpressure after restore, because the job will only continue running after the state restore is already completed.
>
> Yes, I'm not suspecting that the state restoring would be the problem either. My concern was about backpressure possibly messing with the updates of the reducing state. I would tend to suspect that updating the state consistently is what fails, where heavy load / backpressure might be a factor.

How would you assume that backpressure would influence your updates? Updates to each local state still happen event-by-event, in a single reader/writing thread.


On Thu, Oct 4, 2018 at 4:18 PM Stefan Richter <s.richter@data-artisans.com> wrote:

Hi,

you could take a look at Bravo [1] to query your savepoints and to check if the state in the savepoint is complete w.r.t your expectations. I somewhat doubt that there is a general problem with the state/savepoints, because many users are successfully running it on large state and I am not aware of any data loss problems, but nothing is impossible. What the savepoint does is also straightforward: iterate a db snapshot and write all key/value pairs to disk, so all data that was in the db at the time of the savepoint should show up. I also doubt that the problem is about backpressure after restore, because the job will only continue running after the state restore is already completed. Did you check if you are using exactly-once semantics or at-least-once semantics? Also, did you check that the Kafka consumer start position is configured properly [2]? Are watermarks generated as expected after restore?

One more unrelated high-level comment that I have: for a granularity of 24h windows, I wonder if it would not make more sense to use a batch job instead?

Best,
Stefan

[1] https://github.com/king/bravo

On 04.10.2018, at 14:53, Juho Autio <juho.autio@rovio.com> wrote:

Thanks for the suggestions!

> In general, it would be tremendously helpful to have a minimal working example which allows to reproduce the problem.

Definitely. The problem with reproducing has been that this only seems to happen with the bigger production data volumes.

That's why I'm hoping to find a way to debug this with the production data. With that, it seems to consistently cause some misses every time the job is killed/restored.

> check if it happens for shorter windows, like 1h etc

What would be the benefit of that compared to the 24h window?

> simplify the job to not use a reduce window but simply a time window which outputs the window events. Then counting the input and output events should allow you to verify the results. If you are not seeing missing events, then it could have something to do with the reducing state used in the reduce function.

Hm, maybe, but I'm not sure how useful that would be, because it wouldn't yet prove that it's related to reducing: not having a reduce function could also mean a smaller load on the job, which might alone be enough to make the problem not manifest.

Is there a way to debug what goes into the reducing state (including what gets removed or overwritten and what gets restored), if that makes sense? Maybe some suitable logging could be used to prove that the lost data is written to the reducing state (or at least asked to be written), but not found any more when the window closes and the state is flushed?

On configuration once more: we're using the RocksDB state backend with asynchronous incremental checkpointing. The state is restored from savepoints though; we haven't been using those checkpoints in these tests (although they could be used in case of crashes – but we haven't had those now).

On Thu, Oct 4, 2018 at 3:25 PM Till Rohrmann <trohrmann@apache.org> wrote:

Hi Juho,

another idea to further narrow down the problem could be to simplify the job to not use a reduce window but simply a time window which outputs the window events. Then counting the input and output events should allow you to verify the results. If you are not seeing missing events, then it could have something to do with the reducing state used in the reduce function.

In general, it would be tremendously helpful to have a minimal working example which allows to reproduce the problem.

Cheers,
Till

On Thu, Oct 4, 2018 at 2:02 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

can you try to reduce the job to a minimal reproducible example and share the job and input?

For example:
- some simple records as input, e.g. tuples of primitive types saved as csv
- a minimal deduplication job which processes them and misses records
- check if it happens for shorter windows, like 1h etc
- the setup which you use for the job, ideally locally reproducible or cloud

Best,
Andrey

On 4 Oct 2018, at 11:13, Juho Autio <juho.autio@rovio.com> wrote:

Sorry to insist, but we seem to be blocked from any serious usage of state in Flink if we can't rely on it to not miss data in case of restore.

Would anyone have suggestions for how to troubleshoot this? So far I have verified with DEBUG logs that our reduce function gets to process also the data that is missing from the window output.

On Mon, Oct 1, 2018 at 11:56 AM Juho Autio <juho.autio@rovio.com> wrote:

Hi Andrey,

To rule out for good any questions about sink behaviour, the job was killed and started with an additional Kafka sink.

The same number of ids were missed in both outputs: KafkaSink & BucketingSink.

I wonder what would be the next steps in debugging?
On Fri, Sep 21, 2018 at 3:49 PM Juho Autio <juho.autio@rovio.com> wrote:

Thanks, Andrey.

> so it means that the savepoint does not lose at least some dropped records.

I'm not sure what you mean by that? I mean, it was known from the beginning that not everything is lost before/after restoring a savepoint, just some records around the time of restoration. It's not 100% clear whether records are lost before making a savepoint or after restoring it. Although, based on the new DEBUG logs, it seems more like losing some records that are seen ~soon after restoring. It seems like Flink would be somehow confused either about the restored state vs. new inserts to state. This could also be somehow linked to the high back pressure on the Kafka source while the stream is catching up.

> If it is feasible for your setup, I suggest to insert one more map function after reduce and before sink.
> etc.

Isn't that the same thing that we discussed before? Nothing is sent to BucketingSink before the window closes, so I don't see how it would make any difference if we replace the BucketingSink with a map function or another sink type. We don't create or restore savepoints during the time when BucketingSink gets input or has open buckets – that happens at a much later time of day. I would focus on figuring out why the records are lost while the window is open. But I don't know how to do that. Would you have any additional suggestions?

On Fri, Sep 21, 2018 at 3:30 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

so it means that the savepoint does not lose at least some dropped records.

If it is feasible for your setup, I suggest inserting one more map function after reduce and before the sink (a sketch follows below).
The map function should be called right after the window is triggered but before flushing to s3.
The result of reduce (the deduped record) could be logged there.
This should allow checking whether the processed distinct records were buffered in the state after the restoration from the savepoint or not. If they were buffered, we should see that there was an attempt to write them to the sink from the state.
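(A minimal sketch of such a map function, assuming the job's Map<String, String> records and the "field"/"id" keys mentioned elsewhere in this thread:)

import java.util.Map;

import org.apache.flink.api.common.functions.MapFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Identity map placed between reduce() and the sink; logs every record the window emits. */
public class TraceFunction implements MapFunction<Map<String, String>, Map<String, String>> {
    private static final Logger LOG = LoggerFactory.getLogger(TraceFunction.class);

    @Override
    public Map<String, String> map(Map<String, String> record) {
        LOG.info("window emitted: {}={}", record.get("field"), record.get("id"));
        return record;
    }
}

(It would be wired in as .reduce(new DistinctFunction()).map(new TraceFunction()).addSink(sink).)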

Another suggestion is to try to write records to some other sink, or to both.
E.g. if you can access the file system of the workers, maybe just into local files, and check whether the records are also dropped there.

Best,
Andrey

On 20 Sep 2018, at 15:37, Juho Autio <juho.autio@rovio.com> wrote:

Hi Andrey!

I was finally able to gather the DEBUG logs that you suggested. In short, the reducer logged that it processed at least some of the ids that were missing from the output.

"At least some", because I didn't have the job running with DEBUG logs for the full 24-hour window period. So I was only able to look up whether I could find some of the missing ids in the DEBUG logs. Which I did indeed.

I changed DistinctFunction.java to do this:

    @Override
    public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) {
        LOG.debug("DistinctFunction.reduce returns: {}={}", value1.get("field"), value1.get("id"));
        return value1;
    }

Then:

vi flink-1.6.0/conf/log4j.properties
log4j.logger.org.apache.flink.streaming.runtime.tasks.StreamTask=DEBUG
log4j.logger.com.rovio.ds.flink.uniqueid.DistinctFunction=DEBUG

Then I ran the following kind of test:

- Cancelled the on-going job with a savepoint created at ~Sep 18 08:35 UTC 2018
- Started a new cluster & job with DEBUG enabled at ~09:13, restored from that previous cluster's savepoint
- Ran until the offsets were caught up
- Cancelled the job with a new savepoint
- Started a new job _without_ DEBUG, which restored the new savepoint, and let it keep running so that it would eventually write the output

Then on the next day, after the results had been flushed when the 24-hour window closed, I compared the results again with a batch version's output. And found some missing ids, as usual.

I drilled down to one specific missing id (I'm replacing the actual value with AN12345 below), which was not found in the stream output, but was found in the batch output & Flink DEBUG logs.

Related to that id, I gathered the following information:

2018-09-18 ~09:13:21,000 job started & savepoint is restored

2018-09-18 09:14:29,085 missing id is processed for the first time, proved by this log line:
2018-09-18 09:14:29,085 DEBUG com.rovio.ds.flink.uniqueid.DistinctFunction - DistinctFunction.reduce returns: s.aid1=AN12345

2018-09-18 09:15:14,264 first synchronous part of checkpoint
2018-09-18 09:15:16,544 first asynchronous part of checkpoint

(
more occurrences of checkpoints (~1 min checkpointing time + ~1 min delay before next)
/
more occurrences of DistinctFunction.reduce
)

2018-09-18 09:23:45,053 missing id is processed for the last time

2018-09-18 ~10:20:00,000 savepoint created & job cancelled

To be noted: there was high backpressure after restoring from the savepoint until the stream caught up with the Kafka offsets. However, our job assigns timestamps & watermarks on the Flink Kafka consumer itself, so the event time of all partitions is synchronized. As expected, we don't get any late data in the late data side output.

From this we can see that the missing ids are processed by the reducer, but they must get lost somewhere before the 24-hour window is triggered.

I think it's worth mentioning once more that the stream doesn't miss any ids if we let it run without interruptions / state restoring.

What's next?








On 29 Aug 2018, at 12:11, Juho Autio <juho.autio@rovio.com> wrote:

Andrey, thank you very much for the debugging suggestions, I'll try them.

In the meanwhile, two more questions, please:

> Just to keep in mind this problem with s3 and exclude it for sure, I would also check whether the size of missing events is around the batch size of BucketingSink or not.

Fair enough, but I also want to focus on debugging the most probable subject first. So what do you think about this – true or false: only when the 24-hour window triggers does BucketingSink get a burst of input. Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either. Isn't this true, or have I totally missed how Flink works in triggering window results? I would not expect there to be any optimization that speculatively triggers early results of a regular time window to the downstream operators.

> The old BucketingSink has in general a problem with s3. Internally BucketingSink queries s3 as a file system to list already written file parts (batches) and determine the index of the next part to start. Due to eventual consistency of checking file existence in s3 [1], the BucketingSink can rewrite the previously written part and basically lose it.

I was wondering what S3's "read-after-write consistency" (mentioned on the page you linked) actually means. It seems that this might be possible:
- LIST keys, find current max index
- choose next index = max + 1
- HEAD next index: if it exists, keep adding +1 until the key doesn't exist on S3

But it definitely sounds easier if a sink keeps track of files in a way that's guaranteed to be consistent.
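(To illustrate the probing pattern being described, a hypothetical simplification, not BucketingSink's actual code:)

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class PartIndexProbe {
    /** Probe for the next free part index as sketched in the list above. */
    static int nextPartIndex(FileSystem fs, Path bucket, int maxListedIndex) throws IOException {
        int next = maxListedIndex + 1;
        // Under S3's eventual consistency for LIST/HEAD, a stale "not found"
        // here can hand out an index that is already in use, so the existing
        // part would be silently overwritten.
        while (fs.exists(new Path(bucket, "part-" + next))) {
            next++;
        }
        return next;
    }
}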
<= /div>
Cheers,
Juho

On Mon, Aug 27, 2018 at 2:04 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi,

true, StreamingFileSink does not support s3 in 1.6.0, it is planned for the next 1.7 release, sorry for the confusion.

The old BucketingSink has in general a problem with s3. Internally BucketingSink queries s3 as a file system to list already written file parts (batches) and determine the index of the next part to start. Due to eventual consistency of checking file existence in s3 [1], the BucketingSink can rewrite the previously written part and basically lose it. It should be fixed for StreamingFileSink in 1.7, where Flink keeps its own track of written parts and does not rely on s3 as a file system. I also include Kostas, he might add more details.

Just to keep in mind this problem with s3 and exclude it for sure, I would also check whether the size of missing events is around the batch size of BucketingSink or not. You also wrote that the timestamps of lost events are 'probably' around the time of the savepoint; if it is not yet for sure, I would also check it.

Have you already checked the log files of the job manager and task managers for the job running before and after the restore from the checkpoint? Is everything successful there, no errors, relevant warnings or exceptions?

As the next step, I would suggest to log all encountered events in DistinctFunction.reduce, if possible for production data, and check whether the missed events are eventually processed before or after the savepoint. The following log message indicates a border between the events that should be included into the savepoint (logged before) or not:

"{} ({}, synchronous part) in thread {} took {} ms" (template)

Also check if the savepoint has been overall completed:

"{} ({}, asynchronous part) in thread {} took {} ms."

Best,
Andrey

[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html

On 24 Aug 2018, at 20:41, Juho Autio <juho.autio@rovio.com> wrote:

Hi,

Using StreamingFileSink is not a convenient option for production use for us, as it doesn't support s3*. I could use StreamingFileSink just to verify, but I don't see much point in doing so. Please consider my previous comment:

> I realized that BucketingSink must not play any role in this problem. This is because only when the 24-hour window triggers, BucketingSink gets a burst of input. Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either (right?).

I could also use a kafka sink instead, but I can't imagine how there could be any difference. It's very real that the sink doesn't get any input for a long time until the 24-hour window closes, and then it quickly writes out everything, because it's not that much data eventually for the distinct values.

Any ideas for debugging what's happening around the savepoint & restoration time?

*) I actually implemented StreamingFileSink as an alternative sink. This was before I came to realize that most likely the sink component has nothing to do with the data loss problem. I tried it with an s3n:// path just to see an exception being thrown. In the source code I indeed then found an explicit check for the target path scheme to be "hdfs://".

On Fri, Aug 24, 2018 at 7:49 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Ok, I think before further debugging the window reduced state, could you try the new 'StreamingFileSink' [1] introduced in Flink 1.6.0 instead of the previous 'BucketingSink'?

Cheers,
Andrey

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html

On 24 Aug 2018, at 18:03, Juho Autio <juho.autio@rovio.com> wrote:

Yes, sorry for my confusing comment. I just meant that it seems like there's a bug somewhere, now that the output is missing some data.

> I would wait and check the actual output in s3 because it is the main result of the job

Yes, and that's what I have already done. There seems to be always some data loss with the production data volumes, if the job has been restarted on that day.

Would you have any suggestions for how to debug this further?

Many thanks for stepping in.

On Fri, Aug 24, 2018 at 6:37 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

So it is a per-key deduplication job.

Yes, I would wait and check the actual output in s3, because it is the main result of the job, and

> The late data around the time of taking savepoint might be not included into the savepoint but it should be behind the snapshotted offset in Kafka.

is not a bug, it is a possible behaviour.

The savepoint is a snapshot of the data in transit which has already been consumed from Kafka. Basically the full contents of the window result are split between the savepoint and what can come after the savepoint'ed offset in Kafka but before the window result is written into s3.

Allowed lateness should not affect it, I am just saying that the final result in s3 should include all records after it. This is what should be guaranteed, but not the contents of the intermediate savepoint.

Cheers,
Andrey

On 24 Aug 2018, at 16:52, Juho Autio <juho.autio@rovio.com> wrote:

Thanks for your answer!

I check for the missed data from the final output on s3. So I wait until the next day, then run the same thing re-implemented in batch, and compare the output.

> The late data around the time of taking savepoint might be not included into the savepoint but it should be behind the snapshotted offset in Kafka.

Yes, I would definitely expect that. It seems like there's a bug somewhere.

> Then it should just come later after the restore and should be reduced within the allowed lateness into the final result which is saved into s3.

Well, as far as I know, allowed lateness doesn't play any role here, because I started running the job with allowedLateness=0, and still get the data loss, while my late data output doesn't receive anything.

> Also, is this `DistinctFunction.reduce` just an example or the actual implementation, basically saving just one of the records inside the 24h window in s3? then what is missing there?

Yes, it's the actual implementation. Note that there's a keyBy before the DistinctFunction, so there's one record for each key (which is the combination of a couple of fields). In practice I've seen that we're missing ~2000-4000 elements on each restore, and the total output is obviously much more than that.

Here's the full code for the key selector:

public class MapKeySelector implements KeySelector<Map<String,String>, Object> {

    private final String[] fields;

    public MapKeySelector(String... fields) {
        this.fields = fields;
    }

    @Override
    public Object getKey(Map<String, String> event) throws Exception {
        Tuple key = Tuple.getTupleClass(fields.length).newInstance();
        for (int i = 0; i < fields.length; i++) {
            key.setField(event.getOrDefault(fields[i], ""), i);
        }
        return key;
    }
}

And a more exact example on how it's used:

                .keyBy(new MapKeySelector("ID", "PLAYER_ID", "FIELD", "KEY_NAME", "KEY_VALUE"))
                .timeWindow(Time.days(1))
                .reduce(new DistinctFunction())

On Fri, Aug 24, 2018 at 5:26 PM Andrey Zagrebin <andrey@data-artisans.com> wrote:

Hi Juho,

Where exactly does the data miss? When do you notice that? Do you check it:
- debugging `DistinctFunction.reduce` right after resume in the middle of the day, or
- some distinct records miss in the final output of BucketingSink in s3 after the window result is actually triggered and saved into s3 at the end of the day? is this the main output?

The late data around the time of taking the savepoint might not be included into the savepoint, but it should be behind the snapshotted offset in Kafka. Then it should just come later after the restore and should be reduced within the allowed lateness into the final result which is saved into s3.

Also, is this `DistinctFunction.reduce` just an example or the actual implementation, basically saving just one of the records inside the 24h window in s3? then what is missing there?

Cheers,
Andrey

On 23 Aug 2018, at 15:42, Juho Autio <juho.autio@rovio.com> wrote:

I changed to allowedLateness=0; no change, still missing data when restoring from the savepoint.

On Tue, Aug 21, 2018 at 10:43 AM Juho Autio <juho.autio@rovio.com> wrote:

I realized that BucketingSink must not play any role in this problem. This is because only when the 24-hour window triggers, BucketingSink gets a burst of input. Around the state restoring point (middle of the day) it doesn't get any input, so it can't lose anything either (right?).

I will next try removing the allowedLateness entirely from the equation.

In the meanwhile, please let me know if you have any suggestions for debugging the lost data, for example what logs to enable.

We use FlinkKafkaConsumer010 btw. Are there any known issues with that, that could contribute to lost data when restoring a savepoint?

On Fri, Aug 17, 2018 at 4:23 PM Juho Autio <juho.autio@rovio.com> wrote:

Some data is silently lost on my Flink stream job when state is restored from a savepoint.

Do you have any debugging hints to find out where exactly the data gets dropped?

My job gathers distinct values using a 24-hour window. It doesn't have any custom state management. When I cancel the job with savepoint and restore from that savepoint, some data is missed. It seems to be losing just a small amount of data. The event time of lost data is probably around the time of the savepoint. In other words, the rest of the time window is not entirely missed – collection works correctly also for (most of the) events that come in after restoring.

When the job processes a full 24-hour window without interruptions it doesn't miss anything.

Usually the problem doesn't happen in test environments that have smaller parallelism and smaller data volumes. But in production volumes the job seems to be consistently missing at least something on every restore.

This issue has consistently happened since the job was initially created. It was at first run on an older version of Flink 1.5-SNAPSHOT and it still happens on both Flink 1.5.2 & 1.6.0.

I'm wondering if this could be, for example, some synchronization issue between the kafka consumer offsets vs. what's been written by BucketingSink?

1. Job content, simplified

        kafkaStream
                .flatMap(new ExtractFieldsFunction())
                .keyBy(new MapKeySelector(1, 2, 3, 4))
                .timeWindow(Time.days(1))
                .allowedLateness(allowedLateness)
                .sideOutputLateData(lateDataTag)
                .reduce(new DistinctFunction())
                .addSink(sink)
                // use a fixed number of output partitions
                .setParallelism(8))

/**
 * Usage: .keyBy("the", "distinct", "fields").reduce(new DistinctFunction())
 */
public class DistinctFunction implements ReduceFunction<java.util.Map<String, String>> {
    @Override
    public Map<String, String> reduce(Map<String, String> value1, Map<String, String> value2) {
        return value1;
    }
}

2. State configuration

boolean enableIncrementalCheckpointing = true;
String statePath = "s3n://bucket/savepoints";
new RocksDBStateBackend(statePath, enableIncrementalCheckpointing);

Checkpointing Mode: Exactly Once
Interval: 1m 0s
Timeout: 10m 0s
Minimum Pause Between Checkpoints: 1m 0s
Maximum Concurrent Checkpoints: 1
Persist Checkpoints Externally: Enabled (retain on cancellation)

3. BucketingSink configuration

We use BucketingSink, I don't think there's anything special here, if not the fact that we're writing to S3.

        String outputPath = "s3://bucket/output";
        BucketingSink<Map<String, String>> sink = new BucketingSink<Map<String, String>>(outputPath)
                .setBucketer(new ProcessdateBucketer())
                .setBatchSize(batchSize)
                .setInactiveBucketThreshold(inactiveBucketThreshold)
                .setInactiveBucketCheckInterval(inactiveBucketCheckInterval);
        sink.setWriter(new IdJsonWriter());

4. Kafka & event time

My flink job reads the data from Kafka, using a BoundedOutOfOrdernessTimestampExtractor on the kafka consumer to synchronize watermarks across all kafka partitions. We also write late data to a side output, but nothing is written there – if it were, it could explain missed data in the main output (I'm also sure that our late data writing works, because we previously had some actual late data which ended up there).

5. allowedLateness

It may or may not be relevant that I have also enabled allowedLateness with 1 minute lateness on the 24-hour window. If that makes sense, I could try removing allowedLateness entirely? That would be just to rule out that Flink has a bug related to restoring state in combination with the allowedLateness feature. After all, all of our data should be in a good enough order to not be late, given the max out-of-orderness used on the kafka consumer timestamp extractor.

Thank you in advance!
--
Juho Autio
Senior Data Engineer

Data Engineering, Games
Rovio Entertainment Corporation
Mobile: +358 (0)45 313 0122
juho.autio@rovio.com
www.rovio.com
--
Konstantin Knauf | Solutions Architect
+49 160 91394525

Follow us @VervericaData
--
Flink Forward - The Apache Flink Conference
Stream Processing | Event Driven | Real Time
--
Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Data Artisans GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
--0000000000001bca7a0587f7aa7c--