Date: Fri, 16 Jun 2017 07:27:44 -0700 (PDT)
From: ninad
To: user@flink.apache.org
Message-ID: <1497623264174-13805.post@n4.nabble.com>
Subject: Re: Fink: KafkaProducer Data Loss

Hi Aljoscha,

I gather you guys aren't able to reproduce this. Here are the answers to your questions:

> How do you ensure that you only shut down the brokers once Flink has read all the data that you expect it to read?

Ninad: I am able to see the number of messages received on the Flink Job UI.

> And, how do you ensure that the offset that Flink checkpoints in step 3) is the offset that corresponds to the end of your test data?

Ninad: I haven't explicitly verified which offsets were checkpointed. When I say that a checkpoint was successful, I am referring to the Flink logs: as long as Flink says that my last successful checkpoint was #7, then on recovery it restores its state from checkpoint #7.

> What is the difference between steps 3) and 5)?

Ninad: I didn't realize that windows are merged eagerly. I have a session window with a 30-second gap. Once I see from the UI that all the messages have been received, I don't see the merge logs for another 30 seconds, which is why I thought that windows are merged only once the window trigger fires. For example: I verified from the UI that all messages were received.
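The eager merging described in this answer can be sketched with a short, self-contained simulation (plain Python for illustration only; this is not Flink's MergingWindowSet implementation, and the helper names are made up). Each element is first assigned its own window of [timestamp, timestamp + gap), and overlapping windows are merged as soon as they are seen, before any trigger fires:

```python
# Illustrative sketch of eager session-window merging (NOT Flink code;
# the gap and timestamps are taken from the logs quoted in this thread).
GAP_MS = 30_000  # session gap: 30 seconds

def assign_window(ts):
    # Each element initially gets its own window [ts, ts + gap).
    return (ts, ts + GAP_MS)

def merge_overlapping(windows):
    # Overlapping windows are merged eagerly, as elements arrive,
    # not when the trigger eventually fires.
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

w1 = assign_window(1496348534287)
w2 = assign_window(1496348534300)
print(merge_overlapping([w1, w2]))
# -> [(1496348534287, 1496348564300)]
```

Running this on the two timestamps from the merge log quoted in this message reproduces the same merged window, TimeWindow{start=1496348534287, end=1496348564300}.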
I then see this checkpoint in the logs:

2017-06-01 20:21:49,012 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask - Notification of complete checkpoint for task TriggerWindow(ProcessingTimeSessionWindows(30000), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@e56b3293}, ProcessingTimeTrigger(), WindowedStream.apply(WindowedStream.java:521)) -> Sink: sink.http.sep (1/1)

I then see the windows being merged a few seconds later:

2017-06-01 20:22:14,300 DEBUG org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSet - Merging [TimeWindow{start=1496348534287, end=1496348564287}, TimeWindow{start=1496348534300, end=1496348564300}] into TimeWindow{start=1496348534287, end=1496348564300}

So, point 3) refers to these "MergingWindowSet - Merging ..." logs, and point 4) refers to the data in the windows being evaluated.

Hope this helps. Thanks.

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fink-KafkaProducer-Data-Loss-tp11413p13805.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
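As a closing aside, the checkpoint-recovery behavior relied on throughout this thread can be sketched in a few lines of plain Python (an illustration of the semantics only, not Flink code; the class and field names are made up). A successful checkpoint snapshots operator state together with the source's read position, and recovery rolls both back to the last successful checkpoint:

```python
# Illustrative simulation of checkpoint/restore semantics (NOT Flink code).
class Job:
    def __init__(self):
        self.offset = 0          # next source offset to read (hypothetical)
        self.state = []          # records buffered in operator state
        self.checkpoints = {}    # checkpoint id -> (offset, state snapshot)

    def consume(self, records):
        for r in records:
            self.state.append(r)
            self.offset += 1

    def checkpoint(self, cp_id):
        # A "successful checkpoint" snapshots state and offset together.
        self.checkpoints[cp_id] = (self.offset, list(self.state))

    def recover(self):
        # On failure, restore from the last successful checkpoint: state
        # and offset roll back together, so records read after the
        # checkpoint are re-read rather than lost.
        last = max(self.checkpoints)
        offset, state = self.checkpoints[last]
        self.offset, self.state = offset, list(state)
        return last

job = Job()
job.consume(["a", "b"])
job.checkpoint(7)              # last successful checkpoint is #7
job.consume(["c"])             # read, but not yet checkpointed
print(job.recover(), job.offset)  # restores checkpoint #7, offset rewinds to 2
```

In this model, record "c" (read after checkpoint #7) is not lost on failure; it is re-read after recovery, which is the at-least-once behavior this thread is probing.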