Subject: Re: Streaming API has a long delay at the beginning of the process.
From: Yuta Morisawa
To: Till Rohrmann, Fabian Hueske
Cc: user
Date: Fri, 15 Sep 2017 11:25:15 +0900
Message-ID: <5727a1df-6bb3-2705-1093-4fbadc4a97a7@kddi-research.jp>
References: <8f88c5ed-2e33-ceee-3f4e-b99ca1b2f635@kddi-research.jp>

Hi Fabian,

> If I understand you correctly, the problem is only for the first events
> that are processed.

Yes. More precisely, it affects the first 300 Kafka messages.

> AFAIK, Flink lazily instantiates its operators which means that a source
> task starts to consume records from Kafka before the subsequent tasks
> have been started.

That is a very helpful pointer. It describes the situation well.
However, the documentation says: "The operations are actually executed
when the execution is explicitly triggered by an execute() call on the
execution environment."
What does that mean? AFAIK, a typical Flink program invokes execute() in
main(). Do all operators start at that point? I suspect not.
- Flink documentation:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/api_concepts.html#lazy-evaluation

> Not sure if or what can be done about this behavior.
> I'll loop in Till who knows more about the lifecycle of tasks.

Thank you very much for your kindness.

Regards,
Yuta

On 2017/09/14 19:32, Fabian Hueske wrote:
> Hi,
>
> If I understand you correctly, the problem is only for the first events
> that are processed.
>
> AFAIK, Flink lazily instantiates its operators which means that a source
> task starts to consume records from Kafka before the subsequent tasks
> have been started.
> That's why the latency of the first records is higher.
>
> Not sure if or what can be done about this behavior.
> I'll loop in Till who knows more about the lifecycle of tasks.
>
> Best, Fabian
>
>
> 2017-09-12 11:02 GMT+02:00 Yuta Morisawa:
>
>     Hi,
>
>     I am worrying about the delay of the Streaming API.
>     My application is that it gets data from kafka-connectors and
>     process them, then push data to kafka-producers.
>     The problem is that the app suffers a long delay when the first data
>     come in the cluster.
>     It takes about 1000ms to process data (I measure the time with
>     kafka-timestamp). On the other hand, it works well after 2-3 seconds
>     first data come in (the delay is about 200ms).
>
>     The application is so delay sensitive that I want to solve this problem.
>     Now, I think this is a matter of JVM but I have no idea to
>     investigate it.
>     Is there any way to avoid this delay?
>
>
>
>     Thank you for your attention
>     Yuta
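
To make the documentation sentence about execute() concrete, here is a minimal toy sketch of lazy evaluation in plain Python. It is not Flink's API: ToyEnvironment, ToyStream, from_collection and traced_double are hypothetical names invented for illustration. It only shows that building the pipeline records operations and that they run once execute() is called. It does not model how or when Flink deploys and starts individual tasks on a cluster, which is the part of the question still open in this thread.

```python
# Toy model of lazy evaluation. This is NOT Flink's API; all names are hypothetical.

class ToyStream:
    """Holds input items and a list of recorded (not yet applied) operations."""

    def __init__(self, items):
        self._items = items
        self._ops = []

    def map(self, fn):
        # Only record the function; nothing is computed here.
        self._ops.append(fn)
        return self

    def _run(self):
        # Apply every recorded operation to every item, in order.
        out = []
        for item in self._items:
            for fn in self._ops:
                item = fn(item)
            out.append(item)
        return out


class ToyEnvironment:
    """Collects streams and runs them only when execute() is called."""

    def __init__(self):
        self._sources = []

    def from_collection(self, items):
        stream = ToyStream(list(items))
        self._sources.append(stream)
        return stream

    def execute(self):
        # Execution is triggered explicitly, as in the quoted documentation sentence.
        return [stream._run() for stream in self._sources]


calls = []  # records every invocation of the user function


def traced_double(x):
    calls.append(x)
    return 2 * x


env = ToyEnvironment()
env.from_collection([1, 2, 3]).map(traced_double)

calls_before_execute = list(calls)  # snapshot taken before execute()
results = env.execute()
```

In this toy, calls_before_execute stays empty because map() only records the function, and the user function is invoked only inside execute(). Whether all operators of a real Flink job start at the same moment after execute() is a separate matter that this sketch does not answer.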