From: Rico Bergmann
To: user@flink.apache.org
Subject: Re: Performance Issue
Date: Wed, 9 Sep 2015 07:01:08 +0200

Yes. The keys are constantly changing. Indeed, each unique event has its own key (the event itself). The purpose was to do event deduplication ...
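
To make the keying scheme concrete, here is a minimal sketch in plain Java (no Flink API involved) of deduplication where the event itself is the key. The class and method names are hypothetical and only illustrate the idea; it is not the code from the job discussed in this thread. It shows that, with this scheme, one window holds exactly as many keys as it holds distinct events.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: deduplicate the events of one window by
// using each event (here a String) as its own grouping key.
final class WindowDedupSketch {

    /** Groups events by themselves and counts how often each key occurs. */
    static Map<String, Integer> groupByEvent(List<String> windowEvents) {
        Map<String, Integer> countsPerKey = new LinkedHashMap<>();
        for (String event : windowEvents) {
            countsPerKey.merge(event, 1, Integer::sum);
        }
        return countsPerKey;
    }

    /** Emits every distinct event once, in first-seen order. */
    static List<String> deduplicate(List<String> windowEvents) {
        return new ArrayList<>(groupByEvent(windowEvents).keySet());
    }

    public static void main(String[] args) {
        List<String> window = List.of("a", "b", "a", "c", "b");
        // The number of keys equals the number of distinct events.
        System.out.println("keys in window: " + groupByEvent(window).size());
        System.out.println("deduplicated:   " + deduplicate(window));
    }
}
```

Because every new event introduces a new key, the number of keys seen over the lifetime of the stream keeps growing. Whether that growth explains the slowdown is not established in this thread.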



On 08.09.2015 at 20:05, Aljoscha Krettek <aljoscha@apache.org> wrote:

Hi Rico,
I have a suspicion. What is the distribution of your keys? That is, are there many unique keys, and do the keys keep evolving, i.e. are there always new and different keys?

Cheers,
Aljoscha

On Tue, 8 Sep 2015 at 13:44 Rico Bergmann <info@ricobergmann.de> wrote:
I also see in the TM overview that the CPU load is still around 25%, although there has been no input to the program for minutes. The CPU load is decreasing very slowly.

The memory consumption is still fluctuating at a high level. It does not decrease.

In my test I generated test input for 1 minute. Now 10 minutes have passed ...

I think there must be something going on in Flink ...



On 08.09.2015 at 13:32, Rico Bergmann <info@ricobergmann.de> wrote:

The marksweep value is very high, the scavenge value very low. If this helps ;-)




On 08.09.2015 at 11:27, Robert Metzger <rmetzger@apache.org> wrote:

It is in the "Information" column: http://i.imgur.com/rzxxURR.png
In the screenshot, the two GCs only spent 84 and 25 ms.

On Tue, Sep 8, 2015 at 10:34 AM, Rico Bergmann <info@ricobergmann.de> wrote:
Where can I find this information? I can see the memory usage and CPU load. But where is the information on the GC?



On 08.09.2015 at 09:34, Robert Metzger <rmetzger@apache.org> wrote:

The web interface of Flink has a tab for the TaskManagers. There, you can also see how much time the JVM spends on garbage collection.
Can you check whether the number of GC calls and the time spent go up after 30 minutes?
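
The same two figures (number of collections and accumulated collection time) can also be read from inside a JVM through the standard `java.lang.management` API. The following is a minimal sketch; the class name `GcStatsSketch` is hypothetical, and it is only an assumption that the values shown in the Flink web interface come from these same beans. The code reports on the JVM it runs in, so it would have to run inside the process being observed (for example in a local test run).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples GC counters of the current JVM.
final class GcStatsSketch {

    /**
     * Returns collector name -> {collection count, accumulated time in ms}.
     * Either value is -1 if the collector does not report it.
     */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            stats.put(gc.getName(),
                      new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) throws InterruptedException {
        // Print three samples, one second apart, to see whether the counters grow.
        for (int i = 0; i < 3; i++) {
            snapshot().forEach((name, s) ->
                System.out.printf("%s: count=%d, timeMs=%d%n", name, s[0], s[1]));
            Thread.sleep(1000);
        }
    }
}
```

Comparing samples taken early in a run with samples taken after the slowdown would show whether GC activity rises over time. The collector names printed depend on the JVM and the garbage collector in use.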

On Tue, Sep 8, 2015 at 8:37 AM, Rico Bergmann <info@ricobergmann.de> wrote:
Hi!

I also think it's a GC problem. In the KeySelector I don't instantiate any object. It's a simple toString method call.
In the mapWindow I create new objects. But I'm doing the same in other map operators, too. They don't slow down the execution. The execution only slows down with this construct.

I watched the memory footprint of my program, once with the code construct I wrote and once without. The memory characteristics were the same. The CPU usage, too ...

I don't have an explanation. But I don't think it comes from my operator functions ...

Cheers Rico.



On 07.09.2015 at 22:43, Martin Neumann <mneumann@sics.se> wrote:
Hej,

This sounds like it could be a garbage collection problem. Do you instantiate any classes inside any of the operators (e.g. in the KeySelector)? You can also try to run it locally and use something like jstat to rule this out.

cheers Martin

On Mon, Sep 7, 2015 at 12:00 PM, Rico Bergmann <info@ricobergmann.de> wrote:
Hi!

While working with grouping and windowing I encountered a strange behavior. I'm doing:
dataStream.groupBy(KeySelector).window(Time.of(x, TimeUnit.SECONDS)).mapWindow(toString).flatten()

When I run the program containing this snippet, it initially outputs data at a rate of around 150 events per sec. (That is roughly the input rate for the program.) After about 10-30 minutes the rate drops below 5 events per sec. This leads to event delivery offsets getting bigger and bigger ...

Any explanation for this? I know you are reworking the streaming API. But it would be useful to know why this happens ...

Cheers. Rico.


