flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: GC on TaskManagers stats
Date Tue, 09 Feb 2016 09:17:34 GMT
Hi Guido,

sorry for the late reply. You were collecting the stats every second.
Afaik, Flink collects the stats internally every 5 seconds, so you can
either change your own polling interval or Flink's (I think the setting is
taskmanager.heartbeat-interval).
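
A minimal sketch of why polling faster than the refresh interval adds no
information, assuming the reported GC times are cumulative counters. The
class name GcDeltas and the sample values are hypothetical, chosen only for
illustration; they are not taken from the job discussed here.

```java
import java.util.ArrayList;
import java.util.List;

class GcDeltas {
    /** Turns cumulative GC times (in ms) into the increase between consecutive samples. */
    static List<Long> deltas(long[] cumulativeMillis) {
        List<Long> result = new ArrayList<>();
        for (int i = 1; i < cumulativeMillis.length; i++) {
            result.add(cumulativeMillis[i] - cumulativeMillis[i - 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical samples polled once per second while the underlying
        // value is refreshed only every 5 seconds: most polls repeat the
        // previous value, so most deltas are zero.
        long[] samples = {100, 100, 100, 100, 100, 340, 340};
        System.out.println(deltas(samples)); // [0, 0, 0, 0, 240, 0]
    }
}
```

Only the non-zero deltas carry new information, so matching the polling
interval to the refresh interval yields the same picture with fewer requests.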

Regarding the details on PS-Scavenge, MarkSweep etc.: We just use the names
the Java management beans return, so you can just google for the names and
read how to interpret them. For example:
http://www.ibm.com/developerworks/library/j-jtp11253/
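
A minimal sketch of what those management beans expose, using the standard
java.lang.management API. The class name GcStats is hypothetical and this is
not Flink's actual code. With the parallel collector the bean names are
typically "PS Scavenge" (young generation) and "PS MarkSweep" (old
generation); getCollectionCount() is the total number of collections so far
and getCollectionTime() is the approximate accumulated collection time in
milliseconds.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

class GcStats {
    /** Maps each collector name to {collection count, accumulated time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are totals since JVM start; -1 means "undefined"
            // for this collector.
            stats.put(gc.getName(),
                    new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            System.out.println(e.getKey()
                    + ".Count: " + e.getValue()[0]
                    + ", " + e.getKey() + ".Time: " + e.getValue()[1] + " ms");
        }
    }
}
```

Because both values only ever grow, a reading such as Time: 29016 would be a
running total of roughly 29 seconds, not the duration of a single collection.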

The load is the operating system load.
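
A minimal sketch of reading the operating system load from inside a JVM,
assuming the value comes from the standard OperatingSystemMXBean (that Flink
reads it this way is an assumption; the class name SystemLoad is
hypothetical). The value describes the whole machine, not garbage collection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class SystemLoad {
    /** Returns the system load average for the last minute, or a negative value if unavailable. */
    static double loadAverage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage();
    }

    public static void main(String[] args) {
        System.out.println("load: " + loadAverage());
    }
}
```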



On Thu, Feb 4, 2016 at 10:25 PM, Guido <gmazza104@gmail.com> wrote:

> Hello,
>
> I have a few questions regarding the garbage collector’s stats on
> TaskManagers, and any help or further documentation would be great.
> I collected stats by polling 7 TaskManagers once per second through the
> corresponding request (/taskmanagers/<idtaskmanager>/) of the Monitoring
> REST API while a job, which took 38 seconds overall, was running.
>
> This way I got 38 records for each TaskManager and, focusing on the garbage
> collector’s stats, I can see, for example in one of the 38 records:
>
> - PS-Scavenge.Time: 2597, PS-MarkSweep.Time: 29016;
> 1. Is it correct to assume they represent the total elapsed time of the
> different GCs (young and old gen, respectively)? So I basically got a
> running sum?
> 2. If yes, the values are in milliseconds, so 29 sec?
>
> 3. Could they be used to work out how much time was lost in total to
> “stop-the-world” GC pauses?
>
> Finally, on the same record:
>
> - PS-Scavenge.Count: 3, PS-MarkSweep.Count: 5, load: 3.73.
>
> 4. Is the “load” value closely related to these GC stats?
>
> Sorry for the long message, and thanks a lot.
>
> Guido
>
>
>
