flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Task Manager was lost/killed due to full GC
Date Wed, 18 Oct 2017 08:07:33 GMT
Thanks for the heads-up and for explaining how you resolved the issue!

Best, Fabian

2017-10-18 3:50 GMT+02:00 ShB <shon.balakrishna@gmail.com>:

> I just wanted to leave an update about this issue for anyone else who might
> come across it. The problem was a resource limit, but it was disk space and
> not heap/off-heap memory. YARN was killing off my containers because they
> exceeded the threshold for disk utilization, and this manifested as "Task
> manager was lost/killed" or "JobClientActorConnectionTimeoutException: Lost
> connection to the JobManager". Digging into the NodeManager logs on the
> individual instances provided some hints that it was a disk issue.
> Some fixes for this problem:
> - yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
>   can be increased to alleviate the problem temporarily.
> - Increasing the disk capacity on each task manager is a more long-term fix.
> - Increasing the number of task managers increases the total available disk
>   space and hence is also a fix.
> Thanks!
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
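
For anyone who wants to check whether a node is close to the disk utilization threshold described above, here is a minimal sketch in Python. It is not how YARN itself computes utilization; it only approximates the check with shutil.disk_usage. The function names and the example path are hypothetical, and the 90.0 value is an assumption based on what I believe is the documented default for yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage, so verify it against your own yarn-site.xml.

```python
import shutil

# Assumption: 90.0 is believed to be YARN's default for
# yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage.
# Check your cluster's yarn-site.xml for the value actually in effect.
DEFAULT_MAX_UTILIZATION_PERCENT = 90.0


def utilization_percent(total_bytes, used_bytes):
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def over_threshold(percent, max_percent=DEFAULT_MAX_UTILIZATION_PERCENT):
    """True if the utilization is strictly above the configured maximum."""
    return percent > max_percent


def check_dirs(dirs, max_percent=DEFAULT_MAX_UTILIZATION_PERCENT):
    """Map each directory to (utilization percent, exceeds threshold?)."""
    report = {}
    for d in dirs:
        usage = shutil.disk_usage(d)  # named tuple: total, used, free
        pct = utilization_percent(usage.total, usage.used)
        report[d] = (pct, over_threshold(pct, max_percent))
    return report


if __name__ == "__main__":
    # Hypothetical example path; in practice, pass the directories listed in
    # yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.
    for path, (pct, too_full) in check_dirs(["/"]).items():
        status = "OVER THRESHOLD" if too_full else "ok"
        print(f"{path}: {pct:.1f}% used ({status})")
```

Running this on each worker node against the NodeManager's local and log directories should give a rough idea of which disks are near the limit before containers start getting killed.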
