flink-user mailing list archives

From Dongwon Kim <eastcirc...@gmail.com>
Subject Re: YARN per-job cluster reserves all of remaining memory in YARN
Date Mon, 07 May 2018 14:08:28 GMT
Hi Fabian and Till,

Below is what I've observed today.
I hope it provides strong evidence to help figure out the problem.

I am attaching another log file, jmlog2.txt, which shows how a per-job
cluster behaves differently when the YARN NodeManagers are given more memory
(compared to jmlog.txt).
- jmlog.txt : Each of the 7 NodeManagers has 96GB. Only a single TM (50GB)
can be scheduled on each NM, so I ended up with only 7 TaskManagers.
There is no room for extra, unnecessary TaskManagers.
- jmlog2.txt : Each of the 7 NMs has 128GB. After scheduling a TM on each
NM, the RM can schedule 7 additional TMs because each NM has 78GB remaining.
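
The memory arithmetic behind the two setups can be sketched as follows. This is only a back-of-the-envelope illustration, not anything taken from Flink or YARN: the function name is hypothetical, and the calculation assumes that the JobManager container and any YARN overhead are ignored.

```python
# Rough capacity check for the two setups described above.
# Assumption: JobManager container memory and YARN overhead are ignored.

NUM_NODE_MANAGERS = 7
TM_MEMORY_GB = 50


def tms_per_node(nm_memory_gb: int, tm_memory_gb: int) -> int:
    """How many TaskManager containers fit on one NodeManager (hypothetical helper)."""
    return nm_memory_gb // tm_memory_gb


for log_name, nm_memory_gb in (("jmlog.txt", 96), ("jmlog2.txt", 128)):
    per_node = tms_per_node(nm_memory_gb, TM_MEMORY_GB)
    left_after_first_tm = nm_memory_gb - TM_MEMORY_GB
    print(
        f"{log_name}: {per_node} TM(s) per NM, "
        f"{per_node * NUM_NODE_MANAGERS} TMs cluster-wide, "
        f"{left_after_first_tm}GB left on each NM after the first TM"
    )
```

With 96GB per NM only one 50GB TM fits, so the cluster tops out at 7 TMs; with 128GB a second TM fits on each NM, which leaves room for 7 extra ones.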

What I see in both log files is the following:
- ExecutionGraph creates 100 tasks, as I specified.
- Initially, the 7 necessary containers (for 7 TMs, each with 16 slots) are
requested from YARN, which is the desired behavior.
- However, 93 extra, unnecessary requests are made right after the very
first TaskManager is registered at the SlotManager, with the following messages:
+ jmlog.txt : Register TaskManager 640b098f3a132b452a74673631a0bf7f at the
+ jmlog2.txt : Registering TaskManager container_1525676778566_0001_01_000005
under 35b9ceed32bd87fa23ddca4282f5abac at the SlotManager.
(Please note that the info messages are different in jmlog.txt and
jmlog2.txt; it is due to a recent hotfix "Add resourceId to TaskManager
registration messages")
The 93 containers should not be requested, because the JobMaster is going to
have enough slots on the 6 other TaskManagers, which will soon be registered
at the SlotManager. This causes a deadlock if YARN does not have the
resources to allocate those 93 containers, as in jmlog.txt.
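
The counts above can be reproduced with a small calculation. This is a sketch of the arithmetic only, not Flink's ResourceManager or SlotManager code: the function names are hypothetical, and it assumes that each of the 100 tasks occupies exactly one slot.

```python
import math

# Assumption: each of the 100 tasks needs exactly one slot.
NUM_TASKS = 100
SLOTS_PER_TM = 16


def containers_needed(num_slots: int, slots_per_tm: int) -> int:
    """Containers required when slots per TaskManager are taken into account."""
    return math.ceil(num_slots / slots_per_tm)


def containers_requested_one_per_slot(num_slots: int) -> int:
    """Containers requested if one TaskManager is asked for per required slot."""
    return num_slots


needed = containers_needed(NUM_TASKS, SLOTS_PER_TM)
requested = containers_requested_one_per_slot(NUM_TASKS)
extra = requested - needed

print(f"needed={needed}, requested={requested}, extra={extra}")
```

Under these assumptions the job needs 7 containers, while one request per slot yields 100, which matches the 93 extra requests seen in the logs.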

Unlike jmlog.txt, jmlog2.txt shows the following:
- Extra TMs are scheduled on newly allocated containers.
- Extra TMs are not given any tasks for a while.
- Extra TMs are shut down with the message below:
    "Closing TaskExecutor connection container_1525676778566_0001_01_000015
because: TaskExecutor exceeded the idle timeout."
- At the end, there are no pending container requests in jmlog2.txt at the

P.S. I just found out that SlotManager is only used in FLIP-6 mode.
Nevertheless, I am sending this email to user@ because I originally started
this thread on user@. Sorry for the inconvenience.

- Dongwon

On Mon, May 7, 2018 at 9:27 PM, Fabian Hueske <fhueske@gmail.com> wrote:

> Hi Dongwon,
> I see that you are using the latest master (Flink 1.6-SNAPSHOT).
> This is a known problem in the new FLIP-6 mode. The ResourceManager tries
> to allocate too many resources, basically one TM per required slot, i.e., it
> does not take the number of slots per TM into account.
> The resources are not used and should be returned to YARN after a timeout.
> I couldn't find a JIRA issue to point you to.
> Till (in CC) should know more details about this problem.
> Best, Fabian
> 2018-05-05 12:50 GMT+02:00 Dongwon Kim <eastcirclek@gmail.com>:
>> I'm testing a per-job cluster on YARN.
>> I just need to launch 7 TMs, each with 50GB memory (total 350GB), but Flink
>> makes more resource requests to YARN than necessary.
>> All of the remaining memory in YARN, around 370GB, is reserved by the
>> Flink job, which I can see in the YARN UI.
>> The remaining memory is not used but reserved; that’s very weird.
>> Attached is JM log.
>> Any help would be greatly appreciated!
>> Thanks,
>> - Dongwon
