flink-user mailing list archives

From Yun Tang <myas...@live.com>
Subject Re: best practices on getting flink job logs from Hadoop history server?
Date Fri, 30 Aug 2019 09:21:36 GMT
Hi Yu,

If you have the client job log, you can find your application id in the following output:

The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use
the following command or a YARN web interface to stop it:
yarn application -kill {appId}
Please also note that the temporary files of the YARN session in the home directory will not
be removed.
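
A minimal sketch of pulling the application id out of that client output, assuming the log contains the "yarn application -kill" hint shown above and that the id has the usual application_<clusterTimestamp>_<sequence> form. The function name extract_application_id is hypothetical, not part of Flink or YARN.

```python
import re
from typing import Optional

# Assumption: YARN application ids look like application_<clusterTimestamp>_<sequence>.
_KILL_HINT = re.compile(r"yarn application -kill (application_\d+_\d+)")


def extract_application_id(client_log: str) -> Optional[str]:
    """Return the first application id found in the kill hint, or None."""
    match = _KILL_HINT.search(client_log)
    return match.group(1) if match else None
```

The function takes the whole client log as one string, so it can be fed the content of a saved client log file.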

Yun Tang

From: Zhu Zhu <reedpor@gmail.com>
Sent: Friday, August 30, 2019 16:24
To: Yu Yang <yuyang08@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: best practices on getting flink job logs from Hadoop history server?

Hi Yu,

Regarding #2,
Currently we search the JM log for the task deployment entries, which include the container and
machine that each task is deployed to.
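
A rough sketch of that search, under two assumptions that should be checked against your own JM log: the deployment lines contain the word "Deploying", and they mention a YARN container id on the same line. The helper name find_task_containers and the default keyword are illustrative only.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: container ids look like container_[e<epoch>_]<ts>_<app>_<attempt>_<n>.
_CONTAINER_ID = re.compile(r"container_(?:e\d+_)?\d+_\d+_\d+_\d+")


def find_task_containers(
    jm_log_lines: Iterable[str],
    task_name: str,
    keyword: str = "Deploying",  # assumed marker of a deployment log line
) -> List[Tuple[str, str]]:
    """Return (container_id, line) pairs for deployment lines of task_name."""
    hits = []
    for line in jm_log_lines:
        if keyword in line and task_name in line:
            match = _CONTAINER_ID.search(line)
            if match:
                hits.append((match.group(0), line.rstrip("\n")))
    return hits
```

The matched line is returned along with the container id so that the host name, if present, can be read from it directly.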

Regarding #3,
You can find the application logs, aggregated per machine, on DFS; the path depends on
your YARN config.
Each of these logs may still contain multiple TM logs. However, it can be much smaller than the
log generated by "yarn logs ...".
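
As an illustration of how that path is usually derived, here is a sketch assuming the classic Hadoop 2.x layout <remote-app-log-dir>/<user>/<suffix>/<appId>, where the two settings are yarn.nodemanager.remote-app-log-dir (default /tmp/logs) and yarn.nodemanager.remote-app-log-dir-suffix (default logs). Newer Hadoop releases may use a different layout, so treat the result as a starting point. The function name aggregated_log_dir is hypothetical.

```python
import posixpath


def aggregated_log_dir(
    app_id: str,
    user: str,
    remote_log_dir: str = "/tmp/logs",  # yarn.nodemanager.remote-app-log-dir
    suffix: str = "logs",               # yarn.nodemanager.remote-app-log-dir-suffix
) -> str:
    """Build the DFS directory that holds the per-machine aggregated logs."""
    # Assumed layout: <remote_log_dir>/<user>/<suffix>/<app_id>
    return posixpath.join(remote_log_dir, user, suffix, app_id)
```

Under this layout, the directory should contain one file per NodeManager host, which is what keeps each file smaller than the combined output.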

Zhu Zhu

Yu Yang <yuyang08@gmail.com> wrote on Friday, August 30, 2019:

We run flink jobs through yarn on hadoop clusters. One challenge that we are facing is to
simplify flink job log access.

The flink job logs can be accessed using "yarn logs $application_id". That approach has
a few limitations:

  1.  It is not straightforward to find the yarn application id based on the flink job id.
  2.  It is difficult to find the corresponding container id for the flink sub tasks.
  3.  For jobs that have many tasks, it is inefficient to use "yarn logs ..." as it mixes
logs from all task managers.

Any suggestions on the best practice for getting logs of completed flink jobs that ran on yarn?

