flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1579) Create a Flink History Server
Date Thu, 02 Mar 2017 17:59:46 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892693#comment-15892693 ]

ASF GitHub Bot commented on FLINK-1579:

GitHub user zentol opened a pull request:


    [FLINK-1579] Implement History Server

    This PR adds a slightly unpolished HistoryServer implementation. It is missing tests and
some documentation, but is working.
    This PR builds on top of #3377.
    The basic idea is as follows:
    The ```MemoryArchivist```, upon receiving an ```ExecutionGraph```, writes a set of JSON
files into a directory structure resembling the REST API using the features introduced in
FLINK-5870, FLINK-5852 and FLINK-5941. The target location is configurable using ```job-manager.archive.dir```.
Each job resides in its own directory, using the job ID as the directory name. As such, each
archive is consistent on its own and multiple jobmanagers may use the same archive dir.
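
    As a rough illustration of the layout described above, the following sketch maps a
REST-style path to a JSON file inside a per-job directory. It is not taken from the PR: the
class name ```ArchiveLayoutSketch```, the method ```archiveFileFor``` and the ```.json```
suffix are assumptions made for the example, and the real ```MemoryArchivist``` may lay out
files differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Flink: shows one way a "directory structure
// resembling the REST API" could be derived from a job ID and a REST path.
final class ArchiveLayoutSketch {

    /** Maps a REST-style path to a JSON file under the job's own directory. */
    static Path archiveFileFor(Path archiveRoot, String jobId, String restPath) {
        // Drop the leading slash so that resolve() treats the path as relative.
        String relative = restPath.startsWith("/") ? restPath.substring(1) : restPath;
        // Assumption: one file per REST path, with a ".json" suffix appended.
        return archiveRoot.resolve(jobId).resolve(relative + ".json").normalize();
    }

    public static void main(String[] args) {
        Path root = Paths.get("archive");
        // Every file of a job lives below <root>/<jobId>, so several jobmanagers
        // can write into the same root without touching each other's files.
        System.out.println(archiveFileFor(root, "job1", "/jobs/job1/vertices"));
    }
}
```

    Keying the top-level directory on the job ID is what lets each archive stand on its own
in this sketch; nothing outside that directory has to be updated when a job is archived.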
    The ```HistoryServer``` polls certain directories, configured via ```historyserver.archive.dirs```,
at regular intervals, configured via ```historyserver.refresh-interval```, for new job archives.
If a new archive is found, it is downloaded and integrated into a cache of job archives in
the local file system, configurable using ```historyserver.web.dir```. These files are served
to a slightly modified WebFrontend using the ```HistoryServerStaticFileServerHandler```.
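
    The polling step can be sketched roughly as follows. This is a simplified illustration
and not the code in this PR: the class ```ArchivePollerSketch```, its methods, and the
10-second interval are made up for the example, and it assumes that every sub-directory of
an archive dir is named after a job ID. Copying a new archive into the local cache is only
indicated by a comment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not part of Flink.
final class ArchivePollerSketch {
    // Job IDs that earlier polls have already picked up.
    private final Set<String> knownJobIds = ConcurrentHashMap.newKeySet();

    /** Returns the job IDs not seen in earlier polls and remembers them. */
    Set<String> selectNew(Set<String> listedJobIds) {
        Set<String> fresh = new TreeSet<>();
        for (String id : listedJobIds) {
            if (knownJobIds.add(id)) { // add() returns false for known IDs
                fresh.add(id);
            }
        }
        return fresh;
    }

    /** Lists sub-directory names (assumed to be job IDs) of one archive dir. */
    static Set<String> listJobIds(Path archiveDir) {
        try (Stream<Path> children = Files.list(archiveDir)) {
            return children.filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toSet());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path archiveDir = Paths.get(args.length > 0 ? args[0] : ".");
        long refreshMillis = 10_000L; // illustrative value, not a Flink default
        ArchivePollerSketch poller = new ArchivePollerSketch();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // Runs until the process is stopped.
        timer.scheduleWithFixedDelay(() -> {
            try {
                for (String jobId : poller.selectNew(listJobIds(archiveDir))) {
                    // A real server would copy the archive into its local cache here.
                    System.out.println("new archive: " + jobId);
                }
            } catch (RuntimeException e) {
                // Keep polling; an uncaught exception would cancel the schedule.
                System.err.println("poll failed: " + e);
            }
        }, 0, refreshMillis, TimeUnit.MILLISECONDS);
    }
}
```

    Tracking already-seen job IDs keeps each poll cheap in this sketch, since only
directories that appeared after the previous poll need to be downloaded.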
    In the end the HistoryServer is little more than an aggregator and archive viewer.
    None of the directory configuration options have defaults; as it stands the entire feature
is opt-in.
    Should a file that the WebFrontend requests be missing, a separate fetch routine kicks
in which attempts to fetch the missing file. This is primarily aimed at eventually-consistent
    The HistoryServer is started using the new historyserver.sh script, which works similarly
to job- or taskmanager scripts: ```./bin/historyserver.sh [start|stop]```
    Two larger refactorings were made to existing code to increase the amount of shared code:
    * the Netty setup in the WebRuntimeMonitor was moved into a separate NettySetup class
which the HistoryServer can use as well
    * an AbstractStaticFileServerHandler was added which the (HistoryServer)StaticFileServerHandler

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 1579_history_server_pr

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3460

commit 61a07456f151ac8f5418ac66629751e1a83ada3a
Author: zentol <chesnay@apache.org>
Date:   2017-01-24T09:13:24Z

    [FLINK-1579] Implement History Server - Frontend

commit e6316e544fea160f7d050dd1b087301a83345d31
Author: zentol <chesnay@apache.org>
Date:   2017-02-21T11:36:17Z

    [FLINK-5645] Store accumulators/metrics for canceled/failed tasks

commit 84fd2746b09ce41c2d9bd5be7f6e8a8cc1a3291d
Author: zentol <chesnay@apache.org>
Date:   2017-03-02T12:31:56Z

    Refactor netty setup into separate class

commit 81d7e6b92fe69326d6edf6b63f3f9c95f5ebd0ef
Author: zentol <chesnay@apache.org>
Date:   2017-02-22T14:47:07Z

    [FLINK-1579] Implement History Server - Backend

commit 8d1e8c59690ea97be4bbaf1a011c8ec4a68f5892
Author: zentol <chesnay@apache.org>
Date:   2017-03-02T11:09:36Z

    Rebuild frontend


> Create a Flink History Server
> -----------------------------
>                 Key: FLINK-1579
>                 URL: https://issues.apache.org/jira/browse/FLINK-1579
>             Project: Flink
>          Issue Type: New Feature
>          Components: Distributed Coordination
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Assignee: Chesnay Schepler
> Right now it's not possible to analyze the job results for jobs that ran on YARN, because
we'll lose the information once the JobManager has stopped.
> Therefore, I propose to implement a "Flink History Server" which serves the results
from these jobs.
> I haven't started thinking about the implementation, but I suspect it involves some JSON
files stored in HDFS :)

This message was sent by Atlassian JIRA
