flink-issues mailing list archives

From "Zhijiang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4354) Implement TaskManager side of heartbeat from ResourceManager
Date Mon, 07 Nov 2016 09:38:59 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643637#comment-15643637 ]

Zhijiang Wang commented on FLINK-4354:
--------------------------------------

Hi [~till.rohrmann], for the heartbeat interaction between TM and RM, I have a few points that
I would like to confirm with you before starting the implementation.
1. The current HeartbeatManagerImpl implements both the HeartbeatManager and HeartbeatTarget
interfaces to make testing easier. I think we need another HeartbeatManagerImpl that implements
only the HeartbeatManager interface, so that it can be used directly in different components.
Each component can then provide its own HeartbeatTarget implementation.
2. For the TM component, the HeartbeatManagerImpl can be constructed in TaskManagerRunner (maybe
not placed in TaskManagerServices) and passed into TaskExecutor.
3. The TM will create the HeartbeatListener and start the HeartbeatManagerImpl.
4. When the RM leader changes, the TM registers at the new RM. If the registration succeeds,
TaskExecutorRegistrationSuccess should contain the ResourceID of the RM, so that the TM can create
the HeartbeatTarget and monitor it based on the ResourceID and gateway of the RM.
5. On the RM side, when it receives a registration from a TM, it creates the HeartbeatTarget and
monitors it based on the ResourceID and gateway of the TM. The RM then schedules heartbeat
requests to all monitored TMs.
6. TaskExecutorGateway should define the requestHeartbeat RPC method, and ResourceManagerGateway
should define the sendHeartbeat RPC method.
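
To make points 4 to 6 concrete, here is a minimal in-process sketch of the request/response
flow. It is only an illustration under assumptions: RmSketch, TmSketch, the two *GatewaySketch
interfaces and all method signatures are hypothetical stand-ins, ResourceIDs are plain strings,
RPC calls are direct method calls, and timeout handling and scheduling are left out. It is not
the existing Flink API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposal; none of these signatures are Flink's real API.
final class HeartbeatSketch {

    /** Stand-in for the requestHeartbeat RPC on TaskExecutorGateway (point 6). */
    interface TaskExecutorGatewaySketch {
        void requestHeartbeat(String rmResourceId);
    }

    /** Stand-in for the sendHeartbeat RPC on ResourceManagerGateway (point 6). */
    interface ResourceManagerGatewaySketch {
        void sendHeartbeat(String tmResourceId, int freeSlots);
    }

    static final class RmSketch implements ResourceManagerGatewaySketch {
        private final String resourceId;
        private final Map<String, TaskExecutorGatewaySketch> monitoredTms = new HashMap<>();
        private final Map<String, Integer> freeSlotsByTm = new HashMap<>();

        RmSketch(String resourceId) {
            this.resourceId = resourceId;
        }

        /** Point 5: start monitoring the TM. Point 4: the reply carries the RM's ResourceID. */
        String registerTaskExecutor(String tmResourceId, TaskExecutorGatewaySketch gateway) {
            monitoredTms.put(tmResourceId, gateway);
            return resourceId;
        }

        /** Point 5: request a heartbeat from every monitored TM (a timer would call this). */
        void requestHeartbeats() {
            for (TaskExecutorGatewaySketch tm : monitoredTms.values()) {
                tm.requestHeartbeat(resourceId);
            }
        }

        /** Records the slot availability reported by a monitored TM; ignores unknown senders. */
        @Override
        public void sendHeartbeat(String tmResourceId, int freeSlots) {
            if (monitoredTms.containsKey(tmResourceId)) {
                freeSlotsByTm.put(tmResourceId, freeSlots);
            }
        }

        /** Returns the last reported free slot count, or null if no heartbeat arrived yet. */
        Integer freeSlotsOf(String tmResourceId) {
            return freeSlotsByTm.get(tmResourceId);
        }
    }

    static final class TmSketch implements TaskExecutorGatewaySketch {
        private final String resourceId;
        private final int freeSlots;
        private String monitoredRmId;
        private ResourceManagerGatewaySketch rmGateway;

        TmSketch(String resourceId, int freeSlots) {
            this.resourceId = resourceId;
            this.freeSlots = freeSlots;
        }

        /** Point 4: register at the (new) RM leader and monitor it by the returned ResourceID. */
        void connectTo(RmSketch rm) {
            monitoredRmId = rm.registerTaskExecutor(resourceId, this);
            rmGateway = rm;
        }

        /** Answers only the RM currently monitored, attaching slot availability as payload. */
        @Override
        public void requestHeartbeat(String rmResourceId) {
            if (rmResourceId.equals(monitoredRmId)) {
                rmGateway.sendHeartbeat(resourceId, freeSlots);
            }
        }
    }

    public static void main(String[] args) {
        RmSketch rm = new RmSketch("rm-1");
        TmSketch tm = new TmSketch("tm-1", 2);
        tm.connectTo(rm);
        rm.requestHeartbeats();
        System.out.println("free slots reported by tm-1: " + rm.freeSlotsOf("tm-1"));
    }
}
```

In this sketch the TM ignores heartbeat requests from an RM it is not monitoring, which is one
possible way to deal with a stale RM leader after a leader change.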

Do you think the above process is feasible? I would appreciate your advice, and then I will
begin the implementation this week.

>  Implement TaskManager side of heartbeat from ResourceManager
> -------------------------------------------------------------
>
>                 Key: FLINK-4354
>                 URL: https://issues.apache.org/jira/browse/FLINK-4354
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> The {{ResourceManager}} initiates heartbeat messages via the {{RmLeaderID}}. 
> The {{TaskManager}} transmits its slot availability with each heartbeat. That way, the
> RM will always know about available slots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
