Date: Fri, 23 Sep 2016 03:16:20 +0000 (UTC)
From: "zhangjing (JIRA)"
To: issues@flink.apache.org
Reply-To: dev@flink.apache.org
Subject: [jira] [Resolved] (FLINK-4537) ResourceManager registration with JobManager

[ https://issues.apache.org/jira/browse/FLINK-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangjing resolved FLINK-4537.
------------------------------
    Resolution: Resolved

> ResourceManager registration with JobManager
> --------------------------------------------
>
>                 Key: FLINK-4537
>                 URL: https://issues.apache.org/jira/browse/FLINK-4537
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Maximilian Michels
>            Assignee: zhangjing
>
> The ResourceManager keeps track of all JobManagers that execute jobs. When a new JobManager registers, its leadership status is checked through the HighAvailabilityServices. It is then registered at the ResourceManager using the {{JobID}} provided with the initial registration message.
> The ResourceManager should use the JobID and the LeaderSessionID (notified by the HighAvailabilityServices) to identify a session with a JobMaster.
> When a JobManager registers at the ResourceManager, the registration takes the following two input parameters:
> 1. resourceManagerLeaderId: the fencing token for the ResourceManager leader, held by the JobMaster that sends the registration.
> 2. JobMasterRegistration: contains the address and the JobID.
> The ResourceManager needs to process the registration event in the following steps:
> 1. Check whether the input resourceManagerLeaderId is the same as the current leadershipSessionId of the ResourceManager. If it is not, two or more ResourceManagers may exist at the same time and the current one is not the proper leader, so it rejects or ignores the registration.
> 2. Check whether a valid JobMaster exists at the given address by connecting to that address. Reject registrations from an invalid address. (Hidden in the connect logic.)
> 3. Keep the mapping between JobID and JobMasterGateway.
> 4. Start a JobMasterLeaderListener for the given JobID to listen to the leadership of the specified JobMaster.
> 5. Send a registration-successful ack to the JobMaster.
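
The five processing steps listed in the issue could be sketched in Java roughly as follows. This is only a minimal illustration under stated assumptions, not Flink's actual ResourceManager code: the class `ResourceManagerSketch`, the `Response` enum, the `canConnect` predicate (a stand-in for the connect logic), the `isRegistered` helper, and the use of plain strings for the JobID and the gateway are all hypothetical names introduced for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;

// Hypothetical sketch of the registration steps; not Flink's actual ResourceManager.
class ResourceManagerSketch {

    // Illustrative outcomes; the issue only speaks of "reject or ignore" and an "ack".
    enum Response { SUCCESS, DECLINED_WRONG_LEADER, DECLINED_UNREACHABLE }

    // Stand-in for JobMasterRegistration: carries the JobMaster address and the JobID.
    static final class JobMasterRegistration {
        final String address;
        final String jobId;

        JobMasterRegistration(String address, String jobId) {
            this.address = address;
            this.jobId = jobId;
        }
    }

    private final UUID leaderSessionId;          // current fencing token of this ResourceManager
    private final Predicate<String> canConnect;  // assumption: models "connect to the given address"
    private final Map<String, String> jobMasterGateways = new HashMap<>(); // JobID -> gateway (an address here)
    private final Set<String> leaderListeners = new HashSet<>();           // JobIDs with a listener started

    ResourceManagerSketch(UUID leaderSessionId, Predicate<String> canConnect) {
        this.leaderSessionId = leaderSessionId;
        this.canConnect = canConnect;
    }

    Response registerJobMaster(UUID resourceManagerLeaderId, JobMasterRegistration registration) {
        // Step 1: the fencing token must match the current leader session id.
        if (!leaderSessionId.equals(resourceManagerLeaderId)) {
            return Response.DECLINED_WRONG_LEADER;
        }
        // Step 2: reject registrations whose address cannot be connected to.
        if (!canConnect.test(registration.address)) {
            return Response.DECLINED_UNREACHABLE;
        }
        // Step 3: remember the JobID -> gateway mapping.
        jobMasterGateways.put(registration.jobId, registration.address);
        // Step 4: record that a leader listener was started for this JobID.
        leaderListeners.add(registration.jobId);
        // Step 5: acknowledge the successful registration.
        return Response.SUCCESS;
    }

    // True once both the gateway mapping and the listener exist for the JobID.
    boolean isRegistered(String jobId) {
        return jobMasterGateways.containsKey(jobId) && leaderListeners.contains(jobId);
    }
}
```

In this sketch a caller would construct the class with the current leader session id and a connectivity check, then call `registerJobMaster` with the token and registration received from the JobMaster. A mismatched token is declined before any connection attempt, which mirrors the ordering of steps 1 and 2 in the issue.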