Date: Thu, 16 Feb 2017 01:51:41 +0000 (UTC)
From: "Jonathan Hung (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5946) Create YarnConfigurationStore interface and InMemoryConfigurationStore class

    [ https://issues.apache.org/jira/browse/YARN-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868965#comment-15868965 ]

Jonathan Hung commented on YARN-5946:
-------------------------------------

Thanks [~leftnoteasy] for the comments.

bq. It is actually means "last confirmed" transaction id, correct? I found in the step 5 it get increased even if update failed.

It is the txnid below which no logs need to be replayed on recovery. A log with a lesser txnid has either been persisted to the store (the refresh succeeded) or been deemed invalid (the refresh failed), which is why the txnid is incremented even if the update fails. So in this case perhaps confirmMutation(long id) should be confirmMutation(long id, boolean isValid).

bq. So I suggest to persist a transaction-id in addition to "last good" configuration to table-1.

Sure. I think this is implementation dependent, but in general we can have a configuration entry with key="transaction.id" or something similar.

bq. Who will generate "id" for each logItem?

I think the YarnConfigurationStore should maintain the current id and generate new ones, which are returned from logMutation calls. So when MCM receives a mutation, it logs it and gets back an incremented id "id". MCM then tries to refresh and calls confirmMutation("id", true/false). The YarnConfigurationStore can keep an in-memory map of "id" to LogMutation, so it can quickly store the LogMutation into table1 when confirmMutation(id, true) is called.

bq. YarnConfigurationStore#retrieve, does it mean get from table-1 or get from table-1/2/3 (which described by your "for the failover case ..." in your previous comment)? I would prefer the latter one.

On failover MCM would call retrieve (which returns a "conf") and getPendingMutations, apply each pendingMutation one by one to "conf", and call confirmMutation(pendingMutation.id, true/false) depending on whether the refresh is successful or unsuccessful. So YarnConfigurationStore#retrieve on its own returns the contents of table1, which may not have all logs applied, but MCM will reconstruct the updated configuration from getPendingMutations. So I am not sure retrieveLatestConf is necessary (the third API in the previous comment). Since MCM stores an in-memory configuration, YarnConfigurationStore#retrieve and getPendingMutations should only be called once, on failover.

So my proposal is:
{noformat}
1) initialize(Configuration conf, Map schedConf);
2) retrieve which returns conf stored in table1
3) logMutation to save the new mutation in table2
4) confirmMutation(long id, boolean isValid) to increment txnid stored in table1, and persist the logged mutation if isValid==true
5) List getPendingMutations(void) for getting unconfirmed mutations
{noformat}
I think we can add getConfirmedConfHistory in a later patch. If there are no concerns with this approach, I will upload a patch. Thanks!
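For illustration only, the five calls and the failover replay described above could be sketched in memory roughly as follows. This is not the patch: the class name ConfStoreSketch, the recover helper, the getConfirmedTxnId accessor, and the use of a plain Map of strings in place of Hadoop's Configuration are all assumptions made to keep the example self-contained, and initialize here takes only the scheduler configuration map. The refresh predicate is a hypothetical stand-in for the scheduler refresh that MCM would perform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory sketch of the five proposed store calls. */
class ConfStoreSketch {

  /** One logged mutation: the id issued by the store plus its key/value updates. */
  static final class LogMutation {
    final long id;
    final Map<String, String> updates;

    LogMutation(long id, Map<String, String> updates) {
      this.id = id;
      this.updates = new HashMap<>(updates);
    }
  }

  // Stand-in for "table1": last good configuration and last confirmed txnid.
  private final Map<String, String> table1 = new HashMap<>();
  private long confirmedTxnId = 0;

  // Stand-in for "table2": logged but unconfirmed mutations, in log order.
  private final Map<Long, LogMutation> table2 = new LinkedHashMap<>();
  private long lastIssuedId = 0;

  void initialize(Map<String, String> schedConf) {
    table1.clear();
    table1.putAll(schedConf);
  }

  /** Returns a copy of table1; pending mutations are not applied. */
  Map<String, String> retrieve() {
    return new HashMap<>(table1);
  }

  /** Logs the mutation to table2 and returns the newly generated id. */
  long logMutation(Map<String, String> updates) {
    long id = ++lastIssuedId;
    table2.put(id, new LogMutation(id, updates));
    return id;
  }

  /** Advances the txnid; persists the mutation to table1 only if it was valid. */
  void confirmMutation(long id, boolean isValid) {
    LogMutation mutation = table2.remove(id);
    if (mutation == null) {
      throw new IllegalArgumentException("No pending mutation with id " + id);
    }
    if (isValid) {
      table1.putAll(mutation.updates);
    }
    confirmedTxnId = Math.max(confirmedTxnId, id);
  }

  List<LogMutation> getPendingMutations() {
    return new ArrayList<>(table2.values());
  }

  long getConfirmedTxnId() {
    return confirmedTxnId;
  }

  /**
   * Failover path: start from table1, replay each pending mutation in order,
   * and confirm it as valid or invalid based on the refresh outcome.
   */
  static Map<String, String> recover(ConfStoreSketch store,
      Predicate<Map<String, String>> refresh) {
    Map<String, String> conf = store.retrieve();
    for (LogMutation mutation : store.getPendingMutations()) {
      Map<String, String> candidate = new HashMap<>(conf);
      candidate.putAll(mutation.updates);
      boolean ok = refresh.test(candidate);
      if (ok) {
        conf = candidate;
      }
      store.confirmMutation(mutation.id, ok);
    }
    return conf;
  }
}
```

In this sketch the confirmed txnid advances whether or not the mutation was valid, matching the reasoning that an invalid mutation never needs to be replayed. A real store would write table1 and table2 to a persistent backend rather than to in-memory maps.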

> Create YarnConfigurationStore interface and InMemoryConfigurationStore class
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5946
>                 URL: https://issues.apache.org/jira/browse/YARN-5946
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>         Attachments: YARN-5946.001.patch, YARN-5946-YARN-5734.002.patch
>
>
> This class provides the interface to persist YARN configurations in a backing store.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)