hadoop-yarn-issues mailing list archives

From "Jonathan Hung (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5946) Create YarnConfigurationStore interface and InMemoryConfigurationStore class
Date Thu, 16 Feb 2017 01:51:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868965#comment-15868965

Jonathan Hung commented on YARN-5946:

Thanks [~leftnoteasy] for the comments.
bq. It is actually means "last confirmed" transaction id, correct? I found in the step 5 it
get increased even if update failed.
It is the txnid for which all logs with a lesser txnid do not need to be replayed on recovery.
This means either that the log has been persisted to the store (the refresh succeeded), or that
the mutation has been deemed invalid (the refresh failed), which is why it is incremented even
if the update fails. So in this case perhaps confirmMutation(long id) should be
confirmMutation(long id, boolean isValid).
bq. So I suggest to persist a transaction-id in addition to "last good" configuration to table-1.
Sure. I think this is implementation dependent, though in general we can have a configuration
entry with key="transaction.id" or something similar.
bq. Who will generate "id" for each logItem?
I think the YarnConfigurationStore should maintain the current id and generate new ones, which
are returned from logMutation calls. So when MCM receives a mutation, it logs it, which
returns an incremented id "id"; MCM then tries to refresh and calls confirmMutation("id",

Here the YarnConfigurationStore can store a map of "id" to LogMutation in memory, so it can
quickly store the LogMutation into table1 if confirmMutation(id, true) is called.
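
The id handling described above could look roughly like the following. This is a minimal sketch, not the patch itself: the class name SketchConfigurationStore, the fields of LogMutation, and the "confirmed" map standing in for table1 are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a logged mutation: a set of key/value updates.
final class LogMutation {
    final Map<String, String> updates;

    LogMutation(Map<String, String> updates) {
        this.updates = new HashMap<>(updates);
    }
}

// Illustrative only; not the YarnConfigurationStore from the patch.
class SketchConfigurationStore {
    // Last id handed out by the store.
    private long currentId = 0;
    // id -> mutation that has been logged but not yet confirmed.
    private final Map<Long, LogMutation> pending = new LinkedHashMap<>();
    // Stands in for table1 (the "last good" configuration).
    private final Map<String, String> confirmed = new HashMap<>();

    // The store, not the caller, generates the id and returns it.
    synchronized long logMutation(LogMutation mutation) {
        long id = ++currentId;
        pending.put(id, mutation);
        return id;
    }

    // A valid mutation is copied from the in-memory map into "table1";
    // an invalid one is dropped. Either way the id is no longer pending.
    synchronized void confirmMutation(long id, boolean isValid) {
        LogMutation mutation = pending.remove(id);
        if (mutation != null && isValid) {
            confirmed.putAll(mutation.updates);
        }
    }

    synchronized Map<String, String> retrieve() {
        return new HashMap<>(confirmed);
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

Keeping the pending mutations keyed by id means confirmMutation only needs the id from the caller, which matches the call sequence described above.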
bq. YarnConfigurationStore#retrieve, does it mean get from table-1 or get from table-1/2/3
(which described by your "for the failover case ..." in your previous comment)? I would prefer
the latter one.
On failover MCM would call retrieve (which returns a "conf") and getPendingMutations, apply
each pendingMutation one by one to "conf", and call confirmMutation(pendingMutation.id, true/false)
depending on whether the refresh is successful/unsuccessful. So YarnConfigurationStore#retrieve
on its own returns the contents of table1, which may not have all logs applied, but MCM will
reconstruct the updated configuration from getPendingMutations. So I am not sure retrieveLatestConf
(the third API in the previous comment) is necessary.

Since MCM stores an in-memory configuration, YarnConfigurationStore#retrieve and getPendingMutations
should only be called once, on failover.
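
The failover replay described above can be sketched as follows. This is an illustration under stated assumptions, not MCM code: PendingMutation, Confirmer, FailoverSketch and the "refresh" predicate are hypothetical names, and a plain Map<String, String> stands in for the configuration object.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical pending log entry: the id assigned by the store plus its updates.
final class PendingMutation {
    final long id;
    final Map<String, String> updates;

    PendingMutation(long id, Map<String, String> updates) {
        this.id = id;
        this.updates = updates;
    }
}

// Hypothetical callback standing in for YarnConfigurationStore#confirmMutation.
interface Confirmer {
    void confirmMutation(long id, boolean isValid);
}

final class FailoverSketch {
    // Rebuilds the in-memory configuration from the table1 snapshot ("retrieved")
    // and the pending log. "refresh" stands in for the scheduler refresh and
    // returns true if the candidate configuration is accepted.
    static Map<String, String> recover(Map<String, String> retrieved,
                                       List<PendingMutation> pending,
                                       Predicate<Map<String, String>> refresh,
                                       Confirmer store) {
        Map<String, String> conf = new HashMap<>(retrieved);
        for (PendingMutation m : pending) {
            Map<String, String> candidate = new HashMap<>(conf);
            candidate.putAll(m.updates);
            boolean ok = refresh.test(candidate);
            if (ok) {
                conf = candidate; // keep the mutation in the in-memory conf
            }
            // Confirmed either way, so the entry is not replayed again.
            store.confirmMutation(m.id, ok);
        }
        return conf;
    }
}
```

Each pending mutation is applied to a copy of the configuration, so a mutation that fails the refresh leaves the in-memory configuration unchanged.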
So my proposal is:
{noformat}
1) initialize(Configuration conf, Map<String, String> schedConf);
2) retrieve which returns conf stored in table1
3) logMutation to save the new mutation in table2
4) confirmMutation(long id, boolean isValid) to increment txnid stored in table1, and persist the logged mutation if isValid==true
5) List<LogMutation> getPendingMutations(void) for getting unconfirmed mutations
{noformat}
I think we can add getConfirmedConfHistory in a later patch.
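
Written out as a Java interface, the five proposed operations might look like the sketch below. The interface name ConfigurationStoreSketch is hypothetical, Map<String, String> is used in place of Hadoop's Configuration so the example compiles on its own, and the return types of retrieve and logMutation are inferred from the discussion rather than taken from a patch.

```java
import java.util.List;
import java.util.Map;

// Hypothetical placeholder; the real fields of LogMutation are not specified here.
final class LogMutation {
    final Map<String, String> updates;

    LogMutation(Map<String, String> updates) {
        this.updates = updates;
    }
}

// Sketch of the proposed operations. Map<String, String> stands in for
// org.apache.hadoop.conf.Configuration (an assumption for self-containment).
interface ConfigurationStoreSketch {
    // 1) Set up the store with the initial scheduler configuration.
    void initialize(Map<String, String> conf, Map<String, String> schedConf);

    // 2) Return the configuration stored in table1.
    Map<String, String> retrieve();

    // 3) Save a new mutation in table2; returns the id the store assigned to it.
    long logMutation(LogMutation mutation);

    // 4) Advance the txnid stored in table1; persist the logged mutation only if isValid.
    void confirmMutation(long id, boolean isValid);

    // 5) Return mutations that were logged but not yet confirmed.
    List<LogMutation> getPendingMutations();
}
```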

If there are no concerns with this approach, I will upload a patch. Thanks!

> Create YarnConfigurationStore interface and InMemoryConfigurationStore class
> ----------------------------------------------------------------------------
>                 Key: YARN-5946
>                 URL: https://issues.apache.org/jira/browse/YARN-5946
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>         Attachments: YARN-5946.001.patch, YARN-5946-YARN-5734.002.patch
> This class provides the interface to persist YARN configurations in a backing store.

This message was sent by Atlassian JIRA

