hadoop-yarn-issues mailing list archives

From "Jonathan Hung (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5946) Create YarnConfigurationStore interface and InMemoryConfigurationStore class
Date Sat, 04 Feb 2017 01:53:51 GMT

    [ https://issues.apache.org/jira/browse/YARN-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852478#comment-15852478 ]

Jonathan Hung commented on YARN-5946:

Discussed offline with [~xgong] and [~leftnoteasy]; there are some concerns about how
this will be handled in the RM HA case. For the Derby implementation, we can have table1
containing the "last good" configuration, table2 containing the mutations (one per row,
basically an audit log), and table3 holding the "last good" transaction id (initialized to
0). Each mutation row in the audit log table will have a distinct, incrementing id. For
a mutation, the workflow will be:

# Client issues configuration mutation
# Mutation passed to capacity scheduler, which then passes it to MCM
# MCM logs the change in table2 (atomically, using Derby's commit) via the YarnConfigurationStore
interface. This change will have a txn id one greater than the value in table3.
# MCM calls cs.reinitialize with the in-memory CS configuration with the mutation applied
# On success, MCM stores the mutation in table1 and increments the txn id in table3 (both
done together atomically). On failure, MCM increments the txn id in table3 but
doesn't change table1.

Steps 3-5 should be synchronized so that mutations cannot be logged in a different order than
they are applied.
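
As a rough illustration of steps 3-5, here is a minimal in-memory sketch. It is not the proposed Derby code: the class name MutationWorkflowSketch, its fields, and the Predicate standing in for cs.reinitialize are all hypothetical, and the three tables are modeled as plain Java collections rather than committed Derby transactions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for table1, table2 and table3. */
class MutationWorkflowSketch {
    // table1: the "last good" configuration.
    final Map<String, String> lastGoodConf = new HashMap<>();
    // table2: audit log of mutations, keyed by incrementing txn id.
    final TreeMap<Long, Map<String, String>> auditLog = new TreeMap<>();
    // table3: the "last good" txn id, initialized to 0.
    long lastTxnId = 0;

    // Assumption: stands in for cs.reinitialize; returns true if the
    // candidate configuration is accepted by the scheduler.
    private final Predicate<Map<String, String>> reinitialize;

    MutationWorkflowSketch(Predicate<Map<String, String>> reinitialize) {
        this.reinitialize = reinitialize;
    }

    /** Steps 3-5, synchronized so log order matches apply order. */
    synchronized boolean mutate(Map<String, String> mutation) {
        // Step 3: log the change with txn id one greater than table3.
        long txnId = lastTxnId + 1;
        auditLog.put(txnId, new HashMap<>(mutation));

        // Step 4: reinitialize with the mutation applied to the current conf.
        Map<String, String> candidate = new HashMap<>(lastGoodConf);
        candidate.putAll(mutation);
        boolean ok = reinitialize.test(candidate);

        // Step 5: on success store the mutation in table1; in both cases
        // advance the txn id in table3. (A real store would do this pair
        // of updates in one database transaction.)
        if (ok) {
            lastGoodConf.putAll(mutation);
        }
        lastTxnId = txnId;
        return ok;
    }
}
```

In this sketch a rejected mutation still gets an audit log row and still advances the txn id, so it is never replayed later, which matches the failure branch of step 5.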

For the failover case, we can first NFS mount the Derby DB on both RMs. If a failover happens
any time after step 3, the new RM will read table3, check (via the YarnConfigurationStore
interface) whether table2 has any log entries with a txn id greater than this value, and replay
them: for each one it calls cs.reinitialize on the current configuration with the mutation
applied, and stores the mutation in table1 if it is valid.

Note that since steps 3-5 are synchronized, there should be at most one log line from table2
to replay onto table1 when recovering.
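
The recovery pass could look roughly like the sketch below. Again this is only an illustration under assumptions: FailoverRecoverySketch and its fields are hypothetical names, the tables are in-memory collections instead of the NFS-mounted Derby DB, and the Predicate stands in for cs.reinitialize. Advancing the txn id after each replayed entry, whether or not it was valid, is assumed here by analogy with step 5.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical replay of unapplied audit log entries on the new RM. */
class FailoverRecoverySketch {
    // table1: the "last good" configuration.
    final Map<String, String> lastGoodConf = new HashMap<>();
    // table2: audit log of mutations, keyed by txn id.
    final TreeMap<Long, Map<String, String>> auditLog = new TreeMap<>();
    // table3: the "last good" txn id.
    long lastTxnId = 0;

    /** Replays entries with txn id greater than table3; returns the count. */
    synchronized int recover(Predicate<Map<String, String>> reinitialize) {
        int replayed = 0;
        // tailMap(key, false) is a view of entries strictly greater than key.
        for (Map.Entry<Long, Map<String, String>> e
                : auditLog.tailMap(lastTxnId, false).entrySet()) {
            Map<String, String> candidate = new HashMap<>(lastGoodConf);
            candidate.putAll(e.getValue());
            // Store the mutation in table1 only if reinitialize accepts it.
            if (reinitialize.test(candidate)) {
                lastGoodConf.putAll(e.getValue());
            }
            // Assumption: mark the entry as handled either way, as in step 5.
            lastTxnId = e.getKey();
            replayed++;
        }
        return replayed;
    }
}
```

With steps 3-5 synchronized, the loop would normally run zero or one time; it is written as a loop only so the sketch does not depend on that invariant.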

Let me know if there are any concerns with this.

> Create YarnConfigurationStore interface and InMemoryConfigurationStore class
> ----------------------------------------------------------------------------
>                 Key: YARN-5946
>                 URL: https://issues.apache.org/jira/browse/YARN-5946
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>         Attachments: YARN-5946.001.patch, YARN-5946-YARN-5734.002.patch
> This class provides the interface to persist YARN configurations in a backing store.
