hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10901) QJM should not consider stale/failed txn available in any one of JNs.
Date Mon, 26 Sep 2016 06:21:20 GMT
Vinayakumar B created HDFS-10901:

             Summary: QJM should not consider stale/failed txn available in any one of JNs.
                 Key: HDFS-10901
                 URL: https://issues.apache.org/jira/browse/HDFS-10901
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: qjm
            Reporter: Vinayakumar B
            Assignee: Vinayakumar B
            Priority: Critical

In one of our clusters we faced an issue where NameNode restart failed because a stale/failed
txn was available in one JN but not in the others.

The scenario is:
1. Full cluster restart.
2. The startLogSegment txn (195222) was synced to only one JN and failed on the others, because they were
shutting down. On the others only the editlog file was created and the txn was not synced, so after restart
they were marked as empty.
3. The cluster restarted. During failover, this new log segment missed recovery because this
JN was slow in responding to this call.
4. Recovery on the other JNs was successful, as they had no in-progress files.
5. editlog.openForWrite() detected that txn (195222) was already available, and failed the failover.

The same steps repeated until the stale editlog in that JN was manually deleted.

Since QJM is a quorum of JNs, a txn is considered successful if it is written to a minimum quorum of JNs. Otherwise
it is considered failed.
So the same rule should be applied when selecting streams for reading.
Stale/failed txns available on fewer than a quorum of JNs should not be considered for reading.
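
To illustrate the proposed rule, here is a minimal sketch and not the actual QJM code. The class QuorumReadSketch and its methods are hypothetical names, and it assumes each JN reports the last txid it holds and that every JN holds all txns up to that txid. The txid 195221 for the two other JNs is an illustrative value, not taken from the incident.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: only txns present on a majority of JNs are readable.
public class QuorumReadSketch {

    // Smallest number of JNs that forms a majority quorum.
    static int majority(int totalJournalNodes) {
        return totalJournalNodes / 2 + 1;
    }

    // Returns the highest txid held by at least a majority of JNs,
    // or -1 if fewer than a majority of JNs reported at all.
    // lastTxIdPerJn holds one entry per responding JN (assumed input shape).
    static long highestQuorumTxId(List<Long> lastTxIdPerJn, int totalJournalNodes) {
        int quorum = majority(totalJournalNodes);
        if (lastTxIdPerJn.size() < quorum) {
            return -1L;
        }
        List<Long> sorted = new ArrayList<>(lastTxIdPerJn);
        sorted.sort(Collections.reverseOrder());
        // After a descending sort, the entry at index quorum-1 is reached
        // or exceeded by exactly 'quorum' JNs.
        return sorted.get(quorum - 1);
    }

    public static void main(String[] args) {
        // One JN holds the stale txn 195222; the other two stop earlier.
        List<Long> lastTxIds = Arrays.asList(195222L, 195221L, 195221L);
        System.out.println(highestQuorumTxId(lastTxIds, 3)); // prints 195221
    }
}
```

Under these assumptions, the txn held by a single JN falls above the quorum limit and would be ignored when selecting streams for reading.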

HDFS-10519 does similar work to consider 'durable' txns based on 'committedTxnId'. But updating
'committedTxnId' on every flush with one more RPC seems to be problematic for performance.

This message was sent by Atlassian JIRA

