hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-9678) Standby NN sometimes does not clear needRollbackFsImage
Date Thu, 21 Jan 2016 20:36:39 GMT
Kihwal Lee created HDFS-9678:

             Summary: Standby NN sometimes does not clear needRollbackFsImage
                 Key: HDFS-9678
                 URL: https://issues.apache.org/jira/browse/HDFS-9678
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Kihwal Lee

When the edit log loader sees {{OP_ROLLING_UPGRADE_START}}, it calls {{setNeedRollbackFsImage(true)}}.
This is cleared on a standby NN only by the checkpointer thread when it actually creates a
rollback image. 

On {{OP_ROLLING_UPGRADE_FINALIZE}}, the rolling upgrade is finalized, but {{needRollbackFsImage}}
is not cleared if a rollback image was never created. This results in perpetual checkpointing
by the standby NN.

The standby NN thinks it needs to checkpoint because it needs to create a rollback image,
but since it is not in upgrade mode, it creates a regular checkpoint, not a rollback image.
As a result, the flag is not cleared even after a checkpoint is created.
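
The sequence described above can be illustrated with a toy model. This is only a sketch of the reported behavior, not the actual NameNode code: the class {{RollbackFlagSketch}}, the nested {{StandbyState}}, and every method name are hypothetical, and the checkpoint trigger is reduced to the single flag from this report.

```java
// Toy model of the behavior reported in HDFS-9678. All names here are
// hypothetical; this is NOT the real FSImage/StandbyCheckpointer code.
class RollbackFlagSketch {

    static class StandbyState {
        boolean needRollbackFsImage = false;      // flag named in the report
        boolean rollingUpgradeInProgress = false; // assumed "upgrade mode" marker

        // Simplified stand-in for the edit log loader.
        void applyOp(String op) {
            switch (op) {
                case "OP_ROLLING_UPGRADE_START":
                    rollingUpgradeInProgress = true;
                    needRollbackFsImage = true;
                    break;
                case "OP_ROLLING_UPGRADE_FINALIZE":
                    rollingUpgradeInProgress = false;
                    // Reported bug: needRollbackFsImage is left as-is here.
                    break;
                default:
                    break;
            }
        }

        // Simplified trigger: only the rollback-image condition is modeled.
        boolean shouldCheckpoint() {
            return needRollbackFsImage;
        }

        // Only a rollback image clears the flag; a regular checkpoint does not.
        void doCheckpoint() {
            if (rollingUpgradeInProgress && needRollbackFsImage) {
                needRollbackFsImage = false; // rollback image created
            }
            // Otherwise a regular checkpoint is written and the flag stays set.
        }
    }

    // Replays the ops, then runs the checkpointer loop for `cycles` iterations
    // and returns how many checkpoints were taken.
    static int simulate(String[] ops, int cycles) {
        StandbyState state = new StandbyState();
        for (String op : ops) {
            state.applyOp(op);
        }
        int checkpoints = 0;
        for (int i = 0; i < cycles; i++) {
            if (state.shouldCheckpoint()) {
                state.doCheckpoint();
                checkpoints++;
            }
        }
        return checkpoints;
    }

    public static void main(String[] args) {
        String[] startOnly = {"OP_ROLLING_UPGRADE_START"};
        String[] startThenFinalize = {
            "OP_ROLLING_UPGRADE_START", "OP_ROLLING_UPGRADE_FINALIZE"
        };
        System.out.println("start only:          " + simulate(startOnly, 5));
        System.out.println("start then finalize: " + simulate(startThenFinalize, 5));
    }
}
```

In this model, a standby that loads the start op and checkpoints while still in upgrade mode clears the flag after one rollback image. A standby that loads both ops before its checkpointer runs never clears it, so every cycle produces another regular checkpoint.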

The standby keeps checkpointing back-to-back, and the resulting checkpoints are uploaded to the active
constantly. We noticed this because of increased sync time on the active.

This message was sent by Atlassian JIRA
