hadoop-hdfs-issues mailing list archives

From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3464) BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
Date Sun, 27 May 2012 05:31:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284107#comment-13284107
] 

Rakesh R commented on HDFS-3464:
--------------------------------


As discussed with Ivan, simply deleting the inprogress znode here may not be safe when an
exception occurs. If the exception was a NodeExistsException from EditLogLedgerMetadata#write,
another namenode has already created an inprogress znode with the same start txid, so deleting
it would lose data. I'd prefer an approach where a node deletes the inprogress znode on
exception only if it holds the lock.
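
To make the rule concrete, here is a minimal sketch of the cleanup logic, not the actual BKJM code. FakeStore, inprogressPath, the "/ledgers/inprogress_" path and the failAfterCreate flag are all hypothetical names made up for illustration; FakeStore is an in-memory stand-in and does not use the ZooKeeper or BookKeeper APIs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: every name here is hypothetical and nothing
// in this class calls the real ZooKeeper, BookKeeper or BKJM APIs.
class InprogressCleanupSketch {

    /** Hypothetical in-memory model of the znodes plus a single writer lock. */
    static final class FakeStore {
        private final Set<String> znodes = ConcurrentHashMap.newKeySet();
        private final String lockHolder;

        FakeStore(String lockHolder) { this.lockHolder = lockHolder; }

        /** Returns false if the znode already exists (the NodeExistsException case). */
        boolean create(String path) { return znodes.add(path); }

        void delete(String path) { znodes.remove(path); }

        boolean exists(String path) { return znodes.contains(path); }

        boolean holdsLock(String nodeId) { return nodeId.equals(lockHolder); }
    }

    /** Assumed path layout, chosen only for this example. */
    static String inprogressPath(long startTxId) {
        return "/ledgers/inprogress_" + startTxId;
    }

    /** Returns true if the segment was started, false otherwise. */
    static boolean startLogSegment(FakeStore store, String nodeId,
                                   long startTxId, boolean failAfterCreate) {
        String path = inprogressPath(startTxId);
        if (!store.create(path)) {
            // Another namenode already wrote this znode: it is not ours to delete.
            return false;
        }
        try {
            if (failAfterCreate) {
                throw new IllegalStateException("simulated failure after znode creation");
            }
            return true;
        } catch (IllegalStateException e) {
            // Clean up our own znode, but only while we hold the lock.
            if (store.holdsLock(nodeId)) {
                store.delete(path);
            }
            return false;
        }
    }
}
```

With a store locked by "nn1", a second namenode that hits the already-exists case leaves the znode untouched, while "nn1" failing after its own create removes what it wrote.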

-Rakesh
                
> BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3464
>                 URL: https://issues.apache.org/jira/browse/HDFS-3464
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 2.0.1-alpha, 3.0.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>
> HDFS-3058 will clean currentLedgers on exception.
> In BookKeeperJournalManager, startLogSegment() deletes the corresponding 'inprogress_ledger'
> ledger on exception, but leaves the 'inprogress_x' ledger metadata in ZooKeeper. When the
> other node becomes active, it sees the 'inprogress_x' znode and calls recoverLastTxId(),
> which throws an exception because the 'inprogress_ledger' ledger no longer exists.
> {noformat}
> Caused by: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException
> 	at org.apache.bookkeeper.client.BookKeeper.openLedger(BookKeeper.java:393)
> 	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:493)
> {noformat}
> As per the discussion in HDFS-3058, we will handle the comment as part of this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
