hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2691) HA: Tests and fixes for pipeline targets and replica recovery
Date Tue, 24 Jan 2012 00:49:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191701#comment-13191701 ]

Todd Lipcon commented on HDFS-2691:
-----------------------------------

- Fixed #1, #2, #4, #7, #8, #9, #10 as suggested

bq. In BlockManager#blockReceivedAndDeleted do you really think it's reasonable to only warn here?
I tend to think of ERROR as reserved for things like data loss or corruption. So I left it as
a warning, but added an assert here so that it will fail if we come across this during
development or QA, where assertions are often enabled.
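
For illustration, the warn-plus-assert pattern described above might look roughly like the sketch below. This is not the code from the patch: the class name, the handleReport method, and the "unregistered node" condition are hypothetical stand-ins, and java.util.logging is used only to keep the example self-contained.

```java
import java.util.logging.Logger;

// Hypothetical sketch of "log a warning, but fail fast when assertions are on".
final class WarnAndAssert {
    private static final Logger LOG = Logger.getLogger(WarnAndAssert.class.getName());

    // Returns true if the report was processed, false if it was ignored.
    // The nodeKnown flag is an invented stand-in for whatever condition
    // the real code treats as unexpected.
    static boolean handleReport(String blockId, boolean nodeKnown) {
        if (!nodeKnown) {
            // Production behavior: warn and carry on.
            LOG.warning("Got report for block " + blockId + " from an unregistered node");
            // Development/QA behavior: with -ea this throws AssertionError.
            assert false : "unexpected report for block " + blockId;
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("known node processed: " + handleReport("blk_1", true));
        try {
            System.out.println("unknown node processed: " + handleReport("blk_2", false));
        } catch (AssertionError e) {
            System.out.println("assertion tripped: " + e.getMessage());
        }
    }
}
```

Run with plain "java", the unexpected case only logs a warning and is ignored; run with "java -ea", the same case throws an AssertionError, which is what would surface the problem in development or QA runs.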

bq. I realize it's consistent with DataNode#notifyNamenodeReceivedBlock and DataNode#notifyNamenodeDeletedBlock,
but it seems like they should all be ERROR
Changed all to ERROR

- Re #6, I didn't rename the RPC call or the ReceivedDeletedBlockInfo wire structure, since
that would make merging really complicated (especially since this is one of the things that
differs between trunk and 23, plus it's involved with the PB merge). I think we should file a
follow-up JIRA to do the rename to something like IncrementalBlockInfo.
                
> HA: Tests and fixes for pipeline targets and replica recovery
> -------------------------------------------------------------
>
>                 Key: HDFS-2691
>                 URL: https://issues.apache.org/jira/browse/HDFS-2691
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-2691.txt, hdfs-2691.txt
>
>
> Currently there are some TODOs around pipeline/recovery code in the HA branch. For example,
> commitBlockSynchronization only gets sent to the active NN which may have failed over by that
> point. So, we need to write some tests here and figure out what the correct behavior is.
> Another related area is the treatment of targets in the pipeline. When a pipeline is
> created, the active NN adds the "expected locations" to the BlockInfoUnderConstruction, but
> the DN identifiers aren't logged with the OP_ADD. So after a failover, the BlockInfoUnderConstruction
> will have no targets and I imagine replica recovery would probably trigger some issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
