hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4278) TestDatanodeDeath failed occasionally
Date Wed, 15 Oct 2008 21:14:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639965#action_12639965 ]

Sameer Paranjpye commented on HADOOP-4278:
------------------------------------------

@Dhruba: I agree that this is not a blocker for 0.19. Out-of-phase thread deaths don't
typically occur in real deployments, and we haven't yet observed this condition occurring
frequently on our grids.

However, I think there are real deficiencies in error recovery for HDFS writes:
# the client does not correctly detect which link in the write pipeline failed
# the client tries to initiate block recovery from the dead Datanode, fails to do so, and
causes the write to fail. This is mostly a consequence of the first point, but it can also
occur if the recovery primary fails following a link failure.

Ideally, a writer should fail only if
# the writer itself dies for some reason
# the writer loses all its replicas
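
As a rough illustration of the second condition, here is a minimal Java sketch. It is not
the actual DFSClient code: the class and method names are hypothetical, and Datanodes are
modelled as plain strings. It assumes the client already knows which Datanodes failed,
drops them from the pipeline, and fails the write only when no replica is left. The
recovery primary is chosen from the survivors, so recovery is never started from a dead
Datanode.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Hadoop.
final class PipelineRecoverySketch {

    // Returns the Datanodes still usable for the write, in pipeline order.
    // Throws only when every replica in the pipeline has been lost.
    static List<String> survivingPipeline(List<String> pipeline, Set<String> failed)
            throws IOException {
        List<String> survivors = new ArrayList<String>();
        for (String node : pipeline) {
            if (!failed.contains(node)) {
                survivors.add(node);
            }
        }
        if (survivors.isEmpty()) {
            throw new IOException("all replicas lost; the write cannot continue");
        }
        return survivors;
    }

    // Picks the recovery primary from the survivors only (assumption: the
    // first surviving Datanode is an acceptable choice).
    static String recoveryPrimary(List<String> survivors) {
        return survivors.get(0);
    }
}
```

With a pipeline of dn1, dn2, dn3 and dn1 marked as failed, the sketch continues the write
on dn2 and dn3 with dn2 as the recovery primary. It gives up only if all three have failed.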

This should be the subject of a different JIRA but I think we should spend some energy making
it happen. For this issue, the best course might be to disable testSimple until we have a
complete recovery story.


> TestDatanodeDeath failed occasionally
> -------------------------------------
>
>                 Key: HADOOP-4278
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4278
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> TestDatanodeDeath keeps failing occasionally.  For example, see
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3365/testReport/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

