hadoop-hdfs-issues mailing list archives

From "LiuLei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4273) Problem in DFSInputStream read retry logic may cause early failure
Date Fri, 03 Jan 2014 08:14:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861325#comment-13861325 ]

LiuLei commented on HDFS-4273:
------------------------------

Hi Binglin, I have another case.

I use HBase-0.94 and CDH-4.3.1.
When a RegionServer reads data from the local datanode and that datanode dies, the local
datanode is added to deadNodes and the RegionServer reads from a remote datanode instead.
But when the local datanode comes back to life, the RegionServer still reads from the remote
datanode, which reduces the performance of the RegionServer. We need a way to remove the
local datanode from deadNodes once it is live again.
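
One possible shape for such a mechanism is sketched below. It is only an illustration under stated assumptions, not existing HDFS code: the class name ExpiringDeadNodes, its methods, and the expiry interval are all hypothetical. The idea is that an entry in the dead-node set stops counting once it is older than a configured interval, so a recovered local datanode gets tried again.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of HDFS): a dead-node set whose entries
 * expire, so a node that has recovered is eventually retried.
 * Not thread-safe; callers would need their own synchronization.
 */
class ExpiringDeadNodes {
    private final long expiryMillis;
    // datanode id -> time in ms at which the node was marked dead
    private final Map<String, Long> markedAt = new HashMap<>();

    ExpiringDeadNodes(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Record that a read from this node failed at time nowMillis. */
    void markDead(String nodeId, long nowMillis) {
        markedAt.put(nodeId, nowMillis);
    }

    /** A node counts as dead only while its entry is younger than expiryMillis. */
    boolean isDead(String nodeId, long nowMillis) {
        Long t = markedAt.get(nodeId);
        if (t == null) {
            return false; // never marked dead
        }
        if (nowMillis - t >= expiryMillis) {
            markedAt.remove(nodeId); // entry expired: give the node another chance
            return false;
        }
        return true;
    }
}
```

Passing the current time in explicitly keeps the sketch easy to test. A real client would more likely probe the datanode or use a monotonic clock, and would still have to handle the case where the node is retried while it is still down.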

> Problem in DFSInputStream read retry logic may cause early failure
> ------------------------------------------------------------------
>
>                 Key: HDFS-4273
>                 URL: https://issues.apache.org/jira/browse/HDFS-4273
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>            Priority: Minor
>         Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch,
HDFS-4273.v5.patch, HDFS-4273.v6.patch, TestDFSInputStream.java
>
>
> Assume the following call logic:
> {noformat} 
> readWithStrategy()
>   -> blockSeekTo()
>   -> readBuffer()
>      -> reader.doRead()
>      -> seekToNewSource() adds currentNode to deadNodes, hoping to get a different datanode
>         -> blockSeekTo()
>            -> chooseDataNode()
>               -> block missing, clears deadNodes and picks currentNode again
>         seekToNewSource() returns false
>      readBuffer() re-throws the exception and quits the loop
> readWithStrategy() gets the exception and may fail the read call before MaxBlockAcquireFailures attempts have been tried.
> {noformat} 
> Some issues with this logic:
> 1. seekToNewSource() is broken because it may clear deadNodes midway through the call.
> 2. The variable "int retries=2" in readWithStrategy seems to conflict with MaxBlockAcquireFailures;
> should it be removed?
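
The first issue can be reproduced in a toy model. The sketch below is not the DFSInputStream code and not what the attached patches do. ReplicaChooser, choose(), and the exclude parameter are hypothetical names, and datanodes are plain strings. It assumes only the behaviour described in the report: when every replica is in deadNodes, the set is cleared and selection is retried. An explicit exclude argument is one way to keep the node that just failed from being picked again after the clear.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of replica selection; hypothetical, not the HDFS implementation. */
class ReplicaChooser {
    final Set<String> deadNodes = new HashSet<>();

    /**
     * Returns the first replica that is neither dead nor equal to exclude.
     * If none qualifies, clears deadNodes once and tries again, mirroring the
     * "block missing, clear deadNodes" step in the report.
     * Returns null if no replica can be chosen. Pass exclude = null to
     * model the behaviour without any exclusion.
     */
    String choose(List<String> replicas, String exclude) {
        for (int attempt = 0; attempt < 2; attempt++) {
            for (String r : replicas) {
                if (!deadNodes.contains(r) && !r.equals(exclude)) {
                    return r;
                }
            }
            deadNodes.clear(); // every candidate was dead or excluded
        }
        return null;
    }
}
```

With replicas "a" and "b" both in deadNodes, choose(replicas, null) clears the set and returns "a", the node that may have just failed. choose(replicas, "a") returns "b" instead.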



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
