hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Liang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4273) Problem in DFSInputStream read retry logic may cause early failure
Date Wed, 11 Dec 2013 09:33:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845218#comment-13845218
] 

Liang Xie commented on HDFS-4273:
---------------------------------

{code}
     -> seekToNewSource() add currentNode to deadnode, wish to get a different datanode
        -> blockSeekTo()
           -> chooseDataNode()
              -> block missing, clear deadNodes and pick the currentNode again
        seekToNewSource() return false
{code}

i checked codebase, it shows :
{code}
  private synchronized boolean seekToBlockSource(long targetPos)
                                                 throws IOException {
    currentNode = blockSeekTo(targetPos);
    return true;
  }
{code}
It could not return false, seems the original description is stale ? 

> Problem in DFSInputStream read retry logic may cause early failure
> ------------------------------------------------------------------
>
>                 Key: HDFS-4273
>                 URL: https://issues.apache.org/jira/browse/HDFS-4273
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>            Priority: Minor
>         Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch,
HDFS-4273.v5.patch, TestDFSInputStream.java
>
>
> Assume the following call logic
> {noformat} 
> readWithStrategy()
>   -> blockSeekTo()
>   -> readBuffer()
>      -> reader.doRead()
>      -> seekToNewSource() add currentNode to deadnode, wish to get a different datanode
>         -> blockSeekTo()
>            -> chooseDataNode()
>               -> block missing, clear deadNodes and pick the currentNode again
>         seekToNewSource() return false
>      readBuffer() re-throw the exception quit loop
> readWithStrategy() got the exception,  and may fail the read call before tried MaxBlockAcquireFailures.
> {noformat} 
> some issues of the logic:
> 1. seekToNewSource() logic is broken because it may clear deadNodes in the middle.
> 2. the variable "int retries=2" in readWithStrategy seems have conflict with MaxBlockAcquireFailures,
should it be removed?



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Mime
View raw message