hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11303) Hedged read might hang infinitely if read data from all DN failed
Date Tue, 13 Jun 2017 14:13:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047945#comment-16047945
] 

Wei-Chiu Chuang commented on HDFS-11303:
----------------------------------------

Hi [~zhangchen], thanks for filing the jira and posting the patch.
I'd like to help reviewing the patch. The fix itself looks good. Here's my quick comments,
mostly cosmetic:

1.  I am not so sure about the test: you mentioned the client had timeouts connecting to DNs,
but the test throws ChecksumException -- it is not clear to me if client exhibit the same
symptom in both scenarios. Perhaps you can use {{DFSClientFaultInjector.readFromDatanodeDelay}}
to insert a delay instead?

2. Could you add test timeout limit?  For example a 10 second timeout: {{\@Test(timeout=100000)}}

3. It appears to me that the test expects to throw BlockMissingException.  Instead of the
following:
{code}
	    } catch (BlockMissingException e) {
	      assertTrue(true);
	    }
{code} Would you mind update the test and use ExpectedException to assert that the exception
is expected?

4. 
{code}
	        if (true) {
	          System.out.println("-------------- throw Checksum Exception");
	          throw new ChecksumException("ChecksumException test", 100);
	        }
{code}
Please remove {{if(true)}} and please use DFSClient.LOG instead of System.out.println for
log printing

5. There's a slight code conflict due to HDFS-11708. Please rebase the patch.

> Hedged read might hang infinitely if read data from all DN failed 
> ------------------------------------------------------------------
>
>                 Key: HDFS-11303
>                 URL: https://issues.apache.org/jira/browse/HDFS-11303
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>         Attachments: HDFS-11303-001.patch, HDFS-11303-001.patch, HDFS-11303-002.patch,
HDFS-11303-002.patch
>
>
> Hedged read will read from a DN first, if timeout, then read other DNs simultaneously.
> If read all DN failed, this bug will cause the future-list not empty(the first timeout
request left in list), and hang in the loop infinitely



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message