hadoop-hdfs-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-10293) StripedFileTestUtil#readAll flaky
Date Thu, 14 Apr 2016 19:11:26 GMT

     [ https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HDFS-10293:
    Attachment: HDFS-10293.000.patch

The code is as follows:

  static int readAll(FSDataInputStream in, byte[] buf) throws IOException {
    int readLen = 0;
    int ret;
    while ((ret = in.read(buf, readLen, buf.length - readLen)) >= 0 &&
        readLen <= buf.length) {
      readLen += ret;
    }
    return readLen;
  }

If {{readLen}} equals {{buf.length}}, then {{buf.length - readLen}} is zero, and
{{in.read()}} simply returns zero without reading from the stream. In this case, no exception
is thrown, both loop conditions stay true, and the code is stuck in the while-loop.

One possible fix is to tighten the condition to {{(ret = in.read(buf, readLen, buf.length -
readLen)) > 0 && readLen < buf.length}}. A probably better fix is to use
{{IOUtils.readFully()}}, which throws an IOException if it hits a premature EOF on the input stream;
see the v0 patch.
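
As an illustration of the first option only, here is a minimal sketch of a loop that cannot spin once the buffer is full. It is not the content of the v0 patch. To stay self-contained it takes a plain {{java.io.InputStream}} instead of {{FSDataInputStream}}, and the class name {{ReadAllSketch}} is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class name; a sketch, not the HDFS-10293 patch.
public class ReadAllSketch {

  // Reads until buf is full or the stream reaches EOF.
  // Returns the number of bytes actually read.
  static int readAll(InputStream in, byte[] buf) throws IOException {
    int readLen = 0;
    // Test the remaining space first, so read() is never called with a
    // zero length (which returns 0 and would never end the original loop).
    while (readLen < buf.length) {
      int ret = in.read(buf, readLen, buf.length - readLen);
      if (ret < 0) {
        break; // EOF before the buffer was filled
      }
      readLen += ret;
    }
    return readLen;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3, 4, 5};
    // Buffer exactly as large as the data: the case that hung before.
    System.out.println(readAll(new ByteArrayInputStream(data), new byte[5]));
    // Buffer larger than the data: the loop stops at EOF.
    System.out.println(readAll(new ByteArrayInputStream(data), new byte[8]));
  }
}
```

Unlike {{IOUtils.readFully()}}, this variant returns a short count on early EOF instead of throwing, so a caller would still have to compare the result against the expected length.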

> StripedFileTestUtil#readAll flaky
> ---------------------------------
>                 Key: HDFS-10293
>                 URL: https://issues.apache.org/jira/browse/HDFS-10293
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding, test
>    Affects Versions: 3.0.0
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10293.000.patch
> The flaky test helper method causes several unit tests to fail intermittently. For example,
> the {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
> timed out in a recent run (see [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
> which can be easily reproduced locally.
> From debugging the code, chances are that the helper method is stuck in an infinite loop.
> We need a fix to make the test robust.
