hadoop-hdfs-issues mailing list archives

From "Masatake Iwasaki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2043) TestHFlush failing intermittently
Date Thu, 05 May 2016 03:18:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271828#comment-15271828 ]

Masatake Iwasaki commented on HDFS-2043:

In both cases (IOException and ClosedByInterruptException), I can see the message "Got expected
exception during close" in the test logs. The exception was thrown by the second {{stm.close()}},
in the catch block below.

      try {
        stm.close();
        // If we made it past the close(), then that means that the ack made it back
        // from the pipeline before we got to the wait() call. In that case we should
        // still have interrupted status.
        assertTrue(Thread.interrupted());
      } catch (InterruptedIOException ioe) {
        System.out.println("Got expected exception during close");
        // If we got the exception, we shouldn't have interrupted status anymore.
        assertFalse(Thread.interrupted());

        // Now do a successful close.
        stm.close();
      }
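
The comments in the test about interrupted status rely on standard Java behaviour: a blocking wait that throws InterruptedException clears the thread's interrupt flag, while {{Thread.currentThread().isInterrupted()}} only reads it. Below is a minimal standalone sketch of that behaviour. The class and method names are hypothetical, and it assumes (not verified here) that the wait inside DataStreamer behaves like a plain {{Object#wait}}.

```java
public class InterruptStatusSketch {

  /**
   * Hypothetical helper, not part of Hadoop.
   * Returns {flagBeforeWait, waitThrewInterruptedException, flagAfterCatch}.
   */
  static boolean[] demo() {
    Object lock = new Object();

    // Set the interrupt flag on the current thread, as the test does before close().
    Thread.currentThread().interrupt();

    // isInterrupted() reads the flag without clearing it.
    boolean before = Thread.currentThread().isInterrupted();

    boolean threw = false;
    synchronized (lock) {
      try {
        // wait() throws immediately when the flag is already set,
        // and clears the flag as it throws.
        lock.wait(10);
      } catch (InterruptedException e) {
        threw = true;
      }
    }

    // Thread.interrupted() reads and clears the flag; it is already clear here.
    boolean after = Thread.interrupted();
    return new boolean[] {before, threw, after};
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("flag before wait: " + r[0]);
    System.out.println("wait threw InterruptedException: " + r[1]);
    System.out.println("flag after catch: " + r[2]);
  }
}
```

This is why the test asserts one interrupt state when close() returns normally and the opposite state when it throws.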

The caught ioe points to {{DFSOutputStream#closeImpl}}. (I obtained the stack trace by modifying
TestHFlush in my local environment to log it.)

java.io.InterruptedIOException: Interrupted while waiting for data to be acknowledged by pipeline
        at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:771)
        at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:697)
        at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:778)
        at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:755)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
        at org.apache.hadoop.hdfs.TestHFlush.testHFlushInterrupted(TestHFlush.java:480)

testHFlushInterrupted expects the second {{stm.close()}} to succeed, but it does not. The
underlying streamer thread has already been closed, because {{closeThreads(true)}} is called in
the finally block of {{DFSOutputStream#closeImpl}}.

    } finally {
      // Failures may happen when flushing data.
      // Streamers may keep waiting for the new block information.
      // Thus need to force closing these threads.
      // Don't need to call setClosed() because closeThreads(true)
      // calls setClosed() in the finally block.
      closeThreads(true);
    }

I think we should just catch IOException on the second {{stm.close()}} and ignore it. The
final check in the test should fail if there is a problem.

      // verify that entire file is good
      AppendTestUtil.checkFullFile(fs, p, 4, fileContents,
          "Failed to deal with thread interruptions", false);
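
A rough, standalone sketch of the proposed handling follows. It is not the actual patch. FakeStream is a hypothetical stand-in for the real stream that only models the behaviour described above: the first close() may be interrupted, the streamer is torn down in a finally block, and a later close() therefore fails. closeTolerantly is likewise a made-up helper name.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InterruptedIOException;

public class InterruptedCloseSketch {

  /** Hypothetical stand-in for the HDFS output stream; not the real DFSOutputStream. */
  static class FakeStream implements Closeable {
    private boolean streamerClosed = false;

    @Override
    public void close() throws IOException {
      if (streamerClosed) {
        // Assumption: once the streamer is gone, a later close() fails.
        throw new IOException("streamer already closed");
      }
      try {
        // Thread.interrupted() clears the flag, like an interrupted wait would.
        if (Thread.interrupted()) {
          throw new InterruptedIOException("Interrupted while waiting for ack");
        }
      } finally {
        // Models closeThreads(true) running in the finally block of closeImpl.
        streamerClosed = true;
      }
    }
  }

  /** Closes the stream; returns true if the first close() was interrupted. */
  static boolean closeTolerantly(Closeable stm) throws IOException {
    try {
      stm.close();
      return false;
    } catch (InterruptedIOException ioe) {
      System.out.println("Got expected exception during close");
      try {
        // Second close: may fail because the streamer was already shut down.
        stm.close();
      } catch (IOException e) {
        // Ignored on purpose; a later content check is expected to catch real problems.
      }
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    Thread.currentThread().interrupt();
    boolean interrupted = closeTolerantly(new FakeStream());
    System.out.println("first close interrupted: " + interrupted);
  }
}
```

In the real test, the {{AppendTestUtil.checkFullFile}} call that follows the close would remain the check that catches genuine data problems.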

> TestHFlush failing intermittently
> ---------------------------------
>                 Key: HDFS-2043
>                 URL: https://issues.apache.org/jira/browse/HDFS-2043
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Aaron T. Myers
>            Assignee: Lin Yiqun
>         Attachments: HDFS-2043.002.patch, HDFS-2043.003.patch, HDFS.001.patch
> I can't reproduce this failure reliably, but it seems like TestHFlush has been failing
> intermittently, with the frequency increasing of late.
> Note the following two pre-commit test runs from different JIRAs where TestHFlush seems
> to have failed spuriously:
> https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/
