hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1529) Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock
Date Fri, 04 Feb 2011 21:05:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990740#comment-12990740 ]

Todd Lipcon commented on HDFS-1529:

Hi Hairong. The difference between the handling in the two methods is this:

- In waitForAckedSeqno, the user thread is simply blocked on an hflush() or a flush before
close. In this case throwing an exception does not leave the stream in an inconsistent state,
and it matches what one would expect: interrupting a thread blocked on IO causes the thread
to throw an exception.
- In waitAndQueueCurrentPacket(), if we return early from that wait without an exception,
we still leave the stream in a "correct" state. Keeping the interrupt status flagged means
the user will see the exception on the next operation.
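For illustration, here is a minimal sketch of the two styles described above. The class and
method names loosely follow DFSOutputStream but are hypothetical; the real HDFS code differs.

```java
import java.io.InterruptedIOException;

// Hypothetical sketch, not the actual DFSOutputStream implementation.
class AckWaiter {
    private final Object ackQueue = new Object();
    private long lastAckedSeqno = -1;

    // Style used by waitForAckedSeqno: the blocked hflush()/close() caller
    // surfaces the interrupt immediately as an InterruptedIOException.
    // The stream state is still consistent, so throwing is safe here.
    void waitForAckedSeqno(long seqno) throws InterruptedIOException {
        synchronized (ackQueue) {
            while (lastAckedSeqno < seqno) {
                try {
                    ackQueue.wait();
                } catch (InterruptedException ie) {
                    throw new InterruptedIOException(
                        "Interrupted while waiting for acks");
                }
            }
        }
    }

    // Style used by waitAndQueueCurrentPacket(): swallow the
    // InterruptedException, finish the wait, and re-set the interrupt flag
    // on exit so the caller sees the interruption on its next operation.
    void waitAndQueueCurrentPacket() {
        synchronized (ackQueue) {
            boolean interrupted = false;
            while (queueIsFull()) {
                try {
                    ackQueue.wait();
                } catch (InterruptedException ie) {
                    interrupted = true; // remember, keep waiting
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt(); // preserve status
            }
        }
    }

    // Called by the responder thread when an ack arrives.
    void ack(long seqno) {
        synchronized (ackQueue) {
            lastAckedSeqno = seqno;
            ackQueue.notifyAll();
        }
    }

    private boolean queueIsFull() { return false; } // placeholder for the sketch
}
```

Note that `Object.wait()` clears the interrupt status when it throws, which is why the second
style must explicitly restore it with `Thread.currentThread().interrupt()`.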

Since you gave +1 I'll commit as is, and if we want to clean this up we can do it separately.

> Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock
> ------------------------------------------------------------------------
>                 Key: HDFS-1529
>                 URL: https://issues.apache.org/jira/browse/HDFS-1529
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.22.0
>         Attachments: Test.java, hdfs-1529.txt, hdfs-1529.txt, hdfs-1529.txt
> In HDFS-895 the handling of interrupts during hflush/close was changed to preserve interrupt
> status. This ends up creating an infinite loop in waitForAckedSeqno if the waiting thread
> gets interrupted, since Object.wait() has the surprising semantic that it does not give up
> the lock even momentarily if the thread is already in interrupted state at the beginning of
> the call.
> We should decide what the correct behavior is here: if a thread is interrupted while it is
> calling hflush() or close(), should we (a) throw an exception, perhaps InterruptedIOException,
> (b) ignore it, or (c) wait for the flush to finish but preserve interrupt status on exit?
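The wait() semantic the description refers to can be seen in isolation with a small,
self-contained demo (plain Java, no HDFS code): with the interrupt flag already set on entry,
Object.wait() throws InterruptedException immediately, so a retry loop that restores the flag
and calls wait() again would spin forever while holding the lock, starving the thread that
would otherwise deliver the ack.

```java
public class WaitInterruptDemo {
    public static void main(String[] args) {
        final Object lock = new Object();
        Thread.currentThread().interrupt(); // interrupt status set before wait()

        long start = System.nanoTime();
        synchronized (lock) {
            try {
                lock.wait(5000); // would normally block for up to 5 seconds
            } catch (InterruptedException ie) {
                // Throws almost instantly instead of blocking. Restoring the
                // flag here and looping back to wait() would spin forever
                // while holding 'lock', so no other thread could make progress.
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("wait() threw after " + elapsedMs + " ms");
            }
        }
    }
}
```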

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

