hadoop-common-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-11780) Prevent IPC reader thread death
Date Tue, 27 Sep 2016 13:57:21 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-11780:
    Attachment: HADOOP-11780.patch

This is the patch we've been using internally, modified per HADOOP-13657 to terminate the process
if a reader encounters an unrecoverable runtime exception (e.g., a JDK bug).  No test is included
because the failure mode is difficult to instrument.
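
For readers without the patch at hand, here is a minimal sketch of the general idea: wrap the reader's loop so that an unexpected runtime exception reaches a fatal-error handler instead of silently killing the thread. This is not the attached patch. The names ReaderGuardSketch, FatalHandler, startReader and runAndCapture are hypothetical, and in a real server the handler would presumably exit the JVM rather than just record the exception.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not code from HADOOP-11780.patch.
final class ReaderGuardSketch {

    /** Hypothetical hook invoked when the reader loop fails unrecoverably. */
    interface FatalHandler {
        void onFatal(Throwable t);
    }

    /** Starts a reader thread whose failures are reported instead of lost. */
    static Thread startReader(Runnable readLoop, FatalHandler handler) {
        Thread t = new Thread(() -> {
            try {
                readLoop.run();
            } catch (Throwable e) {
                // Without this catch the thread would simply die and the
                // connections it services would hang.
                handler.onFatal(e);
            }
        }, "reader-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Runs a loop to completion and returns the failure seen, or null. */
    static Throwable runAndCapture(Runnable readLoop) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = startReader(readLoop, seen::set);
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable failure = runAndCapture(() -> {
            throw new IllegalStateException("simulated unrecoverable failure");
        });
        System.out.println("handler saw: " + failure);
    }
}
```

The handler is passed in as a parameter so that a test can observe the failure without taking the process down, while production code could supply one that terminates.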

> Prevent IPC reader thread death
> -------------------------------
>                 Key: HADOOP-11780
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11780
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HADOOP-11780.patch
> Reader threads can die due to a race condition with the responder thread.  If the server's
> IPC handler cannot send a response in one write, it delegates sending the rest of the response
> to the responder thread.
> The race occurs when the responder thread hits an exception while writing to the socket.  The
> responder closes the socket, which wakes up the reader polling on that socket.  If a {{CancelledKeyException}},
> which is a runtime exception, is thrown, the reader dies.  All connections serviced by that
> reader are then in limbo until the client times out, if it ever does.  New connections play roulette
> as to whether they are assigned to a defunct reader.
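
The exception described above can be reproduced in isolation with plain java.nio. The sketch below is not taken from the Hadoop IPC server. It cancels a SelectionKey and then queries its readiness, which is roughly what the reader ends up doing after the responder closes the socket. The class ReaderRaceSketch and its methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical illustration only; not code from the Hadoop IPC server.
final class ReaderRaceSketch {

    /** Readiness check that tolerates a key cancelled by another thread. */
    static boolean isAcceptableSafely(SelectionKey key) {
        try {
            // isValid() alone is not sufficient: the key can be cancelled
            // between this check and the readiness query that follows.
            return key.isValid() && key.isAcceptable();
        } catch (CancelledKeyException e) {
            return false;
        }
    }

    /** Returns true if querying readiness on a cancelled key throws. */
    static boolean cancelledKeyThrows() {
        try (Selector selector = Selector.open();
             ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_ACCEPT);
            key.cancel(); // stands in for the responder closing the socket
            try {
                key.isAcceptable();
                return false;
            } catch (CancelledKeyException e) {
                return true; // unchecked, so an unguarded loop would die here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled key throws: " + cancelledKeyThrows());
    }
}
```

Because CancelledKeyException is unchecked, the compiler does not force the reader loop to handle it, which is one plausible reason the problem can go unnoticed until a reader thread has already died.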
