hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6762) exception while doing RPC I/O closes channel
Date Fri, 04 Jun 2010 01:17:57 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875406#action_12875406 ]

Todd Lipcon commented on HADOOP-6762:
-------------------------------------

Hey Sam. Sorry, I misunderstood your point earlier. You're definitely right that interrupting
one thread shouldn't take down the RPC connection.

Adding yet another thread to IPC seems a bit complicated, though. What if we added a
flag to Connection saying "no more sends on this connection", so that interrupting the sender
still kills the connection but lets currently pending calls complete? Then, once the queue has
been quiesced, the connection shuts down like it does today.
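
To make the flag idea concrete, here is a minimal sketch. It is not Hadoop's actual Client.Connection code: the class DrainingConnection and all of its method names are hypothetical, and calls are modeled as plain integer ids. It only shows the intended state machine: refuse new sends once the flag is set, let pending calls finish, and close when nothing is outstanding.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of a connection with a "no more sends" flag. */
final class DrainingConnection {
    private final Set<Integer> pendingCalls = new HashSet<>();
    private boolean noMoreSends = false;
    private boolean closed = false;

    /** Registers a call as sent; returns false once sends are refused. */
    synchronized boolean trySend(int callId) {
        if (noMoreSends || closed) {
            return false;
        }
        pendingCalls.add(callId);
        return true;
    }

    /** What an interrupted sender would trigger instead of closing outright. */
    synchronized void stopSending() {
        noMoreSends = true;
        closeIfQuiesced();
    }

    /** Called when the response for a pending call arrives. */
    synchronized void onResponse(int callId) {
        pendingCalls.remove(callId);
        closeIfQuiesced();
    }

    /** Closes only when sends are stopped and no call is still outstanding. */
    private void closeIfQuiesced() {
        if (noMoreSends && pendingCalls.isEmpty()) {
            closed = true;
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

With this shape, a call that was already sent before stopSending() still gets its response, and only later senders are turned away and would need a fresh connection.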

The issue I'm thinking of is that this solves the interrupt problem but doesn't solve the general
case of exceptions during sendparam. The user can still have a Writable which, e.g., throws an
NPE, and we're back to the same problem of lack of isolation between writers.
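
One possible way to get that isolation, sketched here under assumptions rather than taken from any patch on this issue, is to serialize each parameter into a private buffer first and touch the shared stream only if serialization succeeded. The Param interface below is a hypothetical stand-in for a user-supplied Writable, and BufferedSend is an illustrative name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class BufferedSend {
    /** Hypothetical stand-in for a user-supplied Writable. */
    interface Param {
        void write(DataOutput out) throws IOException;
    }

    /**
     * Serializes the parameter into a private buffer, then copies the bytes
     * to the shared stream. Returns false, leaving the shared stream
     * untouched, if the parameter's own write() fails.
     */
    static boolean send(Param param, OutputStream shared) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            param.write(new DataOutputStream(buffer));
        } catch (RuntimeException | IOException e) {
            // Failure is confined to this caller; nothing partial was sent.
            return false;
        }
        synchronized (shared) {
            shared.write(buffer.toByteArray());
            shared.flush();
        }
        return true;
    }
}
```

The cost is an extra copy per call. The benefit is that an NPE thrown by one caller's parameter never leaves a half-written frame on a stream other callers depend on.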

> exception while doing RPC I/O closes channel
> --------------------------------------------
>
>                 Key: HADOOP-6762
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6762
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: sam rash
>            Assignee: sam rash
>         Attachments: hadoop-6762-1.txt, hadoop-6762-2.txt, hadoop-6762-3.txt, hadoop-6762-4.txt, hadoop-6762-6.txt
>
>
> If a single process creates two distinct FileSystem instances for the same NN using FileSystem.newInstance(),
> and one of them issues a close(), the leasechecker thread is interrupted.  This interrupt
> races with the RPC namenode.renew() and can cause a ClosedByInterruptException.  This closes
> the underlying channel, and the other filesystem, which shares the connection, will get errors.
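
The underlying JDK behavior can be reproduced without Hadoop. The sketch below uses a java.nio Pipe as a stand-in for the shared RPC socket; the class and method names are illustrative only. A thread whose interrupt status is set attempts a blocking write, which closes the channel and raises ClosedByInterruptException. A second writer on the same channel then fails with ClosedChannelException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.Pipe;

final class InterruptClosesChannel {
    /**
     * Returns true if the interrupted writer got ClosedByInterruptException
     * and a later writer on the same channel got ClosedChannelException.
     */
    static boolean secondWriterSeesClosedChannel() throws Exception {
        Pipe pipe = Pipe.open();
        final Pipe.SinkChannel shared = pipe.sink();
        final boolean[] interruptedWrite = {false};

        // Stand-in for the lease checker: interrupted while doing channel I/O.
        Thread leaseChecker = new Thread(() -> {
            Thread.currentThread().interrupt();
            try {
                shared.write(ByteBuffer.wrap(new byte[] {1}));
            } catch (ClosedByInterruptException e) {
                interruptedWrite[0] = true;
            } catch (IOException e) {
                // Any other I/O failure leaves interruptedWrite false.
            }
        });
        leaseChecker.start();
        leaseChecker.join();

        // Stand-in for the other filesystem sharing the connection.
        boolean closedForOthers = false;
        try {
            shared.write(ByteBuffer.wrap(new byte[] {2}));
        } catch (ClosedChannelException e) {
            closedForOthers = true;
        } finally {
            pipe.source().close();
        }
        return interruptedWrite[0] && closedForOthers;
    }
}
```

Setting the interrupt status before the write makes the demonstration deterministic. In the reported bug the interrupt arrives while the RPC is in flight, so it only shows up as a race.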

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

