hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11802) DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
Date Thu, 23 Apr 2015 19:38:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509652#comment-14509652 ]

Colin Patrick McCabe commented on HADOOP-11802:

bq. Extra imports in DataXceiver, though really you probably meant to add the @Private annotation
and just forgot.


bq. Add a newline in the DSW C file change, break the new POLLHUP check to the next line (like
the other if you changed)


bq. Adding a link to the webpage reference (along with mentioning portability / Cygwin) would
also be nice, since I wondered why we didn't have to catch yet more poll errors.

I added a comment explaining why POLLHUP

bq. Typo "repsponse" in DataXceiver


bq. We typically have used a singleton to do fault injection, would be good to be consistent
since it doesn't look like we need per-instance injection. See DataNodeFaultInjector, probably
the best home.

OK.  That would eliminate the need to make the DataXceiver class public, which would be nice.
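
For readers unfamiliar with the singleton style of fault injection mentioned above, here is a minimal sketch. All names in it (FaultInjectorSketch, ShmRequestHandlerSketch, beforeSendShmResponse) are hypothetical and chosen for illustration; this is not the actual DataNodeFaultInjector code. The idea is that production code calls a no-op hook on a shared instance, and a test swaps in a subclass that throws.

```java
import java.io.IOException;

// Hypothetical names throughout; this is not Hadoop's DataNodeFaultInjector.
class FaultInjectorSketch {
    // Single shared instance. Tests replace it; production keeps the no-op.
    private static FaultInjectorSketch instance = new FaultInjectorSketch();

    static FaultInjectorSketch get() { return instance; }
    static void set(FaultInjectorSketch injector) { instance = injector; }

    // Hook called by production code. Does nothing by default.
    void beforeSendShmResponse() throws IOException { }
}

// Stand-in for the code path under test.
class ShmRequestHandlerSketch {
    String handleRequest() {
        try {
            // The hook is reached through the singleton, so the handler
            // class itself needs no extra public surface for testing.
            FaultInjectorSketch.get().beforeSendShmResponse();
            return "ok";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

A test would call FaultInjectorSketch.set(...) with an overriding subclass before exercising the handler, and restore the previous instance afterwards.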

bq. Good fix on the javadoc for allocSlot, but mind adding the blockId param doc too for full

Hey, I'm trying to make incremental changes here :)  Fixed.

bq. The Throwable catch, it subsumes the IOException catch, so can we just delete it? I think
the more specific name of the exception will be printed by its toString.
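
To illustrate the point about toString, a small sketch follows. The class and method names (CatchSketch, Action, runAndDescribe) are made up for the example and do not come from the patch. It shows that a single Throwable catch still reports the concrete exception class, because Throwable#toString includes the class name.

```java
import java.io.IOException;

// Hypothetical helper, only to show what a single Throwable catch reports.
class CatchSketch {
    interface Action { void run() throws Exception; }

    static String runAndDescribe(Action action) {
        try {
            action.run();
            return "no error";
        } catch (Throwable t) {
            // String concatenation calls t.toString(), which yields
            // "<fully qualified class name>: <message>".
            return "Failed: " + t;
        }
    }

    public static void main(String[] args) {
        // prints "Failed: java.io.IOException: broken pipe"
        System.out.println(runAndDescribe(() -> { throw new IOException("broken pipe"); }));
    }
}
```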


bq. Param indentation in TestSCCache#checkNumberOfSeg... is inconsistent, I think we typically
do double indent?


bq. TestSCCache, the comment "Remove the failure injector" should be moved up a few lines

Let me just get rid of that, since the log message says the same thing.

> DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
> ---------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-11802
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11802
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Eric Payne
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-11802.001.patch, HADOOP-11802.002.patch, HADOOP-11802.003.patch
> In {{DataXceiver#requestShortCircuitShm}}, we attempt to recover from some errors by
closing the {{DomainSocket}}.  However, this violates the invariant that the domain socket
should never be closed when it is being managed by the {{DomainSocketWatcher}}.  Instead,
we should call {{shutdown}} on the {{DomainSocket}}.  When this bug hits, it terminates the
{{DomainSocketWatcher}} thread.
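
The invariant described above can be illustrated with a toy model. SocketSketch and WatcherSketch are hypothetical stand-ins, not the Hadoop DomainSocket and DomainSocketWatcher classes, and the real watcher's behaviour is more involved. The sketch only assumes what the description states: a managed socket may be shut down by other code, but only the watcher may close it, and a socket closed behind the watcher's back kills the watcher loop.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a socket; not the Hadoop DomainSocket class.
class SocketSketch {
    private boolean shutDown = false;
    private boolean closed = false;

    // Stops further I/O but leaves the socket open for its owner to close.
    void shutdown() { shutDown = true; }
    void close() { closed = true; }
    boolean isShutDown() { return shutDown; }
    boolean isClosed() { return closed; }
}

// Toy stand-in for the watcher; not the Hadoop DomainSocketWatcher class.
class WatcherSketch {
    private final List<SocketSketch> watched = new ArrayList<>();

    void add(SocketSketch s) { watched.add(s); }
    int size() { return watched.size(); }

    // One pass of the watcher loop over a snapshot of the watched sockets.
    void pollOnce() {
        for (SocketSketch s : new ArrayList<>(watched)) {
            if (s.isClosed()) {
                // Invariant violated: someone else closed a managed socket.
                throw new IllegalStateException("managed socket closed outside the watcher");
            }
            if (s.isShutDown()) {
                // Safe path: the watcher notices the shutdown and does the close itself.
                watched.remove(s);
                s.close();
            }
        }
    }
}
```

In this model, calling shutdown() on a watched socket lets the next pollOnce() clean it up, while calling close() directly makes pollOnce() throw, which mirrors the thread termination reported in this issue.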
