hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-3894) QJM: testRecoverAfterDoubleFailures can be flaky due to IPC client caching
Date Thu, 13 Sep 2012 23:02:08 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-3894.

       Resolution: Fixed
    Fix Version/s: QuorumJournalManager (HDFS-3077)
     Hadoop Flags: Reviewed

Committed to branch, thx for review
> QJM: testRecoverAfterDoubleFailures can be flaky due to IPC client caching
> --------------------------------------------------------------------------
>                 Key: HDFS-3894
>                 URL: https://issues.apache.org/jira/browse/HDFS-3894
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: QuorumJournalManager (HDFS-3077)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: QuorumJournalManager (HDFS-3077)
>         Attachments: hdfs-3894.txt
> TestQJMWithFaults.testRecoverAfterDoubleFailures fails very occasionally. Looking into
> it, the issue seems to be that, by random chance, an IPC server port can be
> reused between two different iterations of the test loop. The client then picks up and
> re-uses the existing IPC connection to the old server. However, the old server was shut down
> and restarted, so the old IPC connection is stale (i.e. disconnected). This causes the new client
> to get an EOF when it sends the "format()" call.
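
The failure mode described above can be reproduced in miniature without any networking. The following is a minimal sketch, not Hadoop's IPC client: every class and method name in it (FakeServer, FakeConnection, ConnectionCache, formatWithCache) is hypothetical, and the cache is assumed to be keyed by port alone. The evict() step is shown only as one conceivable remedy for illustration; the message does not say what the committed patch does.

```java
import java.io.EOFException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a connection cache keyed by port.
// None of these types exist in Hadoop; they only illustrate the race.
class StaleIpcCacheSketch {

    /** Stand-in for an IPC server bound to a port. */
    static final class FakeServer {
        final int port;
        private boolean running = true;

        FakeServer(int port) { this.port = port; }

        void shutdown() { running = false; }

        boolean isRunning() { return running; }
    }

    /** Stand-in for a client connection tied to one server instance. */
    static final class FakeConnection {
        private final FakeServer server;

        FakeConnection(FakeServer server) { this.server = server; }

        String call(String method) throws EOFException {
            // A connection to a server that has been shut down is stale.
            if (!server.isRunning()) {
                throw new EOFException("stale connection during " + method + "()");
            }
            return method + " ok";
        }
    }

    /** Cache that (by assumption) identifies a connection by port only. */
    static final class ConnectionCache {
        private final Map<Integer, FakeConnection> byPort = new HashMap<>();

        FakeConnection get(FakeServer server) {
            // If the port was seen before, the old connection is returned,
            // even when 'server' is a new instance on the same port.
            return byPort.computeIfAbsent(server.port, p -> new FakeConnection(server));
        }

        void evict(int port) { byPort.remove(port); }
    }

    /** Sends a "format" call through the cache; reports "EOF" on a stale connection. */
    static String formatWithCache(ConnectionCache cache, FakeServer server) {
        try {
            return cache.get(server).call("format");
        } catch (EOFException e) {
            return "EOF";
        }
    }

    public static void main(String[] args) {
        ConnectionCache cache = new ConnectionCache();

        FakeServer first = new FakeServer(12345);
        System.out.println(formatWithCache(cache, first));   // first iteration succeeds

        first.shutdown();                                     // server torn down
        FakeServer second = new FakeServer(12345);            // port reused by chance
        System.out.println(formatWithCache(cache, second));   // cached stale connection

        cache.evict(12345);                                   // illustrative remedy only
        System.out.println(formatWithCache(cache, second));   // fresh connection
    }
}
```

In this model the second call fails because the cache lookup succeeds on the reused port and hands back a connection bound to the server instance that was already shut down, which mirrors the EOF on "format()" reported above.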

