hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2915) HA: TestFailureOfSharedDir.testFailureOfSharedDir() has race condition
Date Wed, 08 Feb 2012 17:42:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203763#comment-13203763 ]

Aaron T. Myers commented on HDFS-2915:
--------------------------------------

bq. I think after addressing HDFS-2914, this problem should not be there right?

Not necessarily. Depends upon how HDFS-2914 is addressed.

I think a fine solution for this issue would be to simply set {{dfs.namenode.resource.check.interval}}
very high in the test, so that the resource check does not fire while the test is running. It defaults to 5 seconds.
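
A minimal sketch of that suggestion follows. To keep it runnable without Hadoop on the classpath, it uses java.util.Properties as a stand-in for the test's configuration object; the class name, the helper name, and the one-hour value are hypothetical, and the assumption that the property is expressed in milliseconds should be checked against the branch under test.

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

public class ResourceCheckIntervalSketch {
    // Property name taken from the comment above.
    static final String CHECK_INTERVAL_KEY = "dfs.namenode.resource.check.interval";

    // Hypothetical helper: builds test settings in which the resource check
    // interval is far longer than any plausible test run.
    static Properties buildTestConf() {
        Properties conf = new Properties();
        // Assumption: the value is in milliseconds. One hour is an arbitrary
        // "very high" choice compared with the 5-second default.
        long intervalMs = TimeUnit.HOURS.toMillis(1);
        conf.setProperty(CHECK_INTERVAL_KEY, Long.toString(intervalMs));
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = buildTestConf();
        System.out.println(CHECK_INTERVAL_KEY + "=" + conf.getProperty(CHECK_INTERVAL_KEY));
    }
}
```

In the real test the same key would presumably be set on the Hadoop configuration object before the cluster is started, so that the setting is in effect before the shared edits dir is deleted.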
                
> HA: TestFailureOfSharedDir.testFailureOfSharedDir() has race condition
> ----------------------------------------------------------------------
>
>                 Key: HDFS-2915
>                 URL: https://issues.apache.org/jira/browse/HDFS-2915
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>            Priority: Minor
>
> The test deletes the shared edits dir to simulate a failure. Then it calls rollEditLogs()
> to trigger the deleted dir to be used and fail with an IOException. Unfortunately, deleting
> the shared dir can put the NN in safe mode due to lack of space. This causes a SafeModeException
> to be thrown when rollEditLogs() is called. This exception is caught as an IOException in
> the test, but the associated assert in the catch block fails.
> This always happens in the debugger because the delay from stepping through lets the
> safe mode change happen before rollEditLogs() gets called.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
