hadoop-hdfs-issues mailing list archives

From "Xiaoyu Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-347) Fix : testCloseContainerViaStandaAlone fails sometimes
Date Tue, 14 Aug 2018 17:22:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580110#comment-16580110 ]

Xiaoyu Yao commented on HDDS-347:

Thanks [~GeLiXin] for the details. Given the order of the state change and the log
output, should we first use GenericTestUtils.waitFor to wait until
logCapturer.getOutput() contains the expected message, and only then validate the
isContainerClosed state? That way the wait is deterministic, with minimal unnecessary sleep.
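
A rough, self-contained sketch of that ordering is below. It does not use the real Hadoop classes: waitFor here is a hypothetical stand-in that only mimics the polling style of GenericTestUtils.waitFor, and FakeContainer and the "Container closed" message are invented for illustration. The point is just that polling for the log message (the last thing to happen) makes the later state check safe.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

public class WaitForLogSketch {

  /** Hypothetical stand-in for a polling helper: retry check until true or time out. */
  static void waitFor(Supplier<Boolean> check, int checkEveryMillis, int waitForMillis)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + waitForMillis;
    while (!check.get()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  /** Hypothetical container: the state flips to closed first, the log line appears later. */
  static class FakeContainer {
    final AtomicBoolean closed = new AtomicBoolean(false);
    final StringBuffer log = new StringBuffer(); // thread-safe stand-in for captured log output

    void closeAsync() {
      new Thread(() -> {
        closed.set(true);               // state change happens first
        try {
          Thread.sleep(200);            // simulates a slow container file update
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        log.append("Container closed"); // log message is written last
      }).start();
    }
  }

  /** Wait for the log message first, then validate the state. */
  static boolean closeAndVerify() throws Exception {
    FakeContainer container = new FakeContainer();
    container.closeAsync();
    waitFor(() -> container.log.toString().contains("Container closed"), 50, 5000);
    return container.closed.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(closeAndVerify());
  }
}
```

Because the log line is emitted after the state change, the state check after the wait cannot race with the close, and no fixed sleep is needed.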

> Fix : testCloseContainerViaStandaAlone fails sometimes
> ------------------------------------------------------
>                 Key: HDDS-347
>                 URL: https://issues.apache.org/jira/browse/HDDS-347
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: LiXin Ge
>            Assignee: LiXin Ge
>            Priority: Major
>             Fix For: 0.2.1
>         Attachments: HDDS-347.000.patch
> This issue was found in the automated Jenkins unit test run for HDDS-265.
>  The container life cycle is: Open -> Closing -> Closed. This test submits the container
> close command and waits for the container state to become *not equal to open*. However,
> even when that condition is satisfied, the container may still be in the process of
> closing, so the LOG message that is printed after the container is closed sometimes
> cannot be found, and the test fails.
> {code:java|title=KeyValueContainer.java|borderStyle=solid}
>     try {
>       writeLock();
>       containerData.closeContainer();
>       File containerFile = getContainerFile();
>       // update the new container data to .container File
>       updateContainerFile(containerFile);
>     } catch (StorageContainerException ex) {
> {code}
> Looking at the code above, the container state changes from CLOSING to CLOSED in the
> first step, and the remaining *updateContainerFile* call may take hundreds of milliseconds,
> so even modifying the test logic to wait for the *CLOSED* state will not guarantee that
> the test succeeds.
>  There are two ways to fix this:
>  1. Remove the one of the two checks that depends on the LOG.
>  2. If we have to preserve the double check, wait for the *CLOSED* state and then
> sleep for a while to wait for the LOG message to appear.
>  Patch 000 is based on the second way.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
