hadoop-hdfs-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HDDS-2045) Partially started compose cluster left running
Date Tue, 27 Aug 2019 18:10:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-2045?focusedWorklogId=302284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302284 ]

ASF GitHub Bot logged work on HDDS-2045:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Aug/19 18:09
            Start Date: 27/Aug/19 18:09
    Worklog Time Spent: 10m 
      Work Description: adoroszlai commented on pull request #1358: HDDS-2045. Partially started compose cluster left running
URL: https://github.com/apache/hadoop/pull/1358
 
 
   ## What changes were proposed in this pull request?
   
   If any container in the sample cluster [fails to start](https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L24),
all successfully started containers are left running.  This [prevents](https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L59)
any further acceptance tests from completing normally.  This is only a minor inconvenience,
since the acceptance test run as a whole fails either way.
   
   This change makes sure the cluster is stopped if startup fails.
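
The control flow behind this could look roughly like the sketch below. It is not the code from the pull request: `start_docker_env` and `wait_for_datanodes` are names taken from this description, while `stop_docker_env`, `compose_up`, `compose_down` and the `FAKE_FAILURE` switch are hypothetical stand-ins so that the example runs without Docker.

```shell
#!/usr/bin/env bash
# Sketch only. compose_up and compose_down are hypothetical placeholders for
# the real docker-compose calls; they just print what they would do.
compose_up()   { echo "Creating cluster"; }
compose_down() { echo "Stopping cluster"; }

# Stand-in health check. Setting FAKE_FAILURE=1 (a hypothetical switch)
# simulates datanodes that never become healthy.
wait_for_datanodes() {
  [ "${FAKE_FAILURE:-0}" != "1" ]
}

# Hypothetical teardown helper.
stop_docker_env() { compose_down; }

start_docker_env() {
  # Run the startup steps in order; the chain stops at the first failure.
  if ! { compose_up && wait_for_datanodes; }; then
    # Startup failed part-way: stop whatever did start, then report failure.
    stop_docker_env
    return 1
  fi
}
```

A caller would then skip the tests when startup fails, for example with `start_docker_env || exit 1`, knowing that no containers from the failed attempt are still running.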
   
   https://issues.apache.org/jira/browse/HDDS-2045
   
   ## How was this patch tested?
   
   Temporarily added fake failures in `start_docker_env` and `wait_for_datanodes`, and verified
that the cluster is stopped:
   
   ```
   $ ./test.sh
   Removing network ozone_default
   WARNING: Network ozone_default not found.
   Creating network "ozone_default" with the default driver
   Creating ozone_scm_1      ... done
   Creating ozone_datanode_1 ... done
   Creating ozone_datanode_2 ... done
   Creating ozone_datanode_3 ... done
   Creating ozone_om_1       ... done
   0 datanode is up and healthy (until now)
   Stopping ozone_datanode_1 ... done
   Stopping ozone_datanode_3 ... done
   Stopping ozone_om_1       ... done
   Stopping ozone_datanode_2 ... done
   Stopping ozone_scm_1      ... done
   Removing ozone_datanode_1 ... done
   Removing ozone_datanode_3 ... done
   Removing ozone_om_1       ... done
   Removing ozone_datanode_2 ... done
   Removing ozone_scm_1      ... done
   Removing network ozone_default
   ```
   
   Verified that the test succeeds without the fake failure.
   
   ```
   $ ./test.sh
   Removing network ozone_default
   WARNING: Network ozone_default not found.
   Creating network "ozone_default" with the default driver
   Creating ozone_scm_1      ... done
   Creating ozone_om_1       ... done
   Creating ozone_datanode_1 ... done
   Creating ozone_datanode_2 ... done
   Creating ozone_datanode_3 ... done
   0 datanode is up and healthy (until now)
   3 datanodes are up and registered to the scm
   ==============================================================================
   ozone-auditparser
   ==============================================================================
   ozone-auditparser.Auditparser :: Smoketest ozone cluster startup
   ==============================================================================
   Initiating freon to generate data                                     | PASS |
   ------------------------------------------------------------------------------
   Testing audit parser                                                  | PASS |
   ------------------------------------------------------------------------------
   ozone-auditparser.Auditparser :: Smoketest ozone cluster startup      | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   ozone-auditparser                                                     | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-auditparser-om.xml
   ==============================================================================
   ozone-basic :: Smoketest ozone cluster startup
   ==============================================================================
   Check webui static resources                                          | PASS |
   ------------------------------------------------------------------------------
   Start freon testing                                                   | PASS |
   ------------------------------------------------------------------------------
   ozone-basic :: Smoketest ozone cluster startup                        | PASS |
   2 critical tests, 2 passed, 0 failed
   2 tests total, 2 passed, 0 failed
   ==============================================================================
   Output:  /tmp/smoketest/ozone/result/robot-ozone-ozone-basic-scm.xml
   Stopping ozone_datanode_1 ... done
   Stopping ozone_datanode_3 ... done
   Stopping ozone_datanode_2 ... done
   Stopping ozone_om_1       ... done
   Stopping ozone_scm_1      ... done
   Removing ozone_datanode_1 ... done
   Removing ozone_datanode_3 ... done
   Removing ozone_datanode_2 ... done
   Removing ozone_om_1       ... done
   Removing ozone_scm_1      ... done
   Removing network ozone_default
   ```
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 302284)
    Remaining Estimate: 0h
            Time Spent: 10m

> Partially started compose cluster left running
> ----------------------------------------------
>
>                 Key: HDDS-2045
>                 URL: https://issues.apache.org/jira/browse/HDDS-2045
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: docker, test
>            Reporter: Doroszlai, Attila
>            Assignee: Doroszlai, Attila
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If any container in the sample cluster [fails to start|https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L24],
> all successfully started containers are left running.  This [prevents|https://github.com/elek/ozone-ci/blob/5c64f77f3ab64aed0826d8f40991fe621f843efd/pr/pr-hdds-2026-p4f6m/acceptance/output.log#L59]
> any further acceptance tests from completing normally.  This is only a minor inconvenience,
> since the acceptance test run as a whole fails either way.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

