cloudstack-issues mailing list archives

From "Sangeetha Hariharan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CLOUDSTACK-5499) VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes which have snapshots that are in "CreatedOnPrimary" state.
Date Fri, 13 Dec 2013 21:22:07 GMT
Sangeetha Hariharan created CLOUDSTACK-5499:
-----------------------------------------------

             Summary: VMware - When NFS was down for about 12 hours and then brought back
up again, snapshots are not being attempted for some of the volumes which have snapshots that
are in "CreatedOnPrimary" state.
                 Key: CLOUDSTACK-5499
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5499
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.3.0
         Environment: Build from 4.3
            Reporter: Sangeetha Hariharan
            Priority: Critical
             Fix For: 4.3.0


VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are
not being attempted for some of the volumes which have snapshots that are in "CreatedOnPrimary"
state.

Setup:
Advanced Zone with 2 ESXi 5.1 hosts.

Steps to reproduce the problem:

1. Deploy 5 VMs on each of the hosts, so we start with 11 VMs.
2. Start concurrent snapshots for the ROOT volumes of all the VMs.
3. Shut down the secondary storage server while the snapshots are in progress.
4. Bring the secondary storage server back up after 12 hours.

The following issues are seen in this run:

1. Snapshots that are in progress report failures only after 12 hours, even
though backup.snapshot.wait is set to 12 hours.

2. New snapshot requests that were executed while the NFS server was down do not report failure
immediately. In my case, I see that such requests eventually succeeded when the NFS server
was brought up. Is this the expected behavior? Should we not expect them to fail right away,
instead of holding on to such active sessions?

3. Some of the snapshot failures resulted in snapshots that are in "CreatedOnPrimary" state.
For such volumes, snapshots are not being attempted at all, even though the NFS server
was brought up.

Volumes in this state are: 16, 17, 18, 22.

There are instances where I have seen snapshots being scheduled and succeeding even when
the previous state was "CreatedOnPrimary". Why are we able to schedule snapshots in such
cases, and sometimes not in other cases?

mysql> select volume_id,status,created from snapshots where volume_id=18;
+-----------+------------------+---------------------+
| volume_id | status           | created             |
+-----------+------------------+---------------------+
|        18 | Destroyed        | 2013-12-12 23:24:14 |
|        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
|        18 | BackedUp         | 2013-12-13 01:53:38 |
|        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
+-----------+------------------+---------------------+



mysql> select volume_id,status,created from snapshots;
+-----------+------------------+---------------------+
| volume_id | status           | created             |
+-----------+------------------+---------------------+
|        22 | Destroyed        | 2013-12-12 23:24:13 |
|        21 | Destroyed        | 2013-12-12 23:24:13 |
|        20 | Destroyed        | 2013-12-12 23:24:14 |
|        19 | Destroyed        | 2013-12-12 23:24:14 |
|        18 | Destroyed        | 2013-12-12 23:24:14 |
|        17 | Destroyed        | 2013-12-12 23:24:14 |
|        16 | Destroyed        | 2013-12-12 23:24:14 |
|        14 | Destroyed        | 2013-12-12 23:24:15 |
|        25 | Destroyed        | 2013-12-12 23:24:15 |
|        24 | Destroyed        | 2013-12-12 23:24:15 |
|        23 | Destroyed        | 2013-12-12 23:24:15 |
|        22 | CreatedOnPrimary | 2013-12-12 23:53:38 |
|        21 | Destroyed        | 2013-12-12 23:53:38 |
|        20 | Destroyed        | 2013-12-12 23:53:38 |
|        19 | Destroyed        | 2013-12-12 23:53:39 |
|        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
|        17 | CreatedOnPrimary | 2013-12-12 23:53:40 |
|        16 | CreatedOnPrimary | 2013-12-12 23:53:40 |
|        14 | Destroyed        | 2013-12-12 23:53:40 |
|        25 | Destroyed        | 2013-12-12 23:53:41 |
|        24 | Destroyed        | 2013-12-12 23:53:41 |
|        23 | Destroyed        | 2013-12-12 23:53:42 |
|        21 | Destroyed        | 2013-12-13 00:53:37 |
|        19 | Destroyed        | 2013-12-13 00:53:38 |
|        22 | BackedUp         | 2013-12-13 01:53:37 |
|        21 | Destroyed        | 2013-12-13 01:53:38 |
|        20 | Destroyed        | 2013-12-13 01:53:38 |
|        19 | Destroyed        | 2013-12-13 01:53:38 |
|        18 | BackedUp         | 2013-12-13 01:53:38 |
|        17 | BackedUp         | 2013-12-13 01:53:38 |
|        16 | BackedUp         | 2013-12-13 01:53:39 |
|        14 | Destroyed        | 2013-12-13 01:53:39 |
|        25 | Destroyed        | 2013-12-13 01:53:39 |
|        24 | Destroyed        | 2013-12-13 01:53:39 |
|        23 | Destroyed        | 2013-12-13 01:53:40 |
|        22 | CreatedOnPrimary | 2013-12-13 03:53:37 |
|        21 | Destroyed        | 2013-12-13 03:53:38 |
|        20 | Destroyed        | 2013-12-13 03:53:38 |
|        19 | Destroyed        | 2013-12-13 03:53:38 |
|        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
|        17 | CreatedOnPrimary | 2013-12-13 03:53:38 |
|        16 | CreatedOnPrimary | 2013-12-13 03:53:39 |
|        14 | Destroyed        | 2013-12-13 03:53:39 |
|        24 | Destroyed        | 2013-12-13 08:53:37 |
|        25 | Destroyed        | 2013-12-13 09:53:37 |
|        23 | Destroyed        | 2013-12-13 10:53:37 |
|        21 | Destroyed        | 2013-12-13 16:53:37 |
|        20 | Destroyed        | 2013-12-13 16:53:38 |
|        19 | Destroyed        | 2013-12-13 16:53:38 |
|        14 | Destroyed        | 2013-12-13 16:53:38 |
|        21 | BackedUp         | 2013-12-13 18:53:37 |
|        20 | BackedUp         | 2013-12-13 18:53:38 |
|        19 | BackedUp         | 2013-12-13 18:53:38 |
|        14 | BackedUp         | 2013-12-13 18:53:38 |
|        25 | BackedUp         | 2013-12-13 18:53:38 |
|        24 | BackedUp         | 2013-12-13 18:53:38 |
|        23 | BackedUp         | 2013-12-13 18:53:39 |
|        21 | BackedUp         | 2013-12-13 19:53:37 |
|        20 | BackedUp         | 2013-12-13 19:53:38 |
|        19 | BackedUp         | 2013-12-13 19:53:38 |
|        14 | BackedUp         | 2013-12-13 19:53:38 |
|        25 | BackedUp         | 2013-12-13 19:53:38 |
|        24 | BackedUp         | 2013-12-13 19:53:39 |
|        23 | BackedUp         | 2013-12-13 19:53:39 |
+-----------+------------------+---------------------+
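
The affected volumes can also be picked out of the listing above with a query instead of by eye.
The following is only a minimal sketch: it loads a subset of the rows shown above into an
in-memory SQLite table (an assumption made so the example is self-contained; the real data
lives in the MySQL "cloud" database) and selects the volumes whose most recent snapshot row
is in "CreatedOnPrimary" state. The names ROWS, STUCK_QUERY and find_stuck_volumes are
hypothetical and not part of CloudStack. The query shows which volumes are in this state; it
does not explain why no further snapshots are attempted for them.

```python
import sqlite3

# Subset of rows copied from the snapshots listing above:
# full history for volumes 16, 17, 18, 22 and the last three rows for 21 and 25.
ROWS = [
    (22, "Destroyed", "2013-12-12 23:24:13"),
    (18, "Destroyed", "2013-12-12 23:24:14"),
    (17, "Destroyed", "2013-12-12 23:24:14"),
    (16, "Destroyed", "2013-12-12 23:24:14"),
    (22, "CreatedOnPrimary", "2013-12-12 23:53:38"),
    (18, "CreatedOnPrimary", "2013-12-12 23:53:39"),
    (17, "CreatedOnPrimary", "2013-12-12 23:53:40"),
    (16, "CreatedOnPrimary", "2013-12-12 23:53:40"),
    (22, "BackedUp", "2013-12-13 01:53:37"),
    (18, "BackedUp", "2013-12-13 01:53:38"),
    (17, "BackedUp", "2013-12-13 01:53:38"),
    (16, "BackedUp", "2013-12-13 01:53:39"),
    (22, "CreatedOnPrimary", "2013-12-13 03:53:37"),
    (18, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (17, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (16, "CreatedOnPrimary", "2013-12-13 03:53:39"),
    (25, "Destroyed", "2013-12-13 09:53:37"),
    (21, "Destroyed", "2013-12-13 16:53:37"),
    (21, "BackedUp", "2013-12-13 18:53:37"),
    (25, "BackedUp", "2013-12-13 18:53:38"),
    (21, "BackedUp", "2013-12-13 19:53:37"),
    (25, "BackedUp", "2013-12-13 19:53:38"),
]

# For each volume, find its newest snapshot row, then keep only the
# volumes where that newest row is still in CreatedOnPrimary state.
STUCK_QUERY = """
SELECT s.volume_id, s.status, s.created
FROM snapshots s
JOIN (SELECT volume_id, MAX(created) AS latest
      FROM snapshots
      GROUP BY volume_id) m
  ON s.volume_id = m.volume_id AND s.created = m.latest
WHERE s.status = 'CreatedOnPrimary'
ORDER BY s.volume_id
"""


def find_stuck_volumes(rows):
    """Return the volume ids whose latest snapshot is CreatedOnPrimary."""
    conn = sqlite3.connect(":memory:")
    try:
        # Simplified three-column table; the real snapshots table has more columns.
        conn.execute(
            "CREATE TABLE snapshots (volume_id INTEGER, status TEXT, created TEXT)"
        )
        conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(STUCK_QUERY)]
    finally:
        conn.close()


stuck = find_stuck_volumes(ROWS)
print(stuck)
```

The timestamps are stored as text in "YYYY-MM-DD HH:MM:SS" form, so MAX(created) compares
them correctly as strings. The same SELECT should also run unchanged against the MySQL
snapshots table, although that has not been verified against this setup.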




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
