cloudstack-users mailing list archives

From Alena Prokharchyk <Alena.Prokharc...@citrix.com>
Subject Re: Reply: System VMs restarted on a disabled cluster
Date Sat, 21 Jul 2012 00:06:59 GMT
On 7/20/12 3:11 PM, "Evan Miller" <Evan.Miller@citrix.com> wrote:

>Hi Alena:
>
>I finally was able to delete the cluster. However, it required
>the following expected and unusual steps:
>
>1. To my surprise, one of the system VMs had only been stopped.
>I swear that I previously viewed the system VMs from the CSMS
>GUI and clearly thought I saw "no data".
>
>2. So, I destroyed that particular system VM.
>
>3. Then, I double-checked: No Storage from CSMS GUI tab.
>
>4. Then, I double-checked: No Instances.
>
>5. I double-checked: Primary Storage was in maintenance mode.
>
>6. I attempted to delete Primary Storage, but it failed.
>
>7. So, I went into the MySQL database and saw several recent volumes
>and just manually deleted all of them from MySQL:

We never advise doing that; don't modify the DB unless there is no other
way to recover. It can lead to all kinds of problems. First, the volumes
will continue to exist on the backend, and other CloudStack resources may
still reference them (snapshots, for instance). Besides, if a table has a
removed field, don't delete the records; just mark them as removed.
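
A read-only check is usually enough to see which volume is holding the
pool. A minimal sketch, assuming that volumes with a NULL removed field
are the ones the pool deletion check counts. It uses an in-memory SQLite
table as a stand-in for the MySQL volumes table (an assumption made so
the example is self-contained); the helper name blocking_volumes is
hypothetical, and the rows copy the id, pool_id, state and removed values
of the first three volumes in the listing below.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL `volumes` table (assumption);
# only the columns relevant to this check are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE volumes (id INTEGER, pool_id INTEGER, state TEXT, removed TEXT)"
)
conn.executemany(
    "INSERT INTO volumes VALUES (?, ?, ?, ?)",
    [
        (1, 201, "Destroy", "2012-07-20 19:32:30"),
        (2, 201, "Destroy", None),
        (3, None, "Destroy", "2012-07-20 19:33:29"),
    ],
)


def blocking_volumes(conn, pool_id):
    """Return ids of volumes in the pool that are not yet marked removed."""
    rows = conn.execute(
        "SELECT id FROM volumes WHERE pool_id = ? AND removed IS NULL ORDER BY id",
        (pool_id,),
    )
    return [row[0] for row in rows]


print(blocking_volumes(conn, 201))  # prints [2]
```

Nothing is modified here; the query only reports what is still attached
to the pool.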


In your case, all the volumes except one had a non-null removed field,
which means they were successfully removed and did not cause the primary
storage deletion to fail.
Only one of them (id=2) had a null removed field; I'm not sure what caused
it to be stuck in the Destroy state. To force deletion of a storage pool,
next time call the deleteStoragePool API command with the forced=true
option (I'm not sure it's available in the UI; call the API if not).
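
A minimal sketch of building such a call, assuming the usual CloudStack
request signing (parameters sorted by name, values URL-encoded, the
lower-cased query string signed with HMAC-SHA1 and the secret key, then
base64- and URL-encoded). The function name signed_url and both key
values are placeholders, not real credentials; the host and pool id are
the ones from your listStoragePools output.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_url(endpoint, params, api_key, secret_key):
    """Build a signed API URL (sketch of HMAC-SHA1 request signing)."""
    query = dict(params, apikey=api_key, response="json")
    # Sort by parameter name (case-insensitive) and URL-encode each value.
    pairs = sorted(query.items(), key=lambda kv: kv[0].lower())
    encoded = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    # The signature is computed over the lower-cased query string.
    digest = hmac.new(
        secret_key.encode(), encoded.lower().encode(), hashlib.sha1
    ).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{endpoint}?{encoded}&signature={signature}"


url = signed_url(
    "http://10.217.5.192:8080/client/api",
    {
        "command": "deleteStoragePool",
        "id": "c9c0319f-33f0-3494-9ada-4d7a2f1dafd4",
        "forced": "true",
    },
    api_key="PLACEHOLDER_API_KEY",        # placeholder, not a real key
    secret_key="PLACEHOLDER_SECRET_KEY",  # placeholder, not a real key
)
print(url)
```

The printed URL can then be requested with any HTTP client.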


>
>
>mysql> select * from volumes;
>+----+------------+-----------+---------+--------------+-------------+----
>-------+--------+--------------------------------------+------------+-----
>-----------------+--------------------------------------+--------+--------
>--------+------------+---------+-------------+-------------------+--------
>----------+-------------+----------------------------+-------------+------
>---------------+----------+---------------------+---------------------+---
>------+------------+--------------+
>| id | account_id | domain_id | pool_id | last_pool_id | instance_id |
>device_id | name   | uuid                                 | size       |
>folder               | path                                 | pod_id |
>data_center_id | iscsi_name | host_ip | volume_type | pool_type         |
>disk_offering_id | template_id | first_snapshot_backup_uuid | recreatable
>| created             | attached | updated             | removed
>   | state   | chain_info | update_count |
>+----+------------+-----------+---------+--------------+-------------+----
>-------+--------+--------------------------------------+------------+-----
>-----------------+--------------------------------------+--------+--------
>--------+------------+---------+-------------+-------------------+--------
>----------+-------------+----------------------------+-------------+------
>---------------+----------+---------------------+---------------------+---
>------+------------+--------------+
>|  1 |          1 |         1 |     201 |         NULL |           1 |
>     0 | ROOT-1 | c7442a08-7c15-453c-9041-2315295ef512 | 2147483648 |
>/home/export/primary | b10b18fc-7fab-4e03-8cb1-28fd77d4d42c |      1 |
>          1 | NULL       | NULL    | ROOT        | NetworkFilesystem |
>            6 |           1 | NULL                       |           1 |
>2012-07-20 19:20:29 | NULL     | 2012-07-20 19:32:29 | 2012-07-20
>19:32:30 | Destroy | NULL       |            5 |
>|  2 |          1 |         1 |     201 |         NULL |           2 |
>     0 | ROOT-2 | ccbdb9ae-b2d7-4eda-b6a0-42b1c2a598fc | 2147483648 |
>/home/export/primary | ac9dbd13-790f-491c-be4f-0d757e5a6ac3 |      1 |
>          1 | NULL       | NULL    | ROOT        | NetworkFilesystem |
>            8 |           1 | NULL                       |           1 |
>2012-07-20 19:20:29 | NULL     | 2012-07-20 19:34:33 | NULL
> | Destroy | NULL       |            5 |
>|  3 |          1 |         1 |    NULL |         NULL |           3 |
>     0 | ROOT-3 | 5fe6d1ce-e90f-4531-8b14-6c09d276dfd5 |  565240320 |
>NULL                 | NULL                                 |   NULL |
>          1 | NULL       | NULL    | ROOT        | NULL              |
>            6 |           1 | NULL                       |           1 |
>2012-07-20 19:32:59 | NULL     | 2012-07-20 19:33:29 | 2012-07-20
>19:33:29 | Destroy | NULL       |            2 |
>|  4 |          1 |         1 |    NULL |         NULL |           4 |
>     0 | ROOT-4 | b7d1c546-e31a-4cbd-bb7c-1d5df44b8d1a |  565240320 |
>NULL                 | NULL                                 |   NULL |
>          1 | NULL       | NULL    | ROOT        | NULL              |
>            6 |           1 | NULL                       |           1 |
>2012-07-20 19:33:59 | NULL     | 2012-07-20 19:34:29 | 2012-07-20
>19:34:29 | Destroy | NULL       |            2 |
>|  5 |          1 |         1 |    NULL |         NULL |           5 |
>     0 | ROOT-5 | 58f0d686-efd0-445a-9861-b5950dfcb8bb |  565240320 |
>NULL                 | NULL                                 |   NULL |
>          1 | NULL       | NULL    | ROOT        | NULL              |
>            6 |           1 | NULL                       |           1 |
>2012-07-20 19:34:59 | NULL     | 2012-07-20 19:35:09 | 2012-07-20
>19:35:10 | Destroy | NULL       |            2 |
>|  6 |          1 |         1 |    NULL |         NULL |           6 |
>     0 | ROOT-6 | d38211db-9594-49a4-8621-08a513321e6f |  565240320 |
>NULL                 | NULL                                 |   NULL |
>          1 | NULL       | NULL    | ROOT        | NULL              |
>            6 |           1 | NULL                       |           1 |
>2012-07-20 19:35:29 | NULL     | 2012-07-20 21:34:02 | 2012-07-20
>21:34:02 | Destroy | NULL       |            2 |
>+----+------------+-----------+---------+--------------+-------------+----
>-------+--------+--------------------------------------+------------+-----
>-----------------+--------------------------------------+--------+--------
>--------+------------+---------+-------------+-------------------+--------
>----------+-------------+----------------------------+-------------+------
>---------------+----------+---------------------+---------------------+---
>------+------------+--------------+
>6 rows in set (0.00 sec)
>
>mysql> delete from volumes;
>Query OK, 6 rows affected (0.05 sec)
>
>mysql> select * from volumes;
>Empty set (0.00 sec)
>
>mysql>
>
>8. After deletion, I went back to the CSMS GUI. I attempted to
>delete Primary Storage, but it was gone already.
>
>9. I could then delete the cluster.
>
>Why was it necessary to manually delete the volumes from MySQL?
>Was there something in one or more of those volume entries that
>prevented deletion of the storage pool?



It should never be necessary; see my comment on step 7. We never advise
people to muck around with the database unless there is no other way to
recover from the situation.
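
Since you need to delete clusters often, the supported path can be
scripted through the API instead. A rough sketch of the deletion steps
quoted further down, expressed as an ordered list of API commands. The
mapping of each step to a command name is my assumption, the function
name and the host and system VM ids are hypothetical, and some of these
commands are asynchronous, so a real script would have to wait for each
job to finish before issuing the next one.

```python
def cluster_teardown_plan(cluster_id, pool_id, host_ids, system_vm_ids):
    """Return (command, params) pairs in the order of the deletion steps."""
    plan = [
        # disable cluster
        ("updateCluster", {"id": cluster_id, "allocationstate": "Disabled"}),
        # enable maintenance for the primary storage in the cluster
        ("enableStorageMaintenance", {"id": pool_id}),
    ]
    # put hosts in cluster into maintenance mode
    plan += [("prepareHostForMaintenance", {"id": h}) for h in host_ids]
    # destroy system vms
    plan += [("destroySystemVm", {"id": v}) for v in system_vm_ids]
    # delete hosts and primary storage
    plan += [("deleteHost", {"id": h}) for h in host_ids]
    plan.append(("deleteStoragePool", {"id": pool_id}))
    # delete the cluster
    plan.append(("deleteCluster", {"id": cluster_id}))
    return plan


plan = cluster_teardown_plan(
    cluster_id="c03d4dee-d8cd-475b-962b-14149ba3be45",
    pool_id="c9c0319f-33f0-3494-9ada-4d7a2f1dafd4",
    host_ids=["HOST_ID_1", "HOST_ID_2"],        # hypothetical ids
    system_vm_ids=["SSVM_ID", "CONSOLE_PROXY_ID"],  # hypothetical ids
)
for command, params in plan:
    print(command, params)
```

If deleteStoragePool still fails at its step, check for volumes that are
not marked removed before retrying with forced=true.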

>
>NOTE: XenCenter was displaying c9c0319f-33f0-3494-9ada-4d7a2f1dafd4
>as if it were a separate device, independent of the XenServers.
>I couldn't delete c9c0319f-33f0-3494-9ada-4d7a2f1dafd4 from XenCenter
>either. Is c9c0319f-33f0-3494-9ada-4d7a2f1dafd4 the NFS share?

I don't know why it happens.

>Regards,
>Evan
>
>
>-----Original Message-----
>From: Alena Prokharchyk
>Sent: Friday, July 20, 2012 12:58 PM
>To: cloudstack-users@incubator.apache.org
>Cc: Evan Miller
>Subject: Re: Reply: System VMs restarted on a disabled cluster
>
>All volumes allocated in the pool have to be destroyed (the "Cannot delete
>pool LS_PRIMARY1 as there are associated vols for this pool" error
>indicates this). Please destroy all VMs using this pool.
>
>-Alena.
>
>On 7/20/12 12:52 PM, "Evan Miller" <Evan.Miller@citrix.com> wrote:
>
>>Hi Alena:
>>
>>I got thwarted on one of the cluster deletion steps.
>>
>>>* disable cluster
>>>* enable maintenance for the primary storage in the cluster
>>>* put hosts in cluster into maintenance mode
>>>
>>>* destroy system vms
>>>* delete hosts and primary storage
>>
>>From CSMS GUI ...
>>I can delete the hosts.
>>However, I couldn't delete primary storage.
>>The error said "Failed to delete storage pool".
>>
>>I can list the particular storage pool:
>>
>>FINAL URL AFTER SPECIAL SUBSTITUTION(S):
>>
>>http://10.217.5.192:8080/client/api?apikey=bb0HqLkZWZl87olMVaQ1MCWgt_3NPPfoWLorilzI-vDpwSgN1KF2KfSoUl00yHNxa8x2aYrMfG2d_s-FXu_Tfg&command=listStoragePools&clusterid=c03d4dee-d8cd-475b-962b-14149ba3be45&response=json&signature=7q%2BIr4lZMbsjctbnUidIej9gtgk%3D
>>
>>HEADERS:
>>Date: Fri, 20 Jul 2012 19:43:40 GMT
>>Server: Apache-Coyote/1.1
>>Content-Length: 562
>>Content-Type: text/javascript;charset=UTF-8
>>Client-Date: Fri, 20 Jul 2012 19:43:39 GMT
>>Client-Peer: 10.217.5.192:8080
>>Client-Response-Num: 1
>>CONTENT:
>>HTTP/1.1 200 OK
>>Date: Fri, 20 Jul 2012 19:43:40 GMT
>>Server: Apache-Coyote/1.1
>>Content-Length: 562
>>Content-Type: text/javascript;charset=UTF-8
>>Client-Date: Fri, 20 Jul 2012 19:43:39 GMT
>>Client-Peer: 10.217.5.192:8080
>>Client-Response-Num: 1
>>
>>{ "liststoragepoolsresponse" : { "count":1 ,"storagepool" : [
>>{"id":"c9c0319f-33f0-3494-9ada-4d7a2f1dafd4",
>>"zoneid":"5127f0df-0d5e-4a22-9c88-fba8ff592612","zonename":"LS_ZONE1",
>>"podid":"c89cb02e-78f9-413f-8783-19d1baaddb03","podname":"LS_POD1",
>>"name":"LS_PRIMARY1","ipaddress":"10.217.5.192",
>>"path":"/home/export/primary","created":"2012-07-20T12:20:01-0700",
>>"type":"NetworkFilesystem",
>>"clusterid":"c03d4dee-d8cd-475b-962b-14149ba3be45","clustername":"LS_R12345",
>>"disksizetotal":104586543104,"disksizeallocated":2712723968,
>>"tags":"","state":"Maintenance"} ] } }
>>
>>NOTE: Under Storage tab from the GUI, there is no data.
>>
>>But I can't delete that storage pool:
>>
>>FINAL URL AFTER SPECIAL SUBSTITUTION(S):
>>
>>http://10.217.5.192:8080/client/api?apikey=bb0HqLkZWZl87olMVaQ1MCWgt_3NPPfoWLorilzI-vDpwSgN1KF2KfSoUl00yHNxa8x2aYrMfG2d_s-FXu_Tfg&command=deleteStoragePool&id=c9c0319f-33f0-3494-9ada-4d7a2f1dafd4&response=json&signature=8z4Rbi2t%2BzKHvCkJ2USIRC%2Bx8oQ%3D
>>
>>Error My Final URL:
>>http://10.217.5.192:8080/client/api?apikey=bb0HqLkZWZl87olMVaQ1MCWgt_3NPPfoWLorilzI-vDpwSgN1KF2KfSoUl00yHNxa8x2aYrMfG2d_s-FXu_Tfg&command=deleteStoragePool&id=c9c0319f-33f0-3494-9ada-4d7a2f1dafd4&response=json&signature=8z4Rbi2t%2BzKHvCkJ2USIRC%2Bx8oQ%3D
>><html>
>><head><title>An Error Occurred</title></head>
>><body>
>><h1>An Error Occurred</h1>
>><p>530 Unknown code</p>
>></body>
>></html>
>>moonshine#
>>
>>The api log says:
>>
>>2012-07-20 12:46:05,499 INFO  [cloud.api.ApiServer]
>>(catalina-exec-10:null) (userId=2 accountId=2
>>sessionId=DC150E34937E29953352893CADABEA63) 10.216.134.53 -- GET
>>command=deleteStoragePool&id=c9c0319f-33f0-3494-9ada-4d7a2f1dafd4&response=json&sessionkey=UsR2i5%2FbTT7zW8RfStD8aH6EqVA%3D&_=1342813564939
>>530 Failed to delete storage pool
>>
>>The management log says this:
>>
>>2012-07-20 12:46:05,497 WARN  [cloud.storage.StorageManagerImpl]
>>(catalina-exec-10:null) Cannot delete pool LS_PRIMARY1 as there are
>>associated vols for this pool
>>
>>I need to be able to cleanly (and often) delete clusters, since each
>>labscaler reservation will require a cluster.
>>
>>Is there something in the database that needs to be cleaned out?
>>
>>>* delete the cluster
>>
>>Regards,
>>Evan
>>
>>
>>-----Original Message-----
>>From: Alena Prokharchyk
>>Sent: Friday, July 13, 2012 4:26 PM
>>To: Evan Miller
>>Subject: FW: Reply: System VMs restarted on a disabled cluster
>>
>>On 7/11/12 8:20 PM, "Mice Xia" <mice_xia@tcloudcomputing.com> wrote:
>>
>>>Hi, Alena,
>>>
>>>I'm trying to follow your steps:
>>>
>>>* disable cluster
>>>Succeeded.
>>>
>>>* enable maintenance for the primary storage in the cluster
>>>Maintenance on the VMware cluster failed on the first two tries, with
>>>an error message like:
>>>Unable to create a deployment for VM[ConsoleProxy|v-38-VM]
>>>
>>>WARN  [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:)
>>>Exception while trying to start console proxy
>>>com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>create a deployment for VM[ConsoleProxy|v-47-VM]Scope=interface
>>>com.cloud.dc.DataCenter; id=1
>>>
>>>It seems a new system VM was created each time, but still on the VMware
>>>cluster, which led to the failure. Maintenance succeeded on the third
>>>try.
>>>
>>>* put hosts in cluster into maintenance mode
>>>Succeeded.
>>>
>>>* destroy system vms
>>>Destroying them does not stop them from being re-created.
>>>
>>>* delete hosts and primary storage
>>>Failed to delete primary storage, with the message: there are still
>>>volumes associated with this pool
>>>
>>>* delete the cluster
>>>
>>>
>>>Putting hosts/storage into maintenance mode does not stop the system VMs
>>>from being re-created. From the code I can see that the management server
>>>gets the supported hypervisorTypes and always fetches the first one, and
>>>the first one in my environment happens to be VMware.
>>>
>>>I have changed expunge.interval = expunge.delay = 120. Should I set
>>>consoleproxy.restart = false and update the db to set
>>>secondary.storage.vm=false?
>>>
>>>Regards
>>>Mice
>>>
>>>-----Original Message-----
>>>From: Alena Prokharchyk [mailto:Alena.Prokharchyk@citrix.com]
>>>Sent: July 12, 2012 10:03
>>>To: cloudstack-dev@incubator.apache.org
>>>Subject: Re: System VMs restarted on a disabled cluster
>>>
>>>On 7/11/12 6:29 PM, "Mice Xia" <mice_xia@tcloudcomputing.com> wrote:
>>>
>>>>Hi, All
>>>>
>>>>
>>>>
>>>>I've set up an environment with two clusters (in the same pod), one
>>>>XenServer and the other VMware, based on the 3.0.x ASF branch.
>>>>
>>>>Now I'm trying to remove the VMware cluster, beginning with disabling it
>>>>and destroying the system VMs running on it, but the system VMs
>>>>restarted immediately on the VMware cluster, which blocks cluster removal.
>>>>
>>>>
>>>>
>>>>I wonder if this is the expected result by design, or would it be
>>>>better for the system VMs to be allocated on an enabled cluster?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>Regards
>>>>
>>>>Mice
>>>>
>>>>
>>>
>>>
>>>
>>>It's by design. A disabled cluster just can't be used for creating new /
>>>starting existing user VMs / routers; but it can still be used by
>>>system resources (SSVM and console proxy).
>>>
>>>To delete the cluster, you need to:
>>>
>>>* disable cluster
>>>* enable maintenance for the primary storage in the cluster
>>>* put hosts in cluster into maintenance mode
>>>
>>>* destroy system vms
>>>* delete hosts and primary storage
>>>* delete the cluster
>>>
>>>-Alena.
>>>
>>>
>>
>>
>>
>
>
>

