incubator-cloudstack-dev mailing list archives

From "Anthony Xu (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CLOUDSTACK-251) If one primary storage is put into maintenance mode, entire cloud goes down. CS 3.02
Date Thu, 04 Oct 2012 18:39:47 GMT

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anthony Xu resolved CLOUDSTACK-251.
-----------------------------------

    Resolution: Fixed

commit 0c6bdd2781320bf057197ba040e80c3baf88e8f3
Author: Anthony Xu <anthony@cloud.com>
Date:   Thu Oct 4 11:24:30 2012 -0700

    CLOUDSTACK-251:
    
    When a host is reconnected, CS tries to make sure the host can access primary storage.
    CS now does this only when the primary storage is UP. If the host still cannot access
    the primary storage, that is okay: CS does not throw an exception and just prints a warning message.
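
    The behavior described in the commit message above can be sketched roughly as follows.
    This is only an illustration, not the code from the commit: the class ReconnectSketch,
    the Pool type, the PoolStatus enum and the onHostReconnect method are hypothetical names
    made up for this sketch, and the real CloudStack classes are likely to differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReconnectSketch {
    // Hypothetical pool states for illustration; not CloudStack's actual enum.
    enum PoolStatus { UP, MAINTENANCE }

    // Hypothetical stand-in for a primary storage pool as seen from one host.
    static final class Pool {
        final String name;
        final PoolStatus status;
        final boolean reachable;

        Pool(String name, PoolStatus status, boolean reachable) {
            this.name = name;
            this.status = status;
            this.reachable = reachable;
        }
    }

    // Checks storage access when a host reconnects.
    // Pools that are not UP are skipped, and an unreachable UP pool
    // produces a warning instead of an exception.
    static List<String> onHostReconnect(String host, List<Pool> pools) {
        List<String> warnings = new ArrayList<>();
        for (Pool pool : pools) {
            if (pool.status != PoolStatus.UP) {
                continue; // pool is in maintenance: do not try to connect to it
            }
            if (!pool.reachable) {
                warnings.add("WARN: host " + host
                        + " cannot access primary storage " + pool.name);
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<Pool> pools = Arrays.asList(
                new Pool("HD", PoolStatus.UP, true),
                new Pool("SSD", PoolStatus.MAINTENANCE, false));
        // The SSD pool is in maintenance, so it is never probed and
        // the reconnect completes without warnings or exceptions.
        System.out.println(onHostReconnect("host-1", pools));
    }
}
```

    With the two pools from the report (HD up, SSD in maintenance), the host reconnect in
    this sketch completes normally, so a pool in maintenance no longer blocks the host.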

                
> If one primary storage is put into maintenance mode, entire cloud goes down. CS 3.02
> ------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-251
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-251
>             Project: CloudStack
>          Issue Type: Bug
>          Components: Storage Controller
>    Affects Versions: pre-4.0.0
>         Environment: CloudStack 3.02 on CentOS, XenServer 6.2 Hypervisors
>            Reporter: Nik Martin
>            Assignee: Anthony Xu
>            Priority: Critical
>              Labels: storage
>             Fix For: pre-4.0.0
>
>
> I have two SANs in a cluster; both are primary storage. One is HD-based and one is SSD-based.
> I use storage tags "HD" and "SSD" respectively. The HD-based SAN is a single 20TB volume
> with 1 iSCSI target and 1 LUN. The SSD SAN is two 5TB volumes, each with 1 target and 1 LUN,
> in an Active-Active configuration. The SSD SAN suffered from a misconfiguration issue, so we
> had to put it into maintenance mode in a hurry and shut it down. I fully expected the volumes
> and VMs provisioned on the SSD SAN to be unavailable. The problem is that CloudStack continued
> to try to access Volume id 204, which is Target0 on the SSD SAN. It shut every VM down, put
> all hypervisors into Alert state, and went into a loop trying to connect to a volume that is
> in maintenance mode. This creates a very bad situation for me and my customers. My entire
> cloud was offline until we could re-synchronize the Active-Active volumes on the SSD SAN and
> bring it back online.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
