cloudstack-users mailing list archives

From tony_caot...@163.com
Subject Re: XenServer is disconnected after CS hosts shutdown
Date Wed, 22 Jul 2015 13:03:13 GMT

Hey!  help please...

Some news.
I think the cause is that the ACS host can't communicate with the XenServer
host.
ACS keeps writing log entries like this:

2015-07-22 20:42:13,555 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(AgentManager-Handler-7:null) Seq 5-8174877748607582212: Forwarding Seq 
5-8174877748607582212:  { Cmd , MgmtId: 279278805451459, via: 5, Ver: 
v1, Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] 
} to 280345368052992
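
To see how often the same command is being forwarded, entries like the one above can be pulled out of the management server log with a small script. This is only a rough sketch: the regular expression is inferred from the sample entries in this thread, not from the CloudStack source, and the function names are my own.

```python
import re
from collections import Counter

# Pattern for the "Forwarding" entries shown above; the field layout is
# inferred from those samples only, not from the CloudStack source.
FORWARD_RE = re.compile(
    r'Seq (?P<seq>\d+-\d+): Forwarding Seq \d+-\d+: '
    r'\{ Cmd , MgmtId: (?P<mgmt_id>\d+), via: (?P<via>\d+),'
    r'.*?\[\{"(?P<command>[\w.]+)"'
    r'.*?\} to (?P<target>\d+)'
)


def parse_forwarded(log_text):
    """Return one dict per forwarded command found in log_text."""
    # Collapse whitespace so entries wrapped across lines (as in this
    # email) match the same way as single-line entries in the log file.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in FORWARD_RE.finditer(flat)]


def count_by_sequence(entries):
    """Count how many times each sequence number was forwarded."""
    return Counter(e["seq"] for e in entries)


SAMPLE = '''
2015-07-22 20:42:13,555 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-7:null) Seq 5-8174877748607582212: Forwarding Seq
5-8174877748607582212:  { Cmd , MgmtId: 279278805451459, via: 5, Ver:
v1, Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}]
} to 280345368052992
'''

if __name__ == "__main__":
    entries = parse_forwarded(SAMPLE)
    for entry in entries:
        print(entry)
    print(count_by_sequence(entries))
```

A sequence number with a high count would suggest the same command is being forwarded repeatedly. In the sample, the `MgmtId` and the forwarding target are different IDs.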

I am not sure whether the ACS state is wrong or some services on the
XenServer have not been started.

On the XenServer host, I found that *xenheartbeat.sh is not running.*
*(/bin/bash /opt/cloud/bin/xenheartbeat.sh 
00d8e0d0-8561-4b3d-9044-cbc496ff22cc 120 60)*
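
The check for the heartbeat script can be scripted. The sketch below is only an illustration: it assumes a Python 3 interpreter and a procps-style `ps` are available on the host being checked, which may not be true on an older XenServer dom0, and the function names are hypothetical.

```python
import subprocess

# Script path taken from the command line quoted above.
HEARTBEAT_SCRIPT = "/opt/cloud/bin/xenheartbeat.sh"


def heartbeat_running(ps_output: str) -> bool:
    """Return True if any process command line invokes the heartbeat script."""
    for line in ps_output.splitlines():
        args = line.split()
        # Skip blank lines and grep-style searches for the script name.
        if not args or args[0].endswith("grep"):
            continue
        if HEARTBEAT_SCRIPT in args:
            return True
    return False


def check_local_host() -> bool:
    """Run `ps` on the current machine (assumed to be the XenServer host)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return heartbeat_running(result.stdout)
```

Calling `check_local_host()` on the XenServer host returns `False` when the script is not running, which matches what I saw.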

As some operations on the XenServer were pending, the XenServer could not be
deleted from the web UI.

I found a temporary workaround:

1. Delete the jobs from the DB table cloud.vm_work_job.
2. Delete the XenServer from the DB table cloud.host.
3. Add the XenServer host back from the web UI.

Then it works.
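
Before deleting anything by hand, it may be worth looking at what is in those two tables. The sketch below only builds read-only statements; the table names come from the steps above, nothing else about the schema is assumed, and the helper name is hypothetical.

```python
# Table names come from the workaround steps; no columns are assumed.
TABLES = ["cloud.vm_work_job", "cloud.host"]


def inspection_queries(tables):
    """Build read-only statements to review rows before any manual delete."""
    queries = []
    for table in tables:
        queries.append(f"SELECT COUNT(*) FROM {table};")
        queries.append(f"SELECT * FROM {table} LIMIT 20;")
    return queries


if __name__ == "__main__":
    for query in inspection_queries(TABLES):
        print(query)
```

The printed statements can be pasted into a MySQL client connected to the CloudStack database.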

Does anyone have an idea about this?

Could anyone tell me what ACS does on a XenServer host when the host is
added?

Thanks,

-----------
Cao Tong

On 07/22/2015 04:26 PM, tony_caotong@163.com wrote:
>
> @prashant, the following are the answers to your questions:
>
> 1. Yes, primary storage is connected fine for my xenserver.
>
> 2. No, Xenserver's password is not changed.
>
> 3. Yes, the web UI is fine, and I can log in.
>
> 4. Before the reboot, I unmanaged and disabled the resources, and after
> the reboot I enabled all of them.
>
> 5. The host state is Up.
>
> 6. No yum update was run anywhere.
>
> 7. The system VMs' status is fine, I think.
>
> -----------
> Cao Tong
>
> On 07/22/2015 04:13 PM, tony_caotong@163.com wrote:
>>
>> Hi,
>>
>> After reinstalling, I got the problem again.
>>
>> So, I will describe it once again.
>>
>> What my environment looks like:
>>
>> I have an ACS server host and a XenServer host. After both were rebooted, I
>> cannot create a VM on XenServer through ACS.
>> KVM and NFS are running together on the ACS manager host.
>>
>> The status of a new VM is always 'Starting' in the web UI, but I can create
>> a new VM using XenCenter.
>>
>> ------------- ERR LOGS ----------
>> 2015-07-22 15:56:56,357 DEBUG [c.c.s.StorageManagerImpl] 
>> (StatsCollector-3:ctx-1aa2e8c9) Unable to send storage pool command 
>> to Pool[4|NetworkFilesystem] via 4
>> com.cloud.exception.OperationTimedoutException: Commands 
>> 2829104990918803478 to Host 4 timed out after 3600
>>
>> 2015-07-22 15:56:56,358 INFO  [c.c.s.StatsCollector] 
>> (StatsCollector-3:ctx-1aa2e8c9) Unable to reach 
>> Pool[4|NetworkFilesystem]
>> com.cloud.exception.StorageUnavailableException: Resource 
>> [StoragePool:4] is unreachable: Unable to send command to the pool
>>
>>
>> ------------- and there are lots of DEBUG entries, repeated again
>> and again -----------
>>
>> 2015-07-22 15:36:12,887 DEBUG [c.c.a.m.ClusteredAgentAttache] 
>> (AgentManager-Handler-14:null) Seq 4-8064821032713715922: Forwarding 
>> Seq 4-8064821032713715922:  { Cmd , MgmtId: 227448510156211, via: 4, 
>> Ver: v1, Flags: 100111, 
>> [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to 
>> 116784073679673
>> 2015-07-22 15:36:12,889 DEBUG [c.c.a.m.ClusteredAgentAttache] 
>> (AgentManager-Handler-10:null) Seq 4-8064821032713715883: Forwarding 
>> Seq 4-8064821032713715883:  { Cmd , MgmtId: 227448510156211, via: 4, 
>> Ver: v1, Flags: 100111, 
>> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/1/5/af949612-838f-3a6d-931b-312e612db740.vhd","origUrl":"http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2","uuid":"80b60e46-3017-11e5-8736-00259091a13a","id":5,"format":"VHD","accountId":1,"checksum":"905cec879afd9c9d22ecc8036131a180","hvm":false,"displayText":"CentOS 5.6(64-bit) no GUI (XenServer)","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.0.0.100/storage/secondary","_role":"Image"}},"name":"centos56-x86_64-xen","hypervisorType":"XenServer"}},"destTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"origUrl":"http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2","uuid":"80b60e46-3017-11e5-8736-00259091a13a","id":5,"format":"VHD","accountId":1,"checksum":"905cec879afd9c9d22ecc8036131a180","hvm":false,"displayText":"CentOS 5.6(64-bit) no GUI (XenServer)","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"2df26406-31bf-3a95-8a61-f5008defd9a0","id":4,"poolType":"NetworkFilesystem","host":"10.0.0.100","path":"/storage/xen/primary","port":2049,"url":"NetworkFilesystem://10.0.0.100/storage/xen/primary/?ROLE=Primary&STOREUUID=2df26406-31bf-3a95-8a61-f5008defd9a0"}},"name":"centos56-x86_64-xen","hypervisorType":"XenServer"}},"executeInSequence":true,"options":{},"wait":10800}}]
>> } to 116784073679673
>>
>>
>> -----------------------------------------
>>
>> Anyone have Any ideas?  thanks.
>>
>> -----------
>> Cao Tong
>>
>> On 07/21/2015 06:14 PM, tony_caotong@163.com wrote:
>>>
>>> Thanks all,
>>>
>>> I have already reinstalled my hosts to prepare a new, clean
>>> environment and restart my research.
>>>
>>> -----------
>>> Cao Tong
>>>
>>> On 07/20/2015 09:24 PM, Prashant s wrote:
>>>> Some questions:
>>>>
>>>> Can you please tell us:
>>>>
>>>> 1. Is your NFS storage or your primary Storage Repository in connected
>>>> mode, with no red cross mark on it in XenCenter?
>>>> 2. Did you change any passwords on the XenServers?
>>>> 3. Is the CloudStack web UI up? Can you log in to the CloudStack
>>>> web page?
>>>> 4. Are the zone, pod, or clusters in an unmanaged or disabled state?
>>>> 5. Are all the hosts in a connected state?
>>>> 6. Did you run yum update on host reboot on the CS manager VM?
>>>> 7. System VMs are stateless: you can kill them and CS will recreate
>>>> new ones, so don't worry :-)
>>>>
>>>>
>>>> Thanks,
>>>> Prashant
>>>>
>>>>
>>>>
>>>> On Mon, Jul 20, 2015 at 3:47 AM, <tony_caotong@163.com> wrote:
>>>>
>>>>> Hi, I restarted all hosts (one manager and the XenServer) again.
>>>>>
>>>>>
>>>>> Following is the error log.
>>>>>
>>>>>
>>>>> 2015-07-20 15:33:49,688 INFO [c.c.u.e.CSExceptionErrorCode]
>>>>> (StatsCollector-3:ctx-692a5392) Could not find exception:
>>>>> com.cloud.exception.OperationTimedoutException in error code list for
>>>>> exceptions
>>>>> 2015-07-20 15:33:49,688 WARN  [c.c.a.m.AgentAttache]
>>>>> (StatsCollector-3:ctx-692a5392) Seq 1-3176445112179752972: Timed 
>>>>> out on null
>>>>> 2015-07-20 15:33:49,689 DEBUG [c.c.a.m.AgentAttache]
>>>>> (StatsCollector-3:ctx-692a5392) Seq 1-3176445112179752972: 
>>>>> Cancelling.
>>>>> 2015-07-20 15:33:49,689 DEBUG [c.c.s.StorageManagerImpl]
>>>>> (StatsCollector-3:ctx-692a5392) Unable to send storage pool 
>>>>> command to
>>>>> Pool[1|NetworkFilesystem] via 1
>>>>> com.cloud.exception.OperationTimedoutException: Commands
>>>>> 3176445112179752972 to Host 1 timed out after 3600
>>>>>          at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:436)
>>>>>          at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:433)
>>>>>          at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:362)
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1000)
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:392)
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:406)
>>>>>          at com.cloud.server.StatsCollector$StorageCollector.runInContext(StatsCollector.java:642)
>>>>>          at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>>>          at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>          at java.lang.Thread.run(Thread.java:745)
>>>>> 2015-07-20 15:33:49,689 INFO  [c.c.s.StatsCollector]
>>>>> (StatsCollector-3:ctx-692a5392) Unable to reach 
>>>>> Pool[1|NetworkFilesystem]
>>>>> com.cloud.exception.StorageUnavailableException: Resource 
>>>>> [StoragePool:1]
>>>>> is unreachable: Unable to send command to the pool
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1010)
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:392)
>>>>>          at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:406)
>>>>>          at com.cloud.server.StatsCollector$StorageCollector.runInContext(StatsCollector.java:642)
>>>>>          at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>>>          at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>>>          at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>          at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> -----------
>>>>> Cao Tong
>>>>>
>>>>>
>>>>> On 07/20/2015 02:52 PM, tony_caotong@163.com wrote:
>>>>>
>>>>>> No, no IP address was changed.
>>>>>>
>>>>>> 1. On the XenServer I cannot log in to the system VMs using an internal
>>>>>> IP like '169.254.1.112'. There should be a bridge network for this,
>>>>>> right? It is gone.
>>>>>>
>>>>>> 2. I tried to delete the XenServer host from CS in the web UI. It also
>>>>>> failed, with lots of log entries like the following; then memory filled
>>>>>> up and the management server went down...
>>>>>>
>>>>>> 2015-07-20 14:47:30,580 DEBUG [c.c.a.m.ClusteredAgentAttache]
>>>>>> (AgentManager-Handler-15:null) Seq 1-7282039122481381399: Forwarding Seq
>>>>>> 1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1, Ver: v1,
>>>>>> Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
>>>>>> 192405008094602
>>>>>> 2015-07-20 14:47:30,582 DEBUG [c.c.a.m.ClusteredAgentAttache]
>>>>>> (AgentManager-Handler-5:null) Seq 1-7282039122481381399: Forwarding Seq
>>>>>> 1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1, Ver: v1,
>>>>>> Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
>>>>>> 192405008094602
>>>>>> 2015-07-20 14:47:30,583 DEBUG [c.c.a.m.ClusteredAgentAttache]
>>>>>> (AgentManager-Handler-1:null) Seq 1-7282039122481381399: Forwarding Seq
>>>>>> 1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1, Ver: v1,
>>>>>> Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
>>>>>> 192405008094602
>>>>>> 2015-07-20 14:47:30,584 DEBUG [c.c.a.m.ClusteredAgentAttache]
>>>>>> (AgentManager-Handler-14:null) Seq 1-7282039122481381399: Forwarding Seq
>>>>>> 1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1, Ver: v1,
>>>>>> Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
>>>>>> 192405008094602
>>>>>>
>>>>>>
>>>>>> My guess: is there some service or daemon that works for CS and is
>>>>>> not up on the XenServer?
>>>>>>
>>>>>>
>>>>>> -----------
>>>>>> Cao Tong
>>>>>> On 07/20/2015 02:35 PM, Rajani Karuturi wrote:
>>>>>>
>>>>>>> Did the management server IP change?
>>>>>>> The management server IP in the configuration table is used by the
>>>>>>> system VMs.
>>>>>>> select * from configuration where name like 'host';
>>>>>>>
>>>>>>> If it changed, correct the value in the DB and restart the system VMs.
>>>>>>>
>>>>>>>
>>>>>>> ~Rajani
>>>>>>>
>>>>>>> On Mon, Jul 20, 2015 at 11:56 AM, <tony_caotong@163.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>> I shut down my CS manager and XenServer last weekend, and now
>>>>>>>> the SSVM and CPVM are disconnected. Those two were running on
>>>>>>>> XenServer. So what should I do right now?
>>>>>>>> Can anybody please help me? Thanks.
>>>>>>>>
>>>>>>>> On the XenServer I found that the three system VMs are not running.
>>>>>>>> My XenServer seems unable to reconnect to the CS manager, and it
>>>>>>>> seems not to be under the control of CS.
>>>>>>>>
>>>>>>>>
>>>>>>>> What are the right steps to shut down all machines in a CS group and
>>>>>>>> resume them?
>>>>>>>> How can I get my XenServer reconnected?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> -----------
>>>>>>>> Cao Tong
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>
>>>
>>
>>
>
>

