incubator-cloudstack-dev mailing list archives

From "Hugo Trippaers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-359) PropagateResourceEventCommand failes in cluster configuration
Date Fri, 26 Oct 2012 12:29:12 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484874#comment-13484874
] 

Hugo Trippaers commented on CLOUDSTACK-359:
-------------------------------------------

Before fix:
2012-10-26 11:26:21,017 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Dispatch
->1, json: [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,021 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Dispatch
-> 1, json: [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1098449597:
Sending  { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1, Flags: 100011, [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
}
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1098449597:
Executing:  { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1, Flags: 100011, [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
}
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-310:null) Seq
1-1098449597: Executing request
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-310:null) Seq
1-1098449597: Response Received:
2012-10-26 11:26:21,024 DEBUG [agent.transport.Request] (DirectAgent-310:null) Seq 1-1098449597:
Processing:  { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1, Flags: 10, [{"UnsupportedAnswer":{"result":false,"details":"Unsupported
command issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got the
right type of server?","wait":0}}] }
2012-10-26 11:26:21,034 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1098449597:
Received:  { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1, Flags: 10, { UnsupportedAnswer
} }
2012-10-26 11:26:21,034 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Completed
dispatching -> 1, json: [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
in 13 ms, return result: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got the right
type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,057 INFO  [cloud.cluster.ClusterServiceServletImpl] (Cluster-Worker-3:null)
Setup cluster service servlet. service url: http://10.1.1.59:9090/clusterservice, request
timeout: 300 seconds
2012-10-26 11:26:21,057 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-3:null) Cluster
PDU 2199064412171 -> 2199100915727. agent: 0, pdu seq: 2, pdu ack seq: 1, json: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported
command issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got the
right type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,110 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-308:null) Seq
3-393609218: Response Received:
2012-10-26 11:26:21,110 DEBUG [agent.transport.Request] (DirectAgent-308:null) Seq 3-393609218:
Processing:  { Ans: , MgmtId: 2199064412171, via: 3, Ver: v1, Flags: 10, [{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
}
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterServiceServletImpl] (Cluster-Worker-3:null)
POST http://10.1.1.59:9090/clusterservice response :true, responding time: 95 ms
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-3:null) Cluster
PDU 2199064412171 -> 2199100915727 completed. time: 184ms. agent: 0, pdu seq: 2, pdu ack
seq: 1, json: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command issued:com.cloud.agent.api.PropagateResourceEventCommand.
 Are you sure you got the right type of server?","contextMap":{},"wait":0}}]

After fix:
2012-10-26 14:17:52,182 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Dispatch
->1, json: [{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 14:17:52,187 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Intercepting
command to propagate event AdminAskMaintenace for host 1
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1351614733:
Sending  { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1, Flags: 100111, [{"MaintainCommand":{"wait":0}}]
}
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1351614733:
Executing:  { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1, Flags: 100111, [{"MaintainCommand":{"wait":0}}]
}
2012-10-26 14:17:52,195 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-393:null) Seq
1-1351614733: Executing request
2012-10-26 14:17:52,300 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-393:null) Not
the master node so just return ok: 10.1.1.46
2012-10-26 14:17:52,300 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-393:null) Seq
1-1351614733: Response Received:
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (DirectAgent-393:null) Seq 1-1351614733:
Processing:  { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1, Flags: 110, [{"MaintainAnswer":{"willMigrate":true,"result":true,"wait":0}}]
}
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) Seq 1-1351614733:
Received:  { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1, Flags: 110, { MaintainAnswer }
}
2012-10-26 14:17:52,301 DEBUG [agent.manager.AgentAttache] (DirectAgent-393:null) Seq 1-1351614733:
No more commands found
2012-10-26 14:17:52,310 DEBUG [cloud.resource.ResourceState] (Cluster-Worker-7:null) Resource
state update: [id = 1; name = cloudstack-xcp1; old state = Enabled; event = AdminAskMaintenace;
new state = PrepareForMaintenance]
2012-10-26 14:17:52,311 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-7:null) Result
is true
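
The difference between the two logs can be sketched roughly as follows. This is only an illustration inferred from the log output above, not the actual patch: the class `InterceptSketch`, its nested types, and the methods `sendToAgent`, `dispatchBeforeFix` and `dispatchAfterFix` are hypothetical stand-ins, and answers are reduced to their names. Before the fix the cluster dispatch appears to forward PropagateResourceEventCommand straight to the host agent, which does not know it; after the fix the management server that owns the agent intercepts the command and sends the agent a MaintainCommand instead.

```java
// Sketch only: all types below are hypothetical stand-ins, not CloudStack classes.
final class InterceptSketch {
    /** Marker for anything that travels in a cluster PDU in this sketch. */
    interface Command {}

    /** Carries the two fields visible in the logged JSON: hostId and event. */
    static final class PropagateResourceEventCommand implements Command {
        final long hostId;
        final String event;

        PropagateResourceEventCommand(long hostId, String event) {
            this.hostId = hostId;
            this.event = event;
        }
    }

    /** The command the agent receives in the "After fix" log. */
    static final class MaintainCommand implements Command {}

    /** Stand-in for the host agent: it only understands host-level commands. */
    static String sendToAgent(Command cmd) {
        if (cmd instanceof MaintainCommand) {
            return "MaintainAnswer";
        }
        return "UnsupportedAnswer";
    }

    /** Before the fix (as the log suggests): everything is forwarded to the agent. */
    static String dispatchBeforeFix(Command cmd) {
        return sendToAgent(cmd);
    }

    /** After the fix (as the log suggests): the event is intercepted and handled locally. */
    static String dispatchAfterFix(Command cmd) {
        if (cmd instanceof PropagateResourceEventCommand) {
            PropagateResourceEventCommand p = (PropagateResourceEventCommand) cmd;
            // Event name spelled exactly as it appears in the logs.
            if ("AdminAskMaintenace".equals(p.event)) {
                return sendToAgent(new MaintainCommand());
            }
            // Other resource events are outside the scope of this sketch.
            throw new UnsupportedOperationException("event not covered: " + p.event);
        }
        return sendToAgent(cmd);
    }

    public static void main(String[] args) {
        Command cmd = new PropagateResourceEventCommand(1L, "AdminAskMaintenace");
        System.out.println(dispatchBeforeFix(cmd)); // prints "UnsupportedAnswer"
        System.out.println(dispatchAfterFix(cmd));  // prints "MaintainAnswer"
    }
}
```

The resource state update seen at the end of the "After fix" log (Enabled to PrepareForMaintenance) is left out of the sketch; only the routing decision is shown.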

                
> PropagateResourceEventCommand failes in cluster configuration
> -------------------------------------------------------------
>
>                 Key: CLOUDSTACK-359
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-359
>             Project: CloudStack
>          Issue Type: Bug
>          Components: Management Server
>    Affects Versions: 4.0.0
>            Reporter: Hugo Trippaers
>            Priority: Critical
>             Fix For: 4.0.0
>
>
> When enabling maintenance mode on a hypervisor, the command fails. This seems to happen only when the command is received by the API on server A and the agent for the hypervisor is running on server B.
> The setup this was encountered on is a two-node cluster running an early pre-release of the 4.0 branch.
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl] (TP-Processor22:null) submit async job-18377, details: AsyncJobVO {id:18377, userId: 2, accountId: 2, sessionKey: null, instanceType: Host, instanceId: 133, cmd: com.cloud.api.commands.PrepareForMaintenanceCmd, cmdOriginator: null, cmdInfo: {"response":"json","id":"931cc0bc-a423-4600-8ccd-0597eeffaa85","sessionkey":"R4fLb60jJNSdAIe8zt4wRcfCE+E\u003d","ctxUserId":"2","_":"1350374503534","ctxAccountId":"2","ctxStartEventId":"144113"}, cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode: 0, result: null, initMsid: 345052433506, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-68:job-18377) Executing com.cloud.api.commands.PrepareForMaintenanceCmd for job-18377
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl] (Job-Executor-68:job-18377) Propagating agent change request event:AdminAskMaintenace to agent:133
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl] (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133 [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,618 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504. agent: 133, pdu seq: 75, pdu ack seq: 0, json: [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,625 DEBUG [cloud.cluster.ClusterServiceServletImpl] (Cluster-Worker-5:null) POST http://10.200.22.16:9090/clusterservice response :true, responding time: 6 ms
> 2012-10-16 10:01:43,626 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504 completed. time: 7ms. agent: 133, pdu seq: 75, pdu ack seq: 0, json: [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,635 DEBUG [cloud.cluster.ClusterManagerImpl] (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133 completed. result: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got the right type of server?","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,636 DEBUG [cloud.cluster.ClusterManagerImpl] (Job-Executor-68:job-18377) Result for agent change is false
> 2012-10-16 10:01:43,636 ERROR [cloud.api.ApiDispatcher] (Job-Executor-68:job-18377) Exception while executing PrepareForMaintenanceCmd:
> com.cloud.utils.exception.CloudRuntimeException: Unable to prepare for maintenance host 133
>         at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1176)
>         at com.cloud.api.commands.PrepareForMaintenanceCmd.execute(PrepareForMaintenanceCmd.java:102)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
>         at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:449)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 2012-10-16 10:01:43,637 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-68:job-18377) Complete async job-18377, jobStatus: 2, resultCode: 530, result: com.cloud.api.response.ExceptionResponse@6e13b651

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
