stratos-dev mailing list archives

From "Martin Eppel (meppel)" <mep...@cisco.com>
Subject Stratos 4.1 : group scaling question
Date Fri, 30 Oct 2015 05:21:44 GMT
Hi Devs,

We are seeing some interesting behavior using group scaling.

In this scenario we start an application that has a top-level group with 2 cartridges and
a nested group with a single cartridge.

The min value in the group policy is initially set to 1 and the max to a value N. To scale
the (nested) group, we update the application and increment the group policy min value, which
spins up a second group instance with a second VM. The min / max value for the cartridge is set to
1, so only a single instance can be spun up in each group.

If a VM becomes inactive, auto healing kicks in and spins up a new VM instance. From
the logs, it also appears that the group becomes inactive and a new group instance is instantiated
(autoscaler min_check rule?), which in turn causes a new VM to be spun up in this new group
(see below in the logs “Group [Instance context] cvpc-sf-000-0x0-5 has been added to
[Group] cvpc-sf-000-0x0”), and a VM instance in this group becomes active (see “Publishing
member activated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-5”).
However, soon after, the previous group instance is terminated (“Publishing group instance
terminating event: [application] di-000 [group] cvpc-sf-000-0x0 [instance] cvpc-sf-000-0x0-4”)
and all members in the cluster are terminated (“Starting to move all members in cluster [di-000.sf-000-1x0.sf-000.domain]
Network Partition [RegionOne], Partition [RegionOne-partition-03] to termination pending list”).
As a consequence, all VMs of all the nested groups are moved
to maintenance mode and eventually shut down (see log snippet).
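
The sequence described above can be checked mechanically against the log. The following Python sketch is illustrative only: the function name affected_instances and the SAMPLE list are assumptions made for this example, not part of Stratos. The sample lines are entries from the log snippet below, re-joined onto one line and cut after the cluster-instance-id.

```python
import re

# Patterns for the two kinds of log entries compared here.
TERMINATING_RE = re.compile(r"group instance terminating event:.*\[instance\]\s+(\S+)")
INSTANCE_RE = re.compile(r"\[cluster-instance-id\]\s+(\S+)")


def affected_instances(lines):
    """Return (terminating group instance, cluster instances put into
    maintenance mode after that event), in order of first appearance."""
    terminating = None
    affected = []
    for line in lines:
        match = TERMINATING_RE.search(line)
        if match:
            terminating = match.group(1)
            continue
        if terminating and "member in maintenance mode event" in line:
            inst = INSTANCE_RE.search(line)
            if inst and inst.group(1) not in affected:
                affected.append(inst.group(1))
    return terminating, affected


# Hypothetical input: log entries from this message, joined and shortened.
PREFIX = ("Publishing member in maintenance mode event: [service-name] sf-000 "
          "[cluster-id] di-000.sf-000-1x0.sf-000.domain [cluster-instance-id] ")
SAMPLE = [
    "Publishing group instance terminating event: [application] di-000 "
    "[group] cvpc-sf-000-0x0 [instance] cvpc-sf-000-0x0-4",
    PREFIX + "cvpc-sf-000-0x0-1",
    PREFIX + "cvpc-sf-000-0x0-3",
    PREFIX + "cvpc-sf-000-0x0-4",
    PREFIX + "cvpc-sf-000-0x0-2",
    PREFIX + "cvpc-sf-000-0x0-5",
]

if __name__ == "__main__":
    terminating, affected = affected_instances(SAMPLE)
    print("terminating:", terminating)
    print("in maintenance:", affected)
```

On these sample lines only one group instance is reported as terminating, while five cluster instances have members moved to maintenance mode, which is the behavior in question.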

Here are a few questions:

Why, if the cluster is being shut down, are all VMs of all the groups being restarted as well?
Does this mean that all groups which share the same cartridge also share the same
cluster (cluster id)?
My assumption would be that if a group becomes inactive, only this particular group is restarted.
This also raises the question of why the whole cluster is shut down if one of the groups
goes inactive.

How do cluster id, cluster instance id, member id and group relate to each other? It seems that
auto-scaled groups share the same cluster (cluster id, same cartridge type) but differ for
each group in the “cluster instance id” and for each VM instance in the member id. Is
this correct?
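
One way to test this reading against the log is to group member ids by cluster id and cluster instance id. The Python sketch below is a minimal illustration, not Stratos code: group_members and SAMPLE are hypothetical names, and the sample lines are log entries from the snippet below, re-joined onto one line and cut after the member id.

```python
import re
from collections import defaultdict

# Extracts "[key] value" pairs for the three ids of interest.
FIELD_RE = re.compile(r"\[(cluster-id|cluster-instance-id|member-id)\]\s+(\S+)")


def group_members(lines):
    """Return {cluster-id: {cluster-instance-id: set of member-ids}}."""
    tree = defaultdict(lambda: defaultdict(set))
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        if len(fields) == 3:  # skip lines that lack any of the three ids
            tree[fields["cluster-id"]][fields["cluster-instance-id"]].add(
                fields["member-id"]
            )
    return tree


CLUSTER = "di-000.sf-000-1x0.sf-000.domain"
# Hypothetical input: entries copied from the log in this message.
SAMPLE = [
    f"Publishing member terminated event: [service-name] sf-000 [cluster-id] {CLUSTER} "
    f"[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] {CLUSTER}1eedc6fe-6c7b-4c83-9198-da54e1f2d9e7",
    f"Publishing member activated event: [service-name] sf-000 [cluster-id] {CLUSTER} "
    f"[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] {CLUSTER}a5dd0936-42cb-4e81-88c3-e066eacf7a3d",
    f"Publishing member created event: [service-name] sf-000 [cluster-id] {CLUSTER} "
    f"[cluster-instance-id] cvpc-sf-000-0x0-5 [member-id] {CLUSTER}14850f7c-215b-43d3-8ae1-c2dc58ac5af4",
    f"Publishing member terminated event: [service-name] sf-000 [cluster-id] {CLUSTER} "
    f"[cluster-instance-id] cvpc-sf-000-0x0-4 [member-id] {CLUSTER}7a7a0c89-d867-426f-b85d-66a298771791",
]

if __name__ == "__main__":
    for cluster, instances in group_members(SAMPLE).items():
        print(cluster)
        for instance, members in sorted(instances.items()):
            print("  ", instance, len(members), "member(s)")
```

For these sample lines the result contains a single cluster id with several cluster instance ids under it, and each member id starts with the cluster id followed by a UUID. This matches the assumption stated above but does not confirm how Stratos assigns the ids internally.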

I have attached the complete log snippet to this email.

Thanks

Martin

TID: [0] [STRATOS] [2015-10-29 09:51:28,402]  INFO {org.apache.stratos.cep.extension.FaultHandlingWindowProcessor}
-  Faulty member detected [member-id] di-000.sf-000-1x0.sf-000.domain1eedc6fe-6c7b-4c83-9198-da54e1f2d9e7
with [last time-stamp] 1446112224588 [time-out] 60000 milliseconds
...
TID: [0] [STRATOS] [2015-10-29 09:52:28,413]  INFO {org.apache.stratos.cep.extension.FaultHandlingWindowProcessor}
-  Faulty member detected [member-id] di-000.sf-000-1x0.sf-000.domain1eedc6fe-6c7b-4c83-9198-da54e1f2d9e7
with [last time-stamp] 1446112224588 [time-out] 60000 milliseconds
…
(The two entries above are the only occurrences of the “60000” time-out I can find in the log file. Note that both
reference the same member ID.)
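
As a cross-check of the two fault detections, the elapsed time between the reported [last time-stamp] and each log entry can be compared with the 60000 ms time-out. This is a small sketch under one explicit assumption: that the log timestamps are in UTC. The helper name elapsed_ms is hypothetical.

```python
from datetime import datetime, timedelta, timezone

LAST_TIMESTAMP_MS = 1446112224588  # [last time-stamp] from the log entries
TIMEOUT_MS = 60000                 # [time-out] from the log entries
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def elapsed_ms(log_time: str, last_ms: int) -> int:
    """Milliseconds between last_ms (epoch ms) and a log timestamp.

    Assumption: the log timestamp is in UTC.
    """
    logged = datetime.strptime(log_time, "%Y-%m-%d %H:%M:%S,%f").replace(
        tzinfo=timezone.utc
    )
    # Integer division of timedeltas avoids float rounding.
    return (logged - EPOCH) // timedelta(milliseconds=1) - last_ms


if __name__ == "__main__":
    for stamp in ("2015-10-29 09:51:28,402", "2015-10-29 09:52:28,413"):
        delta = elapsed_ms(stamp, LAST_TIMESTAMP_MS)
        print(stamp, delta, "ms elapsed, over time-out:", delta > TIMEOUT_MS)
```

Under the UTC assumption the first detection falls about 63.8 seconds after the last time-stamp and the second about 123.8 seconds after it, so both exceed the time-out for the same member.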
...
TID: [0] [STRATOS] [2015-10-29 09:52:36,837]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] di-000.sf-000-1x0.sf-000.domain1eedc6fe-6c7b-4c83-9198-da54e1f2d9e7
[network-partition-id] RegionOne [partition-id] RegionOne-partition-03 [group-id] null
...
TID: [0] [STRATOS] [2015-10-29 09:54:25,876]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member activated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] di-000.sf-000-1x0.sf-000.domaina5dd0936-42cb-4e81-88c3-e066eacf7a3d
[net│
..
TID: [0] [STRATOS] [2015-10-29 09:54:25,907]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing cluster activated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[instance-id] cvpc-sf-000-0x0-2 [application-id] di-000
..
TID: [0] [STRATOS] [2015-10-29 09:54:26,000]  INFO {org.apache.stratos.autoscaler.status.processor.group.GroupStatusActiveProcessor}
-  Sending group instance active for [group] cvpc-di-000-x0x [instance] di-000-1…
…
TID: [0] [STRATOS] [2015-10-29 09:54:32,359]  INFO {org.apache.stratos.autoscaler.monitor.component.GroupMonitor}
-  Creating a group instance of [application] di-000 [group] cvpc-sf-000-0x0 in order to satisfy
the demand on scaling for [parent-instance] di-000-1
…
TID: [0] [STRATOS] [2015-10-29 09:54:32,375]  INFO {org.apache.stratos.autoscaler.monitor.component.GroupMonitor}
-  Group [Instance context] cvpc-sf-000-0x0-5 has been added to [Group] cvpc-sf-000-0x0…
..
TID: [0] [STRATOS] [2015-10-29 09:54:32,473]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member created event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-5 [member-id] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58ac5af4
[insta┼
..

TID: [0] [STRATOS] [2015-10-29 09:55:44,649]  INFO {org.apache.stratos.autoscaler.rule.RuleLog}
-  [scale-up] Trying to scale up over max, hence not scaling up cluster itself and notifying
to parent for possible group scaling or app bursting. [cluster] cartridge-proxy.cartridge-proxy.cartridge-proxy.domain
[instance id]cartridge-proxy-1 [max] 1
..
TID: [0] [STRATOS] [2015-10-29 09:56:28,708]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member activated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-5 [member-id] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58ac5af4
[net├
TID: [0] [STRATOS] [2015-10-29 09:56:28,716]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberActivatedMessageProcessor}
-  Member activated: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain [member] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58ac5af4
..
TID: [0] [STRATOS] [2015-10-29 09:56:28,762]  INFO {org.apache.stratos.autoscaler.status.processor.group.GroupStatusActiveProcessor}
-  Sending group instance active for [group] cvpc-sf-000-0x0 [instance] cvpc-sf-000-0x0-5
TID: [0] [STRATOS] [2015-10-29 09:56:32,370]  INFO {org.apache.stratos.autoscaler.applications.topic.ApplicationsEventPublisher}
-  Publishing group instance terminating event: [application] di-000 [group] cvpc-sf-000-0x0
[instance] cvpc-sf-000-0x0-4
..
TID: [0] [STRATOS] [2015-10-29 09:56:32,408]  WARN {org.apache.stratos.autoscaler.status.processor.cluster.ClusterStatusActiveProcessor}
-  No possible state change found for [type]  [cluster] di-000.sf-000-1x0.sf-000.domain [instance]
cvpc-sf-000-0x0-4
TID: [0] [STRATOS] [2015-10-29 09:56:32,408]  INFO {org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor}
-  Starting to move all members in cluster [di-000.sf-000-1x0.sf-000.domain] Network Partition
[RegionOne], Partition [RegionOne-partition-03] to termination pending list <<<<<
??????
TID: [0] [STRATOS] [2015-10-29 09:56:32,485]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  member maintenance mode event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:32,496]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member in maintenance mode event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-1 [member-id] di-000.sf-000-1x0.sf-000.domaindf78c5c8-89f0-4f9e-8a70-b74cee8│
TID: [0] [STRATOS] [2015-10-29 09:56:32,501]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberMaintenanceModeProcessor}
-  Member updated as In_Maintenance: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domaindf78c5c8-89f0-4f9e-8a70-b74cee854c2d
TID: [0] [STRATOS] [2015-10-29 09:56:32,506]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  member maintenance mode event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:32,514]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member in maintenance mode event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-3 [member-id] di-000.sf-000-1x0.sf-000.domainbe316f53-89ae-4666-b553-8b54063│
TID: [0] [STRATOS] [2015-10-29 09:56:32,519]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberMaintenanceModeProcessor}
-  Member updated as In_Maintenance: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domainbe316f53-89ae-4666-b553-8b5406371377
TID: [0] [STRATOS] [2015-10-29 09:56:32,523]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  member maintenance mode event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:32,531]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member in maintenance mode event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-4 [member-id] di-000.sf-000-1x0.sf-000.domain7a7a0c89-d867-426f-b85d-66a2987│
TID: [0] [STRATOS] [2015-10-29 09:56:32,537]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberMaintenanceModeProcessor}
-  Member updated as In_Maintenance: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domain7a7a0c89-d867-426f-b85d-66a298771791
TID: [0] [STRATOS] [2015-10-29 09:56:32,540]  WARN {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  Member di-000.sf-000-1x0.sf-000.domain1eedc6fe-6c7b-4c83-9198-da54e1f2d9e7 does not exist
TID: [0] [STRATOS] [2015-10-29 09:56:32,540]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  member maintenance mode event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:32,548]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member in maintenance mode event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] di-000.sf-000-1x0.sf-000.domaina5dd0936-42cb-4e81-88c3-e066eac│
TID: [0] [STRATOS] [2015-10-29 09:56:32,553]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberMaintenanceModeProcessor}
-  Member updated as In_Maintenance: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domaina5dd0936-42cb-4e81-88c3-e066eacf7a3d
TID: [0] [STRATOS] [2015-10-29 09:56:34,349]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  member maintenance mode event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:34,359]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member in maintenance mode event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-5 [member-id] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58a│
TID: [0] [STRATOS] [2015-10-29 09:56:34,363]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberMaintenanceModeProcessor}
-  Member updated as In_Maintenance: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58ac5af4
TID: [0] [STRATOS] [2015-10-29 09:56:37,961]  INFO {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder}
-  Member Ready to shut down event adding status started
TID: [0] [STRATOS] [2015-10-29 09:56:37,970]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member ready to shut down event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-4 [member-id] di-000.sf-000-1x0.sf-000.domain7a7a0c89-d867-426f-b85d-66a29877│
TID: [0] [STRATOS] [2015-10-29 09:56:37,975]  INFO {org.apache.stratos.messaging.message.processor.topology.MemberReadyToShutdownMessageProcessor}
-  Member updated as Ready to shutdown: [service] sf-000 [cluster] di-000.sf-000-1x0.sf-000.domain
[member] di-000.sf-000-1x0.sf-000.domain7a7a0c89-d867-426f-b85d-66a298771791
...
TID: [0] [STRATOS] [2015-10-29 09:56:42,357]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-4 [member-id] di-000.sf-000-1x0.sf-000.domain7a7a0c89-d867-426f-b85d-66a298771791
[ne┤
…
TID: [0] [STRATOS] [2015-10-29 09:57:40,087]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-1 [member-id] di-000.sf-000-1x0.sf-000.domaindf78c5c8-89f0-4f9e-8a70-b74cee854c2d
[ne│
TID: [0] [STRATOS] [2015-10-29 09:57:44,284]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-2 [member-id] di-000.sf-000-1x0.sf-000.domaina5dd0936-42cb-4e81-88c3-e066eacf7a3d
[ne│
TID: [0] [STRATOS] [2015-10-29 09:58:37,458]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-5 [member-id] di-000.sf-000-1x0.sf-000.domain14850f7c-215b-43d3-8ae1-c2dc58ac5af4
[ne│
TID: [0] [STRATOS] [2015-10-29 09:58:41,467]  INFO {org.apache.stratos.cloud.controller.messaging.publisher.TopologyEventPublisher}
-  Publishing member terminated event: [service-name] sf-000 [cluster-id] di-000.sf-000-1x0.sf-000.domain
[cluster-instance-id] cvpc-sf-000-0x0-3 [member-id] di-000.sf-000-1x0.sf-000.domainbe316f53-89ae-4666-b553-8b5406371377
[ne│



