stratos-dev mailing list archives

From "Nirmal Fernando (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (STRATOS-706) member terminate event should log reason
Date Wed, 16 Jul 2014 23:39:05 GMT

    [ https://issues.apache.org/jira/browse/STRATOS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064337#comment-14064337 ]

Nirmal Fernando commented on STRATOS-706:
-----------------------------------------

On Thu, Jul 17, 2014 at 1:11 AM, Martin Eppel (JIRA) <jira@apache.org>


All the log lines you quoted are from the Cloud Controller (CC). All CC
does is provide an API to terminate instances. The caller of this API,
i.e. the auto-scaler, is the one that logs the reason for asking CC to
terminate an instance. Did you check the auto-scaler logs?
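
To illustrate that split of responsibilities, here is a minimal sketch in which the caller records the reason and then asks CC to terminate the member. Every name in it (TerminationLogSketch, CloudControllerClient, Reason, the key=value layout) is hypothetical and not taken from the Stratos code base. HEARTBEAT_LOSS and PORT_CHECK_TIMEOUT mirror the causes mentioned in the issue description; SCALE_DOWN is an assumed extra.

```java
import java.util.logging.Logger;

/** Hypothetical sketch: none of these names come from the Stratos code base. */
public class TerminationLogSketch {

    /** Illustrative reasons only; the real auto-scaler may distinguish others. */
    enum Reason { HEARTBEAT_LOSS, PORT_CHECK_TIMEOUT, SCALE_DOWN }

    /** Stand-in for the terminate API that the Cloud Controller exposes. */
    interface CloudControllerClient {
        void terminate(String memberId);
    }

    private static final Logger LOG = Logger.getLogger("autoscaler.sketch");

    /** Builds one grep-friendly line with fixed key=value fields. */
    static String formatTermination(String memberId, String clusterId, Reason reason) {
        return String.format("event=MEMBER_TERMINATE member=%s cluster=%s reason=%s",
                memberId, clusterId, reason);
    }

    /** The caller logs the reason first, then asks the Cloud Controller to terminate. */
    static void terminateWithReason(CloudControllerClient cc, String memberId,
                                    String clusterId, Reason reason) {
        LOG.info(formatTermination(memberId, clusterId, reason));
        cc.terminate(memberId);
    }

    public static void main(String[] args) {
        // A lambda stands in for the real Cloud Controller client.
        terminateWithReason(id -> System.out.println("terminate requested: " + id),
                "member-1", "cluster-1", Reason.HEARTBEAT_LOSS);
    }
}
```

With a fixed field layout like this, a search for "reason=HEARTBEAT_LOSS" or "member=member-1" in the caller's log would find the matching terminations.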




-- 
Best Regards,
Nirmal

Nirmal Fernando.
PPMC Member & Committer of Apache Stratos,
Senior Software Engineer, WSO2 Inc.

Blog: http://nirmalfdo.blogspot.com/


> member terminate event should log reason
> ----------------------------------------
>
>                 Key: STRATOS-706
>                 URL: https://issues.apache.org/jira/browse/STRATOS-706
>             Project: Stratos
>          Issue Type: Bug
>          Components: Autoscaler
>    Affects Versions: 4.0.0
>            Reporter: Martin Eppel
>             Fix For: 4.0.1
>
>
> When Stratos terminates a member, it must log the reason. Ideally the logging should
be systematic enough that one can grep by severity, by member, by event type, or by
some other useful category.
> The justification for this defect is that it will greatly improve debugging and troubleshooting
capabilities. Without such logging it is very difficult to debug member terminations.
>  
> For example, consider this sequence in the stratos log file:
>  
> ===================
> TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
-  Received an instance spawn request : MemberContext [memberId=null, nodeId=null, clusterId=cisco-gilan-appmgr-1.cisco-gil,
cartridgeType=null, privateIpAddress=null, publicIpAddress=null, allocatedIpAddress=null,
initTime=1405418328649, lbClusterId=null, networkPartitionId=OAM1] {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
-  Payload: SERVICE_NAME=cisco-gilan-appmgr,HOST_NAME=cisco-gilan-appmgr-1.qmog.cisco.com,MULTITENANT=false,TENANT_ID=-1234,TENANT_RANGE=-1234,CARTRIDGE_ALIAS=cisco-gilan-appmgr-1,CLUSTER_ID=cisco-gilan-appmgr-1.cisco-gil,CARTRIDGE_KEY=o1jbiPPmPWBgyNVM,DEPLOYMENT=default,REPO_URL=null,PORTS=9482,PUPPET_IP=PUPPET_IP,PUPPET_HOSTNAME=PUPPET_HOSTNAME,PUPPET_ENV=PUPPET_ENV,HEARTBEAT_AUTHKEY=20c9629a87f53ecdb5278d2ddb5a9d42,TRUSTSTORE_PASSWORD=wso2carbon,CEP_PORT=7611,MONITORING_SERVER_SECURE_PORT=0,MB_PORT=61616,OPENSTACK_COMPUTE_DNS=10.58.10.82,MB_IP=octl-01.qmog.cisco.com,QSB_PUPPET_ENVIR=,CEP_IP=octl-01.qmog.cisco.com,VSM_USER=admin,VEM_IP=192.168.66.43,ENABLE_DATA_PUBLISHER=false,MONITORING_SERVER_ADMIN_PASSWORD=xxxx,MONITORING_SERVER_IP=octl-01.qmog.cisco.com,VEM_USER=ubuntu,VEM_PWD=ubuntu,COMMIT_ENABLED=false,MONITORING_SERVER_ADMIN_USERNAME=xxxx,CERT_TRUSTSTORE=/opt/apache-stratos-cartridge-agent/security/client-truststore.jks,VSM_PWD=Starent123!,VSM_IP=192.168.66.2,MONITORING_SERVER_PORT=0,APPMGR_GITREPO=ssh://jenapper@10.58.10.189/home/jenapper/code/eccentrica.git,MEMBER_ID=cisco-gilan-appmgr-1.cisco-gil7ef7327f-2bb2-4768-820f-d064de29aa59,LB_CLUSTER_ID=null,NETWORK_PARTITION_ID=OAM1,PARTITION_ID=RegionOne-AZ-1
{org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> TID: [0] [STRATOS] [2014-07-15 09:58:55,888]  INFO {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
-  Member is terminated: MemberContext [memberId=cisco-gilan-appmgr-1.cisco-gil407f5bdc-aad2-4234-80fc-6cdf17be6192,
nodeId=RegionOne/89433818-21ed-48d4-bd8f-c396ab30f6d2, clusterId=cisco-gilan-appmgr-1.cisco-gil,
cartridgeType=cisco-gilan-appmgr, privateIpAddress=192.168.66.1, publicIpAddress=null, allocatedIpAddress=null,
initTime=1405417410736, lbClusterId=null, networkPartitionId=OAM1] {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}
> ===================
>  
> The problem is that Stratos gives no indication of why it is doing this [1]. Stratos
should be enhanced so that the above message gives some indication of *why* the member is
being terminated (loss of heartbeats, timeout on port knocking, etc.). This is needed
as Apache Stratos expands its user base.
> This issue has high priority as it affects the efficiency of troubleshooting and system
stability.
>  
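
As a rough illustration of the kind of filtering the description asks for, the sketch below parses a log line in the layout of the sample above and picks out the level, logger and message. The regular expression is inferred from that sample only, so other deployments may use a different layout. The class and method names are hypothetical, and the sample line assumes each entry sits on a single line in the real log file.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: parses lines shaped like the quoted Stratos log sample. */
public class StratosLogLineSketch {

    // Layout inferred from the sample: TID: [n] [SERVER] [timestamp] LEVEL {logger} - message
    private static final Pattern LINE = Pattern.compile(
            "^TID: \\[(-?\\d+)\\] \\[(\\w+)\\] \\[([^\\]]+)\\]\\s+(\\w+)\\s+\\{([^}]+)\\}\\s*-\\s*(.*)$");

    /** Fields of one parsed log line. */
    static final class Entry {
        final String timestamp;
        final String level;
        final String logger;
        final String message;

        Entry(String timestamp, String level, String logger, String message) {
            this.timestamp = timestamp;
            this.level = level;
            this.logger = logger;
            this.message = message;
        }
    }

    /** Returns the parsed entry, or empty if the line does not match the assumed layout. */
    static Optional<Entry> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Entry(m.group(3), m.group(4), m.group(5), m.group(6)));
    }

    /** True for the "Member is terminated" message seen in the sample. */
    static boolean isTermination(Entry e) {
        return e.message.startsWith("Member is terminated");
    }

    public static void main(String[] args) {
        String sample = "TID: [0] [STRATOS] [2014-07-15 09:58:55,888]  INFO "
                + "{org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} "
                + "-  Member is terminated: MemberContext [memberId=example]";
        parse(sample).filter(StratosLogLineSketch::isTermination)
                .ifPresent(e -> System.out.println(e.timestamp + " " + e.level + " " + e.message));
    }
}
```

Filtering by member or by reason in the same way would only work once the termination message itself carries those fields.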



--
This message was sent by Atlassian JIRA
(v6.2#6252)
