cloudstack-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-8485) listAPIs are taking too long to return results
Date Thu, 05 Nov 2015 13:50:27 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991673#comment-14991673 ]

ASF GitHub Bot commented on CLOUDSTACK-8485:
--------------------------------------------

Github user DaanHoogland commented on the pull request:

    https://github.com/apache/cloudstack/pull/1021#issuecomment-154062799
  
    I am not fine with this change, but I must be honest and report the regression test results from the bubble: they all passed.
    
    ```
    Test router internal advanced zone ... === TestName: test_02_router_internal_adv | Status : SUCCESS ===
    ok
    Test restart network ... === TestName: test_03_restart_network_cleanup | Status : SUCCESS ===
    ok
    Test router basic setup ... === TestName: test_05_router_basic | Status : SUCCESS ===
    ok
    Test router advanced setup ... === TestName: test_06_router_advanced | Status : SUCCESS ===
    ok
    Test stop router ... === TestName: test_07_stop_router | Status : SUCCESS ===
    ok
    Test start router ... === TestName: test_08_start_router | Status : SUCCESS ===
    ok
    Test reboot router ... === TestName: test_09_reboot_router | Status : SUCCESS ===
    ok
    test_privategw_acl (integration.smoke.test_privategw_acl.TestPrivateGwACL) ... === TestName: test_privategw_acl | Status : SUCCESS ===
    ok
    Test reset virtual machine on reboot ... === TestName: test_01_reset_vm_on_reboot | Status : SUCCESS ===
    ok
    Test advanced zone virtual router ... === TestName: test_advZoneVirtualRouter | Status : SUCCESS ===
    ok
    Test Deploy Virtual Machine ... === TestName: test_deploy_vm | Status : SUCCESS ===
    ok
    Test Multiple Deploy Virtual Machine ... === TestName: test_deploy_vm_multiple | Status : SUCCESS ===
    ok
    Test Stop Virtual Machine ... === TestName: test_01_stop_vm | Status : SUCCESS ===
    ok
    Test Start Virtual Machine ... === TestName: test_02_start_vm | Status : SUCCESS ===
    ok
    Test Reboot Virtual Machine ... === TestName: test_03_reboot_vm | Status : SUCCESS ===
    ok
    Test destroy Virtual Machine ... === TestName: test_06_destroy_vm | Status : SUCCESS ===
    ok
    Test recover Virtual Machine ... === TestName: test_07_restore_vm | Status : SUCCESS ===
    ok
    Test migrate VM ... === TestName: test_08_migrate_vm | Status : SUCCESS ===
    ok
    Test destroy(expunge) Virtual Machine ... === TestName: test_09_expunge_vm | Status : SUCCESS ===
    ok
    Test to create service offering ... === TestName: test_01_create_service_offering | Status : SUCCESS ===
    ok
    Test to update existing service offering ... === TestName: test_02_edit_service_offering | Status : SUCCESS ===
    ok
    Test to delete service offering ... === TestName: test_03_delete_service_offering | Status : SUCCESS ===
    ok
    Test for delete account ... === TestName: test_delete_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for admin account ... === TestName: test_public_ip_admin_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for user account ... === TestName: test_public_ip_user_account | Status : SUCCESS ===
    ok
    Test for release public IP address ... === TestName: test_releaseIP | Status : SUCCESS ===
    ok
    Test create VPC offering ... === TestName: test_01_create_vpc_offering | Status : SUCCESS ===
    ok
    Test VPC offering without load balancing service ... === TestName: test_03_vpc_off_without_lb | Status : SUCCESS ===
    ok
    Test VPC offering without static NAT service ... === TestName: test_04_vpc_off_without_static_nat | Status : SUCCESS ===
    ok
    Test VPC offering without port forwarding service ... === TestName: test_05_vpc_off_without_pf | Status : SUCCESS ===
    ok
    Test VPC offering with invalid services ... === TestName: test_06_vpc_off_invalid_services | Status : SUCCESS ===
    ok
    Test update VPC offering ... === TestName: test_07_update_vpc_off | Status : SUCCESS ===
    ok
    Test list VPC offering ... === TestName: test_08_list_vpc_off | Status : SUCCESS ===
    ok
    test_09_create_redundant_vpc_offering (integration.component.test_vpc_offerings.TestVPCOffering) ... === TestName: test_09_create_redundant_vpc_offering | Status : SUCCESS ===
    ok
    Test start/stop of router after addition of one guest network ... === TestName: test_01_start_stop_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test reboot of router after addition of one guest network ... === TestName: test_02_reboot_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to change service offering of router after addition of one guest network ... === TestName: test_04_chg_srv_off_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test destroy of router after addition of one guest network ... === TestName: test_05_destroy_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to stop and start router after creation of VPC ... === TestName: test_01_stop_start_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Test to reboot the router after creating a VPC ... === TestName: test_02_reboot_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Tests to change service offering of the Router after ... === TestName: test_04_change_service_offerring_vpc | Status : SUCCESS ===
    ok
    Test to destroy the router after creating a VPC ... === TestName: test_05_destroy_router_after_creating_vpc | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 42 tests in 8451.564s
    
    OK
    ```
    and
    ```
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network and check default routes ... === TestName: test_02_redundant_VPC_default_routes | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policy on RouterVM ... === TestName: test_02_routervm_iptables_policies | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policies on VPC router ... === TestName: test_01_single_VPC_iptables_policies | Status : SUCCESS ===
    ok
    Stop existing router, add a PF rule and check we can access the VM ... === TestName: test_isolate_network_FW_PF_default_routes | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_RVR_Network_FW_PF_SSH_default_routes | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test nics after destroy ... === TestName: test_01_VPC_nics_after_destroy | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test default routes ... === TestName: test_02_VPC_default_routes | Status : SUCCESS ===
    ok
    Check the password file in the Router VM ... === TestName: test_isolate_network_password_server | Status : SUCCESS ===
    ok
    Check that the /etc/dhcphosts.txt doesn't contain duplicate IPs ... === TestName: test_router_dhcphosts | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with source NAT ... === TestName: test_01_create_lb_rule_src_nat | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with non source NAT ... === TestName: test_02_create_lb_rule_non_nat | Status : SUCCESS ===
    ok
    Test for assign & removing load balancing rule ... === TestName: test_assign_and_removal_lb | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test02_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's ... === TestName: test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test SSVM Internals ... === TestName: test_03_ssvm_internals | Status : SUCCESS ===
    ok
    Test CPVM Internals ... === TestName: test_04_cpvm_internals | Status : SUCCESS ===
    ok
    Test stop SSVM ... === TestName: test_05_stop_ssvm | Status : SUCCESS ===
    ok
    Test stop CPVM ... === TestName: test_06_stop_cpvm | Status : SUCCESS ===
    ok
    Test reboot SSVM ... === TestName: test_07_reboot_ssvm | Status : SUCCESS ===
    ok
    Test reboot CPVM ... === TestName: test_08_reboot_cpvm | Status : SUCCESS ===
    ok
    Test destroy SSVM ... === TestName: test_09_destroy_ssvm | Status : SUCCESS ===
    ok
    Test destroy CPVM ... === TestName: test_10_destroy_cpvm | Status : SUCCESS ===
    ok
    Test for port forwarding on source NAT ... === TestName: test_01_port_fwd_on_src_nat | Status : SUCCESS ===
    ok
    Test for port forwarding on non source NAT ... === TestName: test_02_port_fwd_on_non_src_nat | Status : SUCCESS ===
    ok
    Test for reboot router ... === TestName: test_reboot_router | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_1_static_nat_rule | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_2_nat_rule | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_3_Load_Balancer_Rule | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 29 tests in 13087.859s
    
    OK
    ```
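
For anyone who wants to check a run like the two above without reading every line, here is a minimal sketch that tallies the `=== TestName: ... | Status : ... ===` markers. It is not part of the pull request; the function name `tally_results` and the sample text are assumptions made for illustration, and the pattern tolerates the line breaks that mail wrapping can insert inside a marker.

```python
import re
from collections import Counter

# Matches the result markers seen in the logs above. \s also matches newlines,
# so a marker that was wrapped across two lines is still recognised.
RESULT_RE = re.compile(
    r"===\s+TestName:\s*(\S+)\s*\|\s*Status\s*:\s*(\w+)\s*==="
)


def tally_results(log_text):
    """Return (Counter of status values, names of tests that did not succeed)."""
    statuses = Counter()
    not_passed = []
    for name, status in RESULT_RE.findall(log_text):
        statuses[status] += 1
        if status != "SUCCESS":
            not_passed.append(name)
    return statuses, not_passed


# Two markers copied from the first run; the second one is deliberately wrapped.
sample = """\
Test stop router ... === TestName: test_07_stop_router | Status : SUCCESS ===
ok
Test router advanced setup ... === TestName: test_06_router_advanced | Status : SUCCESS
===
ok
"""

statuses, not_passed = tally_results(sample)
print(dict(statuses), not_passed)
```

Comparing the tally with the `Ran N tests` summary line is a quick way to confirm that no marker was missed.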


> listAPIs are taking too long to return results
> ----------------------------------------------
>
>                 Key: CLOUDSTACK-8485
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8485
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.5.1, 4.6.0
>            Reporter: Sowmya Krishnan
>            Assignee: Koushik Das
>             Fix For: 4.6.0
>
>
> listAPIs are taking significantly longer than before (4.2.x).
> I tried out a few listAPI calls using a simulator setup with ~10K VMs and 8K routers. Here are a few results:
> listVirtualMachines takes > 25 sec to return with pagesize set to 50, compared with 2 sec in earlier releases such as 4.2.
> listVolumes with pagesize = 1000 took more than 10 mins and finally timed out.
> Further observations show that a lot of slow queries are also being logged in catalina.out and in the MySQL slow query log. I am not sure whether this is what is degrading DB performance and, in turn, slowing down the listAPIs.
> Here's a sample of slow queries from catalina.out:
> Mon May 11 07:31:15 UTC 2015 INFO: Profiler Event: [SLOW QUERY] at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) duration: 3305 ms, connection-id: 637759, statement-id: 3218312, resultset-id: 0, message: Slow query (exceeded 2,000 ms, duration: 3,305 ms):SELECT user_vm_details.id, user_vm_details.vm_id, user_vm_details.name, user_vm_details.value, user_vm_details.display FROM user_vm_details WHERE user_vm_details.vm_id = 9117
>
> Mon May 11 07:31:15 UTC 2015 INFO: Profiler Event: [SLOW QUERY] at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) duration: 3305 ms, connection-id: 637843, statement-id: 3218311, resultset-id: 0, message: Slow query (exceeded 2,000 ms, duration: 3,305 ms):SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address, host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address, host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type, host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state, host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size, host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed FROM host WHERE host.id = 345  AND host.removed IS NULL
>
> Mon May 11 07:31:17 UTC 2015 INFO: Profiler Event: [SLOW QUERY] at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) duration: 6458 ms, connection-id: 637623, statement-id: 3218243, resultset-id: 0, message: Slow query (exceeded 2,000 ms, duration: 6,458 ms):SELECT storage_pool_host_ref.host_id FROM storage_pool_host_ref  INNER JOIN host ON storage_pool_host_ref.host_id=host.id WHERE storage_pool_host_ref.pool_id = 197  AND  (host.status = 'Up'  AND host.resource_state = 'Enabled' )
>
> Mon May 11 07:31:17 UTC 2015 INFO: Profiler Event: [SLOW QUERY] at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) duration: 2402 ms, connection-id: 637754, statement-id: 3218371, resultset-id: 0, message: Slow query (exceeded 2,000 ms, duration: 2,402 ms):SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address, host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address, host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type, host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state, host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size, host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed FROM host WHERE host.id = 669  AND host.removed IS NULL
> The following is from MySQL slow query log:
> SELECT vm_instance.id, vm_instance.name, vm_instance.vnc_password, vm_instance.proxy_id, vm_instance.proxy_assign_time, vm_instance.state, vm_instance.private_ip_address, vm_instance.instance_name, vm_instance.vm_template_id, vm_instance.guest_os_id, vm_instance.host_id, vm_instance.last_host_id, vm_instance.pod_id, vm_instance.private_mac_address, vm_instance.data_center_id, vm_instance.vm_type, vm_instance.ha_enabled, vm_instance.display_vm, vm_instance.limit_cpu_use, vm_instance.update_count, vm_instance.created, vm_instance.removed, vm_instance.update_time, vm_instance.domain_id, vm_instance.account_id, vm_instance.service_offering_id, vm_instance.reservation_id, vm_instance.hypervisor_type, vm_instance.dynamically_scalable, vm_instance.uuid, vm_instance.disk_offering_id, vm_instance.power_state, vm_instance.power_state_update_time, vm_instance.power_state_update_count, vm_instance.power_host FROM vm_instance WHERE vm_instance.instance_name = _binary'r-16916-VM'  AND vm_instance.removed IS NULL  ORDER BY RAND() LIMIT 1;
> # User@Host: cloud[cloud] @ x3 [10.81.28.128]  Id: 637881
> # Query_time: 8.193784  Lock_time: 0.000107 Rows_sent: 1  Rows_examined: 19935
> SET timestamp=1431329557;
>
> SET timestamp=1431329601;
> SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address, host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address, host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type, host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state, host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size, host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed FROM host WHERE host.id = 913 AND host.removed IS NULL;
> # User@Host: cloud[cloud] @ x3 [10.81.28.128]  Id: 637865
> # Query_time: 5.861241  Lock_time: 0.000139 Rows_sent: 1  Rows_examined: 1
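
To see which tables dominate a catalina.out excerpt like the one quoted in the issue, the `Slow query (exceeded ..., duration: ...)` messages can be summarised with a short script. This is only a sketch and not part of the issue or its fix: the helper names `slow_queries` and `worst_per_table` are hypothetical, and the sample shortens the column lists while keeping the durations and tables as reported.

```python
import re

# Captures the reported duration and the first table named after FROM
# in each "Slow query (...)" message.
SLOW_QUERY_RE = re.compile(
    r"Slow query \(exceeded [\d,]+ ms, duration: ([\d,]+) ms\):\s*"
    r"SELECT .*? FROM (\w+)"
)


def slow_queries(log_text):
    """Return a list of (table, duration_ms) tuples in log order."""
    # Collapse all whitespace first so mail line wrapping cannot split a message.
    flat = " ".join(log_text.split())
    return [
        (table, int(duration.replace(",", "")))
        for duration, table in SLOW_QUERY_RE.findall(flat)
    ]


def worst_per_table(entries):
    """Return the longest reported duration (ms) for each table."""
    worst = {}
    for table, ms in entries:
        worst[table] = max(ms, worst.get(table, 0))
    return worst


# Abbreviated sample based on the excerpt in the issue (column lists shortened).
sample = """
message: Slow query (exceeded 2,000 ms, duration: 3,305 ms):SELECT user_vm_details.id FROM user_vm_details
WHERE user_vm_details.vm_id = 9117Mon May 11 07:31:15 UTC 2015 INFO: Profiler Event: [SLOW QUERY]
message: Slow query (exceeded 2,000 ms, duration: 3,305 ms):SELECT host.id, host.name
FROM host WHERE host.id = 345  AND host.removed IS NULL
message: Slow query (exceeded 2,000 ms, duration: 6,458 ms):SELECT storage_pool_host_ref.host_id FROM
storage_pool_host_ref  INNER JOIN host ON storage_pool_host_ref.host_id=host.id
message: Slow query (exceeded 2,000 ms, duration: 2,402 ms):SELECT host.id, host.name
FROM host WHERE host.id = 669  AND host.removed IS NULL
"""

entries = slow_queries(sample)
print(worst_per_table(entries))
```

Only the first table after `FROM` is recorded, so a join such as the `storage_pool_host_ref` query is attributed to its leading table.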



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
