ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-12657) Cluster creates fail on larger deployments with SQL Azure DB
Date Sat, 08 Aug 2015 21:08:45 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14663180#comment-14663180 ]

Hudson commented on AMBARI-12657:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.1 #349 (See [https://builds.apache.org/job/Ambari-branch-2.1/349/])
AMBARI-12657 - Cluster creates fail on larger deployments with SQL Azure DB (jonathanhurley)
(jhurley: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=3ba078d539d1010271592eb55d1b42e3f40ab5ea)
* ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java
* ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java
* ambari-server/src/main/java/org/apache/ambari/server/serveraction/ServerActionExecutor.java


> Cluster creates fail on larger deployments with SQL Azure DB
> ------------------------------------------------------------
>
>                 Key: AMBARI-12657
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12657
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Critical
>             Fix For: 2.1.1
>
>         Attachments: AMBARI-12657.patch
>
>
> We started doing larger cluster creates (48 worker nodes) with SQL Azure DB as the Ambari
> DB, and we are seeing the below HTTP GET requests time out on the client side (even after
> retries), resulting in cluster create failures (15%). This is a tracking Jira to resolve
> the CRUD failures.
> What I’m seeing is that DB CPU usage goes above 50% in some of my experiments for 48-node
> clusters. This might explain why SQL is running slow.
> end_time                 avg_cpu_percent  avg_data_io_percent  avg_log_write_percent  avg_memory_usage_percent
> 2015-08-05 18:51:24.153  40.89            0.00                 0.62                   0.67
> 2015-08-05 18:51:09.107  41.86            0.00                 1.49                   0.67
> 2015-08-05 18:50:54.090  24.36            0.00                 0.08                   0.67
> 2015-08-05 18:50:38.763  43.16            0.00                 0.57                   0.67
> 2015-08-05 18:50:23.700  65.03            0.00                 0.51                   0.67
> 2015-08-05 18:50:07.840  28.57            0.00                 0.45                   0.67
> 2015-08-05 18:49:49.480  39.78            0.00                 0.42                   0.67
> 2015-08-05 18:49:34.383  28.14            0.00                 0.43                   0.67
> Most expensive queries in terms of CPU time are below. 
> Basically, it’s this one query which consumes most of the CPU. Query plan is also attached.
> {code}
> SELECT DISTINCT t0.request_id
> FROM host_role_command t0
> WHERE NOT EXISTS (
>   SELECT @P0 FROM host_role_command t1
>   WHERE (t1.status IN (@P1,@P2,@P3,@P4,@P5,@P6,@P7,@P8,@P9)))
> ORDER BY t0.request_id ASC
> {code}
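
Editorial sketch (not part of the original report): in the query as quoted above, the NOT EXISTS subquery does not reference t0, so it evaluates the same way for every outer row. The Python snippet below reproduces that shape against an in-memory SQLite database and compares it with a variant correlated on request_id. The table layout (task_id column), the status values, and the correlated variant are assumptions made for illustration only; this is not the change committed for AMBARI-12657, and SQLite is standing in for SQL Azure purely to show the query semantics.

```python
import sqlite3


def distinct_request_ids(rows, in_progress, correlated):
    """Run the quoted query shape against a throwaway SQLite table.

    rows        -- iterable of (request_id, status) tuples (hypothetical data)
    in_progress -- status values bound to the IN (...) list (hypothetical)
    correlated  -- if True, add t1.request_id = t0.request_id to the subquery
                   (an illustrative variant, not the committed fix)
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal schema: only request_id and status appear in the report.
    conn.execute(
        "CREATE TABLE host_role_command ("
        "task_id INTEGER PRIMARY KEY, request_id INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO host_role_command (request_id, status) VALUES (?, ?)",
        rows,
    )
    placeholders = ",".join("?" for _ in in_progress)
    correlation = "t1.request_id = t0.request_id AND " if correlated else ""
    sql = (
        "SELECT DISTINCT t0.request_id FROM host_role_command t0 "
        "WHERE NOT EXISTS (SELECT 1 FROM host_role_command t1 "
        f"WHERE {correlation}t1.status IN ({placeholders})) "
        "ORDER BY t0.request_id ASC"
    )
    result = [row[0] for row in conn.execute(sql, list(in_progress))]
    conn.close()
    return result


# Hypothetical sample data: request 2 still has a command in progress.
SAMPLE_ROWS = [
    (1, "COMPLETED"),
    (1, "COMPLETED"),
    (2, "IN_PROGRESS"),
    (2, "COMPLETED"),
    (3, "FAILED"),
]
SAMPLE_IN_PROGRESS = ["IN_PROGRESS", "QUEUED"]

if __name__ == "__main__":
    # Uncorrelated shape (as quoted): one in-progress row anywhere hides every request.
    print(distinct_request_ids(SAMPLE_ROWS, SAMPLE_IN_PROGRESS, correlated=False))
    # Correlated variant: only requests with an in-progress command are excluded.
    print(distinct_request_ids(SAMPLE_ROWS, SAMPLE_IN_PROGRESS, correlated=True))
```

With this sample data the uncorrelated form returns no request IDs at all, because a single in-progress row anywhere in the table makes NOT EXISTS false for every outer row, while the correlated variant drops only request 2. Which behavior Ambari intended is not stated in the report.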



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
