ambari-issues mailing list archives

From "Jonathan Hurley (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-16830) Desired Configuration Cache Expiration Caused 10,000's of Database Hits In Large Deployments
Date Tue, 24 May 2016 02:02:13 GMT
Jonathan Hurley created AMBARI-16830:
----------------------------------------

             Summary: Desired Configuration Cache Expiration Caused 10,000's of Database Hits In Large Deployments
                 Key: AMBARI-16830
                 URL: https://issues.apache.org/jira/browse/AMBARI-16830
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.2.2
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Blocker
             Fix For: 2.4.0


In large deployments where the number of hosts multiplied by the number of components is
large (10,000, for example), the {{ConfigHelper.isStale()}} method can make tens of thousands
of database queries every minute.

Consider a 3-minute trace:

{code}
server.persistence.properties.eclipselink.profiler=PerformanceMonitor
{code}

{code:title=Time = 3 minutes}
Counter:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null
   11,716

Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null 
  80,520,541,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:ObjectBuilding
   19,741,257,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:QueryPreparation
   414,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:RowFetch
   6,032,673,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:SqlGeneration
   79,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:SqlPrepare
   232,532,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:StatementExecute
   33,624,662,000
{code}
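
To put the counters above in perspective, the following is a small arithmetic sketch. It assumes the profiler's timer values are reported in nanoseconds and that the window is exactly 3 minutes; the class and method names are made up for illustration and are not part of Ambari or EclipseLink.

```java
// Hypothetical helper for reading the trace above; not Ambari or EclipseLink code.
class TraceArithmetic {

    /** Average number of queries issued per minute over the trace window. */
    static double queriesPerMinute(long queryCount, double windowMinutes) {
        return queryCount / windowMinutes;
    }

    /** Mean wall time per query in milliseconds, ASSUMING the timer is in nanoseconds. */
    static double meanMillis(long totalNanos, long queryCount) {
        return totalNanos / 1e6 / queryCount;
    }

    /** Fraction of the trace window spent inside the query, under the same assumption. */
    static double fractionOfWindow(long totalNanos, double windowMinutes) {
        return (totalNanos / 1e9) / (windowMinutes * 60.0);
    }

    public static void main(String[] args) {
        long count = 11_716L;               // Counter:ReadAllQuery value from the trace
        long totalNanos = 80_520_541_000L;  // Timer:ReadAllQuery value from the trace
        double minutes = 3.0;               // length of the trace window

        System.out.printf("queries/minute: %.0f%n", queriesPerMinute(count, minutes));
        System.out.printf("mean ms/query:  %.2f%n", meanMillis(totalNanos, count));
        System.out.printf("share of window: %.0f%%%n", 100 * fractionOfWindow(totalNanos, minutes));
    }
}
```

Under those assumptions the trace works out to roughly 3,900 queries per minute at about 6.9 ms each, or close to 45% of the 3-minute window spent in this one query.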

The {{ClusterConfigMappingEntity:null}} query is executed more than 10,000 times. Whether or
not that number exceeds the size of the stale-config cache, the result is a massive
performance delay in the Jetty threads: the database is being hammered, and other
{{PropertyProviders}} must wait until the queries finish.

- Setting the {{server.cache.isStale.expiration}} value to 28800 improves the behavior of
the system:
-- Ambari goes from totally unusable to usable.
-- Startup is still an issue because the code still has to make 10,000's of calls, but those
flatten out once the cache is populated. Until then, Ambari is unresponsive.
-- After startup, you can use Ambari to send commands and browse around without delay.
-- If you change a config, however, the problem returns: the cache is emptied and 10,000
more calls are made. Ambari is unresponsive until the cache is repopulated.
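
One way to picture why the calls multiply is that every host component repeats the same per-cluster lookup. The sketch below is not Ambari's code and not necessarily the eventual fix; {{ClusterMappingCache}} and its loader are hypothetical stand-ins. It only illustrates memoizing the per-cluster mapping read so that a cache flush costs one reload per cluster rather than one per host component.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; these are not Ambari's actual classes.
class ClusterMappingCache {
    // Stand-in for a DAO call such as getClusterConfigMappingEntitiesByCluster(clusterId).
    private final Function<Long, List<String>> loader;
    private final Map<Long, List<String>> byCluster = new ConcurrentHashMap<>();

    ClusterMappingCache(Function<Long, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns the cached mappings; the loader runs at most once per cluster until invalidated. */
    List<String> get(long clusterId) {
        return byCluster.computeIfAbsent(clusterId, loader);
    }

    /** Called when a configuration changes, so that the next read reloads once. */
    void invalidate(long clusterId) {
        byCluster.remove(clusterId);
    }

    public static void main(String[] args) {
        AtomicInteger dbHits = new AtomicInteger();
        ClusterMappingCache cache = new ClusterMappingCache(id -> {
            dbHits.incrementAndGet(); // counts simulated database reads
            return Collections.singletonList("core-site:selected");
        });

        // Simulate 10,000 host-component staleness checks against one cluster.
        for (int i = 0; i < 10_000; i++) {
            cache.get(1L);
        }
        cache.invalidate(1L); // simulated config change
        cache.get(1L);

        System.out.println("simulated database reads: " + dbHits.get());
    }
}
```

In this toy run the 10,000 checks share a single simulated read, and the config change adds one more. Whether the real {{isStale()}} path can be restructured this way depends on details not shown in this report.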

There are a ton of threads stuck at:
{code}
"qtp-ambari-client-275" prio=10 tid=0x00007f9de801b800 nid=0x6735 waiting for monitor entry [0x00007f9dd66e3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.ambari.server.controller.internal.AbstractProviderModule.checkInit(AbstractProviderModule.java:805)
	- waiting to lock <0x00007fa0744cc3b0> (a org.apache.ambari.server.controller.internal.DefaultProviderModule)
	at org.apache.ambari.server.controller.internal.AbstractProviderModule.getMetricsServiceType(AbstractProviderModule.java:275)
{code}

They're all blocked by {{qtp-ambari-client-247}}:
{code}
"qtp-ambari-client-247" prio=10 tid=0x00007f9dd8001000 nid=0x5915 runnable [0x00007f9ddd0c2000]
   java.lang.Thread.State: RUNNABLE
	at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2961)
	at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2159)
	at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1964)
	at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3316)
	at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:463)
	at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3040)
	at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2288)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2681)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2551)
	- locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
	- locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
	at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
	- locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
	at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:353)
	at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeSelect(DatabaseAccessor.java:1009)
	at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:644)
	at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:560)
	at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2055)
	at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:570)
	at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242)
	at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228)
	at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelectCall(DatasourceCallQueryMechanism.java:299)
	at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectAllRows(DatasourceCallQueryMechanism.java:694)
	at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRowsFromTable(ExpressionQueryMechanism.java:2740)
	at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRows(ExpressionQueryMechanism.java:2693)
	at org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:559)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1175)
	at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:904)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1134)
	at org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:460)
	at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1222)
	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2896)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1857)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1839)
	at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1804)
	at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258)
	at org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:473)
	at org.apache.ambari.server.orm.dao.DaoUtils.selectList(DaoUtils.java:62)
	at org.apache.ambari.server.orm.dao.ClusterDAO.getClusterConfigMappingEntitiesByCluster(ClusterDAO.java:240)
{code}
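
The pattern in the two dumps above (many threads {{BLOCKED}} on one monitor while a single thread does slow work, presumably while holding that monitor further down its truncated stack) can be reproduced in miniature. The sketch below is a generic Java illustration with made-up names, not Ambari code; a latch stands in for the slow JDBC read.

```java
import java.util.concurrent.CountDownLatch;

// Generic illustration of monitor contention; names are hypothetical.
class MonitorContentionDemo {

    /** Returns the state observed for a second thread while the first holds the monitor. */
    static Thread.State observeWaiterState() throws InterruptedException {
        final Object monitor = new Object();
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        final CountDownLatch releaseHolder = new CountDownLatch(1);

        // Plays the role of the thread doing the slow database read inside the lock.
        Thread holder = new Thread(() -> {
            synchronized (monitor) {
                holderHasLock.countDown();
                try {
                    releaseHolder.await(); // stands in for the long-running query
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        // Plays the role of a request thread that needs the same monitor.
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // quick work that cannot start until the holder exits
            }
        }, "waiter");

        holder.start();
        holderHasLock.await(); // make sure the holder owns the monitor first
        waiter.start();

        // Poll until the waiter is parked on the monitor.
        Thread.State state = waiter.getState();
        while (state != Thread.State.BLOCKED) {
            Thread.sleep(1);
            state = waiter.getState();
        }

        releaseHolder.countDown(); // "query" finishes, monitor is released
        holder.join();
        waiter.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter state while holder works: " + observeWaiterState());
    }
}
```

The waiter can make no progress until the holder leaves the synchronized block, which is why one slow query inside a shared lock stalls every request thread that needs that lock.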



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
