ambari-issues mailing list archives

From "Suraj Naik (Jira)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-25613) Concurrent Host Modification exception while sending INSTALL/START Host request
Date Sun, 31 Jan 2021 11:18:00 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-25613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suraj Naik updated AMBARI-25613:
--------------------------------
    Description: 
 
{code:java}
java.lang.RuntimeException: START Host request submission failed: java.lang.RuntimeException: Update Host request submission failed: java.util.ConcurrentModificationException
at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:497)
at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Update Host request submission failed: java.util.ConcurrentModificationException
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:865)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:852)
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.invokeWithRetry(AbstractResourceProvider.java:465)
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.modifyResources(AbstractResourceProvider.java:346)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.doUpdateResources(HostComponentResourceProvider.java:852)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.start(HostComponentResourceProvider.java:492)
at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:494)
at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ConcurrentModificationException: NA
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
at java.util.HashMap$EntryIterator.next(HashMap.java:1479)
at java.util.HashMap$EntryIterator.next(HashMap.java:1477)
at java.util.HashMap.putMapEntries(HashMap.java:512)
at java.util.HashMap.<init>(HashMap.java:490)
at org.apache.ambari.server.topology.HostRequest.getPhysicalTaskMapping(HostRequest.java:458)
at org.apache.ambari.server.topology.LogicalRequest.getStageSummaries(LogicalRequest.java:286)
at org.apache.ambari.server.topology.TopologyManager.getPendingHostComponents(TopologyManager.java:823)
at org.apache.ambari.server.utils.StageUtils.getClusterHostInfo(StageUtils.java:306)
at org.apache.ambari.server.controller.AmbariManagementControllerImpl.doStageCreation(AmbariManagementControllerImpl.java:2788)
at org.apache.ambari.server.controller.AmbariManagementControllerImpl.addStages(AmbariManagementControllerImpl.java:3513)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.updateHostComponents(HostComponentResourceProvider.java:707)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:857)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$4.invoke(HostComponentResourceProvider.java:852)
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.invokeWithRetry(AbstractResourceProvider.java:465)
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.modifyResources(AbstractResourceProvider.java:346)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.doUpdateResources(HostComponentResourceProvider.java:852)
at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.start(HostComponentResourceProvider.java:492)
at org.apache.ambari.server.topology.AmbariContext.startHost(AmbariContext.java:494)
at org.apache.ambari.server.topology.ClusterTopologyImpl.startHost(ClusterTopologyImpl.java:268)
at org.apache.ambari.server.topology.tasks.StartHostTask.runTask(StartHostTask.java:51)
at org.apache.ambari.server.topology.tasks.TopologyHostTask.run(TopologyHostTask.java:55)
at org.apache.ambari.server.topology.HostOfferResponse$1.run(HostOfferResponse.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
 
My teammate [~ramkrishna] analyzed this issue by adding logs and latches. The analysis showed
that although installation and registration run in parallel, each thread tries to get the entire
cluster’s view of the current physical tasks. As a result, while one thread is registering a host,
another thread can call getPhysicalTaskMapping() and copy the task map while it is being modified,
which leads to the ConcurrentModificationException (CME).
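
The mechanism behind the trace can be illustrated with a minimal, self-contained sketch. This is not Ambari code and not necessarily the fix applied for this issue: the class TaskMappingSketch, its method names, and the Long-to-Long map are hypothetical stand-ins for the map copied in getPhysicalTaskMapping(). The first method shows the fail-fast check that fires in HashMap$HashIterator.nextNode, which the HashMap copy constructor also goes through when it iterates its source. The second shows one possible mitigation, assuming the shared map could be a ConcurrentHashMap, whose iterators are weakly consistent and do not throw ConcurrentModificationException.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; none of these names exist in Ambari. */
final class TaskMappingSketch {

    /**
     * Structurally modifies a plain HashMap while iterating it. HashMap
     * iterators are fail-fast, so the iteration step that follows the
     * insert throws. Returns true if the exception was observed.
     */
    static boolean hashMapIterationFailsFast() {
        Map<Long, Long> tasks = new HashMap<>();
        tasks.put(1L, 101L);
        tasks.put(2L, 102L);
        tasks.put(3L, 103L);
        try {
            for (Map.Entry<Long, Long> e : tasks.entrySet()) {
                // Stands in for another thread adding a task mid-iteration.
                tasks.put(100L + e.getKey(), 0L);
            }
        } catch (ConcurrentModificationException cme) {
            return true;
        }
        return false;
    }

    /**
     * A writer thread fills a ConcurrentHashMap while the calling thread
     * repeatedly takes HashMap snapshots of it. The snapshots may miss
     * in-flight entries, but copying never throws. Returns the size of a
     * final snapshot taken after the writer has finished.
     */
    static int snapshotSizeUnderConcurrentWrites() {
        Map<Long, Long> tasks = new ConcurrentHashMap<>();
        Thread writer = new Thread(() -> {
            for (long i = 0; i < 10_000; i++) {
                tasks.put(i, i);
            }
        });
        writer.start();
        while (writer.isAlive()) {
            // Same copy pattern as in the trace, but over a concurrent map.
            new HashMap<>(tasks);
        }
        try {
            writer.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writer", ie);
        }
        return new HashMap<>(tasks).size();
    }

    public static void main(String[] args) {
        System.out.println("HashMap fail-fast observed: " + hashMapIterationFailsFast());
        System.out.println("Final snapshot size: " + snapshotSizeUnderConcurrentWrites());
    }
}
```

Guarding both the writes and the copy with a common lock would be another option; which approach suits HostRequest depends on how the map is used elsewhere, which this sketch does not cover.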


> Concurrent Host Modification exception while sending INSTALL/START Host request
> -------------------------------------------------------------------------------
>
>                 Key: AMBARI-25613
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25613
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.7.6
>            Reporter: Suraj Naik
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
