Date: Wed, 13 Jan 2016 10:43:39 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@brooklyn.apache.org
Reply-To: dev@brooklyn.apache.org
Subject: [jira] [Commented] (BROOKLYN-214) OutOfMemoryError (too many threads): repeated calls to AttributeWhenReady
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/BROOKLYN-214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095997#comment-15095997 ]

ASF GitHub Bot commented on BROOKLYN-214:
-----------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-brooklyn/pull/1137

> OutOfMemoryError (too many threads): repeated calls to AttributeWhenReady
> -------------------------------------------------------------------------
>
>           Key: BROOKLYN-214
>           URL: https://issues.apache.org/jira/browse/BROOKLYN-214
>       Project: Brooklyn
>    Issue Type: Bug
>      Reporter: Aled Sage
>
> When launching Clocker, an {{OutOfMemoryError}} was encountered due to too many threads. The underlying cause is repeated task execution for {{AttributeWhenReady}}, where each task blocks a thread.
>
> The exception encountered was:
> {noformat}
> 2016-01-11 16:36:32,460 DEBUG o.a.b.u.c.t.BasicExecutionManager [brooklyn-execmanager-vzwdtuv4-5490]: Exception running task Task[machine.loadAverage @ h2jAHTjo <- ssh[uptime->machine.loadAverage]:LBUslVfG] (rethrowing): unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> {noformat}
>
> Shortly before the OOME, the resource usage was:
> {noformat}
> 2016-01-11 16:36:26,884 DEBUG o.a.b.c.m.i.BrooklynGarbageCollector [brooklyn-gc]: brooklyn gc (after) - using 202 MB / 310 MB memory (122 kB soft); 1987 threads; storage: {datagrid={size=7, createCount=7}, refsMapSize=0, listsMapSize=0}; tasks: 1835 active, 1040 unfinished; 1425 remembered, 169790 total submitted)
> {noformat}
>
> Looking at a thread dump, there are 977 threads waiting for a lock on {{org.apache.brooklyn.camp.brooklyn.spi.dsl.methods.DslComponent$AttributeWhenReady}}, e.g.
> {noformat} > "brooklyn-execmanager-vzwdtuv4-1859" #57280 daemon prio=5 os_prio=31 tid=0x00007fa0baef0000 nid=0xf307 waiting for monitor entry [0x0000700009780000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.apache.brooklyn.camp.brooklyn.spi.dsl.BrooklynDslDeferredSupplier.get(BrooklynDslDeferredSupplier.java:93) > - waiting to lock <0x0000000784bc2828> (a org.apache.brooklyn.camp.brooklyn.spi.dsl.methods.DslComponent$AttributeWhenReady) > at org.apache.brooklyn.util.core.task.ValueResolver$2.call(ValueResolver.java:322) > at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:342) > at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:493) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The one thread holding that lock is doing: > {noformat} > "brooklyn-execmanager-vzwdtuv4-1864" #57290 daemon prio=5 os_prio=31 tid=0x00007fa0bbc19800 nid=0x76e7 waiting on condition [0x00007000061e1000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00000007851a4cb8> (a java.util.concurrent.FutureTask) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) > at java.util.concurrent.FutureTask.get(FutureTask.java:191) > at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63) > at org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:342) > at org.apache.brooklyn.camp.brooklyn.spi.dsl.BrooklynDslDeferredSupplier.get(BrooklynDslDeferredSupplier.java:105) > - locked <0x0000000784bc2828> (a org.apache.brooklyn.camp.brooklyn.spi.dsl.methods.DslComponent$AttributeWhenReady) > at org.apache.brooklyn.util.core.task.ValueResolver$2.call(ValueResolver.java:322) > at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:342) > at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:493) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > Looking at the caller of {{org.apache.brooklyn.util.core.task.ValueResolver$2.call(ValueResolver.java:322)}}, it's interesting to see that there are only two instances of that. This tells us that the other calls (to {{ValueResolver.getMaybeInternal()}}) must all have had a short timeout. Inside getMaybeInternal(), it waits for the given timeout for the resolved value, and then calls {{task.cancel(true)}} before returning. > Given that the tasks' threads are waiting for a {{synchronized}} lock, they cannot be interrupted. One part of the fix is to change the implementation of {{BrooklynDslDeferredSupplier.get(BrooklynDslDeferredSupplier.java:93)}} to use a java.util.concurrent.lock that can be interrupted. However, it still feels unsafe (there could be other code that uses Java's {{synchronized}}). > Looking at where this {{ValueResolver.timeout(Duration)}} is set could tell us where these 977ish calls came from. 
> Looking at where this {{ValueResolver.timeout(Duration)}} is set could tell us where these roughly 977 calls came from. One place is the REST API, in {{RestValueResolver.getImmediateValue}}: if the web console were polling for the entity's config, that could explain it. Another place is the {{org.apache.brooklyn.enricher.stock.Transformer}} enricher.
>
> This was encountered with 0.9.0-SNAPSHOT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)