brooklyn-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BROOKLYN-313) Exception stacktrace never logged when resize effector fails, called by AutoScaler policy
Date Mon, 11 Jul 2016 12:16:11 GMT

    [ https://issues.apache.org/jira/browse/BROOKLYN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370628#comment-15370628
] 

ASF GitHub Bot commented on BROOKLYN-313:
-----------------------------------------

Github user bostko commented on the issue:

    https://github.com/apache/brooklyn-server/pull/245
  
    LGTM


> Exception stacktrace never logged when resize effector fails, called by AutoScaler policy
> -----------------------------------------------------------------------------------------
>
>                 Key: BROOKLYN-313
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-313
>             Project: Brooklyn
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Aled Sage
>
> A customer was using the ControlledDynamicWebAppCluster, with a AutoScalerPolicy. However,
they also had an EntitlementManager that throws a NullPointerException whenever it is called
with {{context==null || context.user==null}} (which happens for automated calls not triggered
explicitly by a user, such as for policies).
> The underlying cause of the NullPointerException was never logged. All that made it into
the logs was something like:
> {noformat}
> 2016-07-09 16:07:57,491 DEBUG o.a.b.util.core.task.BasicTask [brooklyn-autoscalerpolicy-0]:
call from Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 'Task[Auto-scaler]@DHtSIGZf'
finishes, ended with error: java.util.concurrent.ExecutionException: org.apache.brooklyn.util.exceptions.PropagatedRuntimeException:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> {noformat}
> We should *always* log (at least at debug) the underlying cause of an exception, even
if it risks logging it multiple times.
> I've reproduced this in a test case. The thread stacktraces are shown below:
> The thread below is about to throw a NullPointerException. The logging from this thread
is also shown. The underlying exception is recorded in the task (so a subsequent task.get()
would throw it):
> {noformat}
> Daemon Thread [brooklyn-execmanager-cU8EcGHT-39] (Suspended (breakpoint at line 64 in
EffectorExceptionLoggingTest$ThrowingEntitlementManager))	
> 	EffectorExceptionLoggingTest$ThrowingEntitlementManager.isEntitled(EntitlementContext,
EntitlementClass<T>, T) line: 64	
> 	Entitlements.isEntitled(EntitlementManager, EntitlementClass<T>, T) line: 409

> 	Entitlements.checkEntitled(EntitlementManager, EntitlementClass<T>, T) line: 414

> 	EntityManagementSupport$EntityChangeListenerImpl.onEffectorStarting(Effector<?>,
Object) line: 469	
> 	EffectorUtils.invokeMethodEffector(Entity, Effector<T>, Map<String,?>) line:
274	
> 	MethodEffector<T>.call(Entity, Map) line: 148	
> 	MethodEffector<T>(AbstractEffector<T>).call(Map, Entity) line: 61	
> 	AbstractEffector$1$1.call() line: 83	
> 	DynamicSequentialTask$DstJob.call() line: 359	
> 	BasicExecutionManager$SubmissionCallable<T>.call() line: 519	
> 	FutureTask<V>.run() line: 262	
> 	ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145	
> 	ThreadPoolExecutor$Worker.run() line: 615	
> 	Thread.run() line: 745	
> 2016-07-09 16:33:47,333 WARN  o.a.b.c.m.i.EffectorUtils [brooklyn-execmanager-cU8EcGHT-39]:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> 2016-07-09 16:33:51,213 DEBUG o.a.b.u.c.t.BasicExecutionManager [brooklyn-execmanager-cU8EcGHT-39]:
Exception running task Task[resize]@ZbLY4WUg (rethrowing): org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> 2016-07-09 16:33:51,214 DEBUG o.a.b.u.core.task.CompoundTask [brooklyn-execmanager-cU8EcGHT-36]:
Secondary job queue for Task[Auto-scaler]@pOJ7r6Y0 ignoring error in inessential task Task[resize]@ZbLY4WUg:
java.util.concurrent.ExecutionException: org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> {noformat}
> The thread below invoked the effector - note that the AutoScalerPolicy has wrapped the
invocation in a task. The logging from this thread is also shown. Within that task, it will
not catch the exception:
> {noformat}
> "brooklyn-execmanager-cU8EcGHT-0" daemon prio=5 tid=0x00007fcfe5b5f800 nid=0x6003 waiting
on condition [0x00007000018de000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000007f93d7ab0> (a java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:187)
>         at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
>         at org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
>         at org.apache.brooklyn.util.core.task.BasicTask.getUnchecked(BasicTask.java:370)
>         at org.apache.brooklyn.util.core.task.DynamicTasks$TaskQueueingResult.andWaitForSuccess(DynamicTasks.java:160)
>         at org.apache.brooklyn.core.objs.proxy.EntityProxyImpl.invoke(EntityProxyImpl.java:211)
>         at com.sun.proxy.$Proxy19.resize(Unknown Source)
>         at org.apache.brooklyn.entity.webapp.ControlledDynamicWebAppClusterImpl.resize(ControlledDynamicWebAppClusterImpl.java:297)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$3.resize(AutoScalerPolicy.java:364)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1054)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1)
>         at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
>         at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:36:54,385 DEBUG o.a.b.u.c.t.BasicExecutionManager [brooklyn-execmanager-cU8EcGHT-0]:
Exception running task Task[Auto-scaler]@pOJ7r6Y0 (rethrowing): org.apache.brooklyn.util.exceptions.PropagatedRuntimeException:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> {noformat}
> The thread below invoked the task. It has called {{task.blockUntilEnded}} to wait for
the result:
> {noformat}
> "brooklyn-autoscalerpolicy-0" prio=5 tid=0x00007fcfe5b49800 nid=0x1407 waiting on condition
[0x0000700002b14000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000007f81a07b0> (a java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:187)
>         at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
>         at org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:417)
>         at org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:407)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.resizeNow(AutoScalerPolicy.java:1066)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.access$15(AutoScalerPolicy.java:1021)
>         at org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$13.run(AutoScalerPolicy.java:952)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:38:45,089 DEBUG o.a.b.util.core.task.BasicTask [brooklyn-autoscalerpolicy-0]:
call from Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 'Task[Auto-scaler]@pOJ7r6Y0'
finishes, ended with error: java.util.concurrent.ExecutionException: org.apache.brooklyn.util.exceptions.PropagatedRuntimeException:
Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> {noformat}
> ---
> Note we'd have had the same problem if the cluster.resize() effector itself had thrown
the exception (rather than the entitlements). We'd have only got logging like:
> {noformat}
> 2016-07-09 16:43:52,293 DEBUG o.a.b.u.c.t.BasicExecutionManager [brooklyn-execmanager-NpLl6jt3-1]:
Exception running task Task[Auto-scaler]@uLBsngUp (rethrowing): java.lang.NullPointerException
> 2016-07-09 16:43:52,294 DEBUG o.a.b.util.core.task.BasicTask [brooklyn-autoscalerpolicy-0]:
call from Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 'Task[Auto-scaler]@uLBsngUp'
finishes, ended with error: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> {noformat}
> This lack of logging is hard to reproduce in a simpler test case where effectors are
invoked directly. It is primarily because of the additional layers of task, and the use of
{{basicTask.blockUntilEnded}}, that we lose the message.
> However, I have in the past seen something similar when there is an exception in an effector
invoked from the web-console: the web-console's activity view showed the full exception stacktrace,
but the debug log did not have it. This is very bad for when a customer sends us a debug log!
> ---
> It's unclear what the holistic fix for this should be. 
> We do not want to require the caller of the effector to always wrap the call in a try-catch
with logging: the failure of an effector should get logged automatically, and should include
the underlying cause. Any other option risks us being sent customer log files that give no
indication of what caused the problem.
> ---
> We can hopefully log any effector invocation error in the same way. Alternatively we
could treat framework exceptions (e.g. NPE preparing to execute) as different from exceptions
during the actual effector implementation. 
> I don't want to do it in low-level code like {{BasicTask.blockUntilEnded}}, because that
will probably result in a huge amount of duplicate logging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message