ambari-dev mailing list archives

From "jun aoki" <jun.aoki....@gmail.com>
Subject Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected:<ABORTED> but was:<PENDING>
Date Thu, 09 Oct 2014 22:04:32 GMT


> On Oct. 9, 2014, 7:26 p.m., Sid Wagle wrote:
> > This test does something a unit test should not do: wait on an asynchronous operation.
> > How about, instead of starting the ActionScheduler thread with "scheduler.start()" and waiting
> > for the thread to run, calling doWork(), which is called by the run() method?

Makes sense, Sid.
One thing I noticed from the Jenkins failure [1] is that doWork() is called a few times
and throws NullPointerExceptions while ActionScheduler is running, and the stage eventually
goes into the expected state.
(I believe this is the expected behavior, since it is a negative test case.)
So I added a while loop along with the call to doWork().
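
To make the idea concrete, here is a rough, self-contained sketch of driving doWork() from the test thread in a bounded loop. It is not the actual patch: FakeScheduler, Status, and driveUntil are hypothetical stand-ins invented for illustration, and the fake simply throws on its first three passes to imitate the pattern in the log below.

```java
import java.util.function.BooleanSupplier;

public class DoWorkLoopSketch {
    // Hypothetical subset of statuses, for illustration only.
    enum Status { PENDING, QUEUED, ABORTED }

    /** Hypothetical stand-in for ActionScheduler; NOT the real Ambari class. */
    static class FakeScheduler {
        Status status = Status.PENDING;
        int calls = 0;

        // Imitates the log: the first three passes fail, the fourth aborts the request.
        void doWork() {
            calls++;
            if (calls < 4) {
                status = Status.QUEUED;
                throw new NullPointerException("simulated failure in pass " + calls);
            }
            status = Status.ABORTED;
        }
    }

    /**
     * Calls doWork() synchronously until the condition holds or maxPasses is used up.
     * Runtime exceptions are swallowed, as an assumption mirroring how run() logs
     * "Exception received" and carries on with the next wakeup.
     */
    static boolean driveUntil(FakeScheduler scheduler, BooleanSupplier done, int maxPasses) {
        for (int i = 0; i < maxPasses && !done.getAsBoolean(); i++) {
            try {
                scheduler.doWork();
            } catch (RuntimeException e) {
                // Ignored on purpose; the next pass may still reach the expected state.
            }
        }
        return done.getAsBoolean();
    }

    public static void main(String[] args) {
        FakeScheduler scheduler = new FakeScheduler();
        boolean reached = driveUntil(scheduler, () -> scheduler.status == Status.ABORTED, 10);
        System.out.println("reached=" + reached + " after " + scheduler.calls + " passes");
    }
}
```

The upper bound on passes is there so a regression fails the test quickly instead of hanging the build, and nothing depends on thread timing because every pass runs on the test thread.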


[1]
https://builds.apache.org/view/A-D/view/Ambari/job/Ambari-trunk-Commit/526/testReport/junit/org.apache.ambari.server.actionmanager/TestActionScheduler/testOpFailedEventRaisedForAbortedHostRole/


2014-10-09 17:36:49,198 INFO  [main] configuration.Configuration (Configuration.java:<init>(398))
- Reading password from existing file
2014-10-09 17:36:49,700 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208))
- Scheduler wakes up
2014-10-09 17:36:49,702 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209))
- Processing 1 in progress stages 
2014-10-09 17:36:49,702 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229))
- ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,703 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509))
- ==> Collecting commands to schedule...
2014-10-09 17:36:49,705 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615))
- Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,706 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:DATANODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,706 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,777 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368))
- Scheduler finished work.
2014-10-09 17:36:49,778 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189))
- Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:49,880 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208))
- Scheduler wakes up
2014-10-09 17:36:49,880 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209))
- Processing 1 in progress stages 
2014-10-09 17:36:49,881 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229))
- ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,881 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509))
- ==> Collecting commands to schedule...
2014-10-09 17:36:49,882 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722))
- Timing out action since agent is not heartbeating.
2014-10-09 17:36:49,883 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587))
- Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:49,886 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615))
- Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,887 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:DATANODE, stats=numQueued=1, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,887 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,888 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368))
- Scheduler finished work.
2014-10-09 17:36:49,888 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189))
- Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:49,989 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208))
- Scheduler wakes up
2014-10-09 17:36:49,989 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209))
- Processing 1 in progress stages 
2014-10-09 17:36:49,990 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229))
- ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,990 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509))
- ==> Collecting commands to schedule...
2014-10-09 17:36:49,991 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722))
- Timing out action since agent is not heartbeating.
2014-10-09 17:36:49,991 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587))
- Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615))
- Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:DATANODE, stats=numQueued=1, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368))
- Scheduler finished work.
2014-10-09 17:36:49,993 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189))
- Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:50,093 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208))
- Scheduler wakes up
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209))
- Processing 1 in progress stages 
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229))
- ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509))
- ==> Collecting commands to schedule...
2014-10-09 17:36:50,095 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722))
- Timing out action since agent is not heartbeating.
2014-10-09 17:36:50,096 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587))
- Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:50,096 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(589))
- Host:host1, role:DATANODE, actionId:1--1 expired
2014-10-09 17:36:50,097 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(599))
- Removing command from queue, host=host1, commandId=1--1 
2014-10-09 17:36:50,098 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615))
- Collected 1 commands to schedule in this wakeup.
2014-10-09 17:36:50,098 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261))
- Stats for role:DATANODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0,
numTimedOut=1, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:50,099 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(275))
- Operation completely failed, aborting request id:1
2014-10-09 17:36:50,101 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368))
- Scheduler finished work.


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56035
-----------------------------------------------------------


On Oct. 9, 2014, 6:34 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 6:34 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>

