hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16874) Potential NPE from ProcedureExecutor#stop()
Date Tue, 18 Oct 2016 21:42:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586746#comment-15586746 ]

Matteo Bertozzi commented on HBASE-16874:
-----------------------------------------

It looks like this test forgot to set the number of executor threads to 1.
When we inject failures with setToggleKillBeforeStoreUpdate(), we assume that
there is only one executor thread. In this case multiple executor threads are running,
each toggling the flag and killing the executor while the test is restarting it on the other side.
{noformat}
2016-10-18 18:55:30,857 INFO  [ProcedureExecutorWorker-5] procedure.ServerCrashProcedure(204): Start processing crashed priapus.apache.org,38407,1476816438880
2016-10-18 18:55:30,857 WARN  [ProcedureExecutorWorker-5] procedure2.ProcedureExecutor$Testing(92): Toggle Kill before store update to: true
Exception in thread "ProcedureExecutorWorker-5" java.lang.RuntimeException: the store must be running before inserting data
	at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:542)
{noformat}
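
To illustrate why the single-executor assumption matters, here is a toy model, not the actual ProcedureExecutor code. The names ToggleKillSketch, ToggleKill and killedWorkers are made up for this sketch, and the flip-on-every-call behavior of the flag is an assumption inferred from the "Toggle Kill before store update to: true" log line above. It replays a fixed schedule of worker ids against one shared flag and reports which workers were told to kill the executor.

```java
import java.util.ArrayList;
import java.util.List;

public class ToggleKillSketch {
    /** Toy stand-in for a shared "toggle kill" test hook; hypothetical, not HBase code. */
    static final class ToggleKill {
        private boolean kill = false;

        // Assumed behavior: each call flips the flag and returns the new value.
        synchronized boolean shouldKillBeforeStoreUpdate() {
            kill = !kill;
            return kill;
        }
    }

    /**
     * Replays a fixed schedule of worker ids against one shared flag and
     * returns the ids of the workers that were told to kill the executor.
     */
    public static List<Integer> killedWorkers(int[] schedule) {
        ToggleKill hook = new ToggleKill();
        List<Integer> killed = new ArrayList<>();
        for (int worker : schedule) {
            if (hook.shouldKillBeforeStoreUpdate()) {
                killed.add(worker);
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        // One worker: its calls alternate kill, proceed, kill, proceed.
        System.out.println(killedWorkers(new int[] {1, 1, 1, 1})); // [1, 1]
        // Two workers sharing the flag: who is killed depends on the interleaving.
        System.out.println(killedWorkers(new int[] {1, 2, 1, 2})); // [1, 1]
        System.out.println(killedWorkers(new int[] {1, 2, 2, 1})); // [1, 2]
    }
}
```

With one worker the kill points are predictable, so the test knows when to restart. With several workers, the thread that trips the kill depends on scheduling, and in the real test it can do so while the restart is already under way.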

> Potential NPE from ProcedureExecutor#stop()
> -------------------------------------------
>
>                 Key: HBASE-16874
>                 URL: https://issues.apache.org/jira/browse/HBASE-16874
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>         Attachments: 16874.v1.txt, HBASE-16874-v0.patch
>
>
> When examining the failed test:
> https://builds.apache.org/job/HBase-TRUNK_matrix/lastCompletedBuild/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/testReport/org.apache.hadoop.hbase.master.procedure/TestMasterFailoverWithProcedures/org_apache_hadoop_hbase_master_procedure_TestMasterFailoverWithProcedures/
> I noticed the following:
> {code}
> 2016-10-18 18:47:39,313 INFO  [Time-limited test] procedure.TestMasterFailoverWithProcedures(306): Restart 2 exec state: TRUNCATE_TABLE_CLEAR_FS_LAYOUT
> Exception in thread "ProcedureExecutorWorker-1" java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.stop(ProcedureExecutor.java:533)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1197)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:959)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$700(ProcedureExecutor.java:73)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1405)
> {code}
> This seems to be the result of a race between the stop() and join() methods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)