phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock
Date Thu, 31 Aug 2017 07:18:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148556#comment-16148556
] 

Hadoop QA commented on PHOENIX-4131:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12884597/PHOENIX-4131.patch
  against master branch at commit 7d8b8430212fae117ac09faf6b7c22bf673e9073.
  ATTACHMENT ID: 12884597

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified
tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 67 warning messages.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.KeyOnlyIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.MultiCfQueryExecIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.QueryWithTableSampleIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ParallelIteratorsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificTablesDDLIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SaltedViewIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.coprocessor.StatisticsCollectionRunTrackerIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ExplainPlanWithStatsEnabledIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//testReport/
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//console

This message is automatically generated.

> UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock
> --------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4131
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4131
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-4131.patch
>
>
> On my local test run I saw that the tests were not completing because the mini cluster
couldn't shut down. So I took a jstack and discovered the following deadlock:
> {code}
> "RS:0;samarthjai-wsm4:59006" #16265 prio=5 os_prio=31 tid=0x00007fafa6327000 nid=0x37b3f
runnable [0x00007000115f5000]
>    java.lang.Thread.State: RUNNABLE
> 	at java.lang.Object.wait(Native Method)
> 	at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.preClose(UngroupedAggregateRegionObserver.java:1201)
> 	- locked <0x000000072bc406b8> (a java.lang.Object)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.call(RegionCoprocessorHost.java:494)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1749)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preClose(RegionCoprocessorHost.java:490)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2843)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegionIgnoreErrors(HRegionServer.java:2805)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeUserRegions(HRegionServer.java:2423)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1052)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:157)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:110)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:141)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:360)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
> 	at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:139)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> {code}
> "RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=59006" #16246 daemon prio=5 os_prio=31
tid=0x00007fafae856000 nid=0x1abdb waiting for monitor entry [0x00007000102bc000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
> 	at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.doPostScannerOpen(UngroupedAggregateRegionObserver.java:734)
> 	- waiting to lock <0x000000072bc406b8> (a java.lang.Object)
> 	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.overrideDelegate(BaseScannerRegionObserver.java:236)
> 	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:281)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2629)
> 	- locked <0x000000072b625a90> (a org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2833)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34950)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> {code}
> preClose() has the object monitor and is waiting for scanReferencesCount to go down to
0. doPostScannerOpen() is trying to acquire the same lock so that it can reduce the scanReferencesCount
to 0.
> I think this bug was introduced in PHOENIX-3111 to solve other deadlocks. FYI, [~rajeshbabu],
[~sergey.soldatov], [~enis], [~lhofhansl].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message