hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14241) Fix deadlock during cluster shutdown due to concurrent connection close
Date Wed, 19 Aug 2015 15:21:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703173#comment-14703173 ]

Hadoop QA commented on HBASE-14241:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12751257/14241-v5.txt
  against master branch at commit 1bb9e3ae966853db8c4138b4abbe14636d7592db.
  ATTACHMENT ID: 12751257

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of protoc compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the total number of checkstyle errors.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

    {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

    {color:red}-1 core tests{color}.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.regionserver.TestSCVFWithMiniCluster
                  org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2
                  org.apache.hadoop.hbase.TestAcidGuarantees
                  org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat
                  org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1
                  org.apache.hadoop.hbase.TestMetaTableAccessor
                  org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay
                  org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
                  org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
                  org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper
                  org.apache.hadoop.hbase.TestFullLogReconstruction
                  org.apache.hadoop.hbase.regionserver.TestRowTooBig
                  org.apache.hadoop.hbase.mapreduce.TestImportTSVWithVisibilityLabels
                  org.apache.hadoop.hbase.mapreduce.TestCellCounter
                  org.apache.hadoop.hbase.mob.compactions.TestMobCompactor
                  org.apache.hadoop.hbase.mapreduce.TestImportTSVWithTTLs
                  org.apache.hadoop.hbase.mapreduce.TestWALRecordReader
                  org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL
                  org.apache.hadoop.hbase.mapreduce.TestWALPlayer
                  org.apache.hadoop.hbase.mapreduce.TestImportExport
                  org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
                  org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles
                  org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint
                  org.apache.hadoop.hbase.mapreduce.TestSyncTable
                  org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
                  org.apache.hadoop.hbase.regionserver.TestSplitWalDataLoss
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles
                  org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
                  org.apache.hadoop.hbase.mapreduce.TestImportTSVWithOperationAttributes
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
                  org.apache.hadoop.hbase.mapreduce.TestHashTable
                  org.apache.hadoop.hbase.mapreduce.TestHRegionPartitioner
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv

    {color:red}-1 core zombie tests{color}.  There are 12 zombie test(s):
                  at org.apache.hadoop.hbase.TestAcidGuarantees.testGetAtomicity(TestAcidGuarantees.java:362)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15167//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15167//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15167//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15167//console

This message is automatically generated.

> Fix deadlock during cluster shutdown due to concurrent connection close
> -----------------------------------------------------------------------
>
>                 Key: HBASE-14241
>                 URL: https://issues.apache.org/jira/browse/HBASE-14241
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.0.2
>            Reporter: Andrew Purtell
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 2.0.0, 1.0.2, 1.2.0, 1.3.0, 1.1.3
>
>         Attachments: 14241-v2.txt, 14241-v3.txt, 14241-v4.txt, 14241-v5.txt, deadlock.txt.gz
>
>
> Caught while testing branch-1.0, shutting down TestMasterMetricsWrapper.
> Found one Java-level deadlock:
> =============================
> "MASTER_META_SERVER_OPERATIONS-ip-10-32-130-237:55342-0":
>   waiting to lock monitor 0x00007f2a040051c8 (object 0x00000007e36108a8, a org.apache.hadoop.hbase.util.PoolMap),
>   which is held by "M:0;ip-10-32-130-237:55342"
> "M:0;ip-10-32-130-237:55342":
>   waiting to lock monitor 0x00007f2a04005118 (object 0x00000007e3610b00, a org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection),
>   which is held by "MASTER_META_SERVER_OPERATIONS-ip-10-32-130-237:55342-0"
> Full stack dump and deadlock debug output attached.
> Root cause:
> In RpcClientImpl#close(), we obtain lock on connections first:
> {code}
>     synchronized (connections) {
>       for (Connection conn : connections.values()) {
> {code}
> Then markClosed() tries to obtain lock on connection object:
> {code}
>         if (!conn.isAlive()) {
>           conn.markClosed(new InterruptedIOException("RpcClient is closing"));
>           conn.close();
> {code}
> Another thread, MetaServerShutdownHandler, calls RpcClientImpl$Connection#setupIOstreams(), where:
> {code}
>         markClosed(e);
>         close();
> {code}
> Lock on connection object is obtained first, then lock on connections is attempted, leading to deadlock:
> {code}
>       synchronized (connections) {
>         connections.removeValue(remoteId, this);
>       }
> {code}
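
The lock-order cycle described above (one thread takes the pool monitor and then a connection monitor, while another takes the same two in reverse) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the contents of 14241-v5.txt: LockOrderSketch, Conn, open() and closeAll() are hypothetical stand-ins for RpcClientImpl, RpcClientImpl$Connection and their methods, and a plain HashMap stands in for PoolMap. Snapshotting the connections under the pool lock and closing them after releasing it is one common way to avoid the cycle, because no thread then holds the pool monitor while waiting for a connection monitor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for RpcClientImpl; all names are illustrative. */
final class LockOrderSketch {

    /** Hypothetical stand-in for RpcClientImpl$Connection. */
    static final class Conn {
        private final Map<String, Conn> pool;
        private final String id;
        private boolean closed;

        Conn(Map<String, Conn> pool, String id) {
            this.pool = pool;
            this.id = id;
        }

        // Lock order on this path: connection monitor first, then pool monitor.
        synchronized void close() {
            closed = true;
            synchronized (pool) {
                pool.remove(id, this);
            }
        }

        synchronized boolean isClosed() {
            return closed;
        }
    }

    // Assumption: a HashMap guarded by its own monitor, in place of PoolMap.
    private final Map<String, Conn> connections = new HashMap<>();

    Conn open(String id) {
        synchronized (connections) {
            return connections.computeIfAbsent(id, k -> new Conn(connections, k));
        }
    }

    int size() {
        synchronized (connections) {
            return connections.size();
        }
    }

    // Copy the values while holding the pool lock, then release it before
    // touching any connection. The pool monitor is never held while a
    // connection monitor is requested, so the reverse order cannot occur here.
    void closeAll() {
        List<Conn> snapshot;
        synchronized (connections) {
            snapshot = new ArrayList<>(connections.values());
        }
        for (Conn c : snapshot) {
            c.close();
        }
    }

    public static void main(String[] args) {
        LockOrderSketch client = new LockOrderSketch();
        client.open("rs1");
        client.open("rs2");
        client.closeAll();
        System.out.println("remaining connections: " + client.size());
    }
}
```

The trade-off in this sketch is that a connection added to the pool after the snapshot is taken is not closed by that closeAll() call; a real client would need a separate flag or check to reject new connections once shutdown has begun.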



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
