hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9294) DFSClient deadlock when close file and failed to renew lease
Date Thu, 03 Dec 2015 00:33:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036953#comment-15036953 ]

Hadoop QA commented on HDFS-9294:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_85. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 29s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775431/HDFS-9294-002.patch |
| JIRA Issue | HDFS-9294 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 09ffb4893e70 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b9a5be |
| findbugs | v3.0.0 |
| JDK v1.7.0_85  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13742/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client |
| Max memory used | 76MB |
| Powered by | Apache Yetus   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13742/console |


This message was automatically generated.



> DFSClient  deadlock when close file and failed to renew lease
> -------------------------------------------------------------
>
>                 Key: HDFS-9294
>                 URL: https://issues.apache.org/jira/browse/HDFS-9294
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.2.0, 2.7.1
>         Environment: Hadoop 2.2.0
>            Reporter: 邓飞
>            Assignee: Brahma Reddy Battula
>            Priority: Blocker
>         Attachments: HDFS-9294-002.patch, HDFS-9294-002.patch, HDFS-9294-branch-2.7.patch, HDFS-9294-branch-2.patch, HDFS-9294.patch
>
>
> We found a deadlock in our HBase (0.98) cluster (the Hadoop version is 2.2.0), and it appears to be an HDFS bug. At the time, our network was unstable.
> The stack trace is below:
> *************************************************************************************************************************************
> Found one Java-level deadlock:
> =============================
> "MemStoreFlusher.1":
>   waiting to lock monitor 0x00007ff27cfa5218 (object 0x00000002fae5ebe0, a org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   waiting to lock monitor 0x00007ff2e67e16a8 (object 0x0000000486ce6620, a org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "MemStoreFlusher.0"
> "MemStoreFlusher.0":
>   waiting to lock monitor 0x00007ff27cfa5218 (object 0x00000002fae5ebe0, a org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> Java stack information for the threads listed above:
> ===================================================
> "MemStoreFlusher.1":
> 	at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
> 	- waiting to lock <0x00000002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
> 	at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
> 	at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
> 	at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
> 	- locked <0x000000055b606cb0> (a org.apache.hadoop.hdfs.DFSOutputStream)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
> 	at org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
> 	at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
> 	- locked <0x000000059869eed8> (a java.lang.Object)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:211)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:66)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:238)
> 	at java.lang.Thread.run(Thread.java:744)
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
> 	at org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:1822)
> 	- waiting to lock <0x0000000486ce6620> (a org.apache.hadoop.hdfs.DFSOutputStream)
> 	at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:780)
> 	at org.apache.hadoop.hdfs.DFSClient.abort(DFSClient.java:753)
> 	at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:453)
> 	- locked <0x00000002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
> 	at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
> 	at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
> 	at java.lang.Thread.run(Thread.java:744)
> "MemStoreFlusher.0":
> 	at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
> 	- waiting to lock <0x00000002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
> 	at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
> 	at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
> 	at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
> 	- locked <0x0000000486ce6620> (a org.apache.hadoop.hdfs.DFSOutputStream)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
> 	at org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
> 	at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
> 	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
> 	at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
> 	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
> 	- locked <0x00000004888f6848> (a java.lang.Object)
> 	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
> 	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:435)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:66)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:253)
> 	at java.lang.Thread.run(Thread.java:744)
> Found 1 deadlock. 
> **********************************************************************
> The thread "MemStoreFlusher.0" is closing its output stream and removing its lease.
> Meanwhile, the daemon thread "LeaseRenewer" failed to connect to the active NameNode to renew the lease; it got a SocketTimeoutException because the network was unstable, so it aborted the output streams.
> Each thread then waits for a lock the other holds, and the deadlock occurs.
> The issue does not appear to be fixed in Hadoop 2.7.1. If confirmed, we can fix it.
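
The stack trace above shows a lock-order inversion: the closing thread holds the DFSOutputStream monitor and waits for the LeaseRenewer monitor, while the renewer thread holds the LeaseRenewer monitor and waits for the same DFSOutputStream monitor. The following is a minimal Java sketch of that pattern, not the HDFS code and not necessarily what the attached patch does. The class LockOrderSketch, the method countBlocked, and both lock names are hypothetical stand-ins, and timed tryLock calls replace synchronized blocks so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of the lock-order inversion; not real HDFS classes. */
public class LockOrderSketch {
    static final long TIMEOUT_MS = 500;

    /** Returns how many of the two threads could not get their second lock in time. */
    static int countBlocked(boolean releaseStreamLockFirst) throws InterruptedException {
        ReentrantLock streamLock = new ReentrantLock();   // stands in for the DFSOutputStream monitor
        ReentrantLock renewerLock = new ReentrantLock();  // stands in for the LeaseRenewer monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Plays the role of close(): takes the stream lock, then needs the renewer lock.
        Thread closer = new Thread(() -> {
            streamLock.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                if (releaseStreamLockFirst) {
                    streamLock.unlock();  // assumed mitigation: do not hold both locks at once
                }
                takeSecond(renewerLock, bothTried, blocked);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                if (streamLock.isHeldByCurrentThread()) {
                    streamLock.unlock();
                }
            }
        }, "closer");

        // Plays the role of the renewer's abort path: renewer lock first, then the stream lock.
        Thread renewer = new Thread(() -> {
            renewerLock.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                takeSecond(streamLock, bothTried, blocked);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                renewerLock.unlock();
            }
        }, "renewer");

        closer.start();
        renewer.start();
        closer.join();
        renewer.join();
        return blocked.get();
    }

    /** Tries the second lock with a timeout; a thread that fails keeps its first lock until both have tried. */
    static void takeSecond(ReentrantLock second, CountDownLatch bothTried, AtomicInteger blocked)
            throws InterruptedException {
        boolean acquired = second.tryLock(TIMEOUT_MS, TimeUnit.MILLISECONDS);
        if (acquired) {
            second.unlock();
        } else {
            blocked.incrementAndGet();
        }
        bothTried.countDown();
        if (!acquired) {
            bothTried.await();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inverted order, threads blocked: " + countBlocked(false));
        System.out.println("stream lock released first, threads blocked: " + countBlocked(true));
    }
}
```

With plain synchronized blocks, as in the reported stack, the inverted case would never time out; both threads would wait forever. Releasing the first lock before requesting the second is only one possible way to break the cycle; imposing a single global lock order is another.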



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
