hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor
Date Fri, 15 Apr 2016 18:05:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243338#comment-15243338
] 

Hadoop QA commented on HDFS-10220:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color}
| {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s {color} |
{color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} |
{color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s {color} |
{color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s {color} | {color:green}
trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} |
{color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} |
{color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green}
the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s {color} | {color:red}
hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 601 unchanged - 4 fixed = 602 total
(was 605) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color}
| {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green}
The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} |
{color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s {color} | {color:green}
the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 30s {color} | {color:red}
hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 50s {color} | {color:red}
hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color}
| {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 236m 9s {color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
|   | hadoop.hdfs.server.namenode.TestNestedEncryptionZones |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.TestRenameWhileOpen |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.TestModTime |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12798961/HADOOP-10220.004.patch
|
| JIRA Issue | HDFS-10220 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  xml  |
| uname | Linux f458e5d8514f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6e6b6dd |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
|
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15170/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
|
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15170/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt
|
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15170/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
|
| unit test logs |  https://builds.apache.org/job/PreCommit-HDFS-Build/15170/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt
https://builds.apache.org/job/PreCommit-HDFS-Build/15170/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
|
| JDK v1.7.0_95  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15170/testReport/
|
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15170/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Namenode failover due to too long loking in LeaseManager.Monitor
> ----------------------------------------------------------------
>
>                 Key: HDFS-10220
>                 URL: https://issues.apache.org/jira/browse/HDFS-10220
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Nicolas Fraison
>            Assignee: Nicolas Fraison
>            Priority: Minor
>         Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, HADOOP-10220.003.patch,
HADOOP-10220.004.patch, threaddump_zkfc.txt
>
>
> I have faced a namenode failover due to unresponsive namenode detected by the zkfc with
lot's of WARN messages (5 millions) like this one:
> _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks
are COMPLETE, lease removed, file closed._
> On the threaddump taken by the zkfc there are lots of thread blocked due to a lock.
> Looking at the code, there are a lock taken by the LeaseManager.Monitor when some lease
must be released. Due to the really big number of lease to be released the namenode has taken
too many times to release them blocking all other tasks and making the zkfc thinking that
the namenode was not available/stuck.
> The idea of this patch is to limit the number of leased released each time we check for
lease so the lock won't be taken for a too long time period.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message