hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11377) Balancer hung due to no available mover threads
Date Wed, 01 Feb 2017 21:27:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848954#comment-15848954
] 

Hadoop QA commented on HDFS-11377:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} |
{color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red}
The patch doesn't appear to include any new or modified tests. Please justify why no new tests
are needed for this patch. Also please list what manual steps were performed to verify this
patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 31s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 28s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 12s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 49s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 51s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 28s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 58s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 12s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color}
| {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 58s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} |
{color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}112m 25s{color} | {color:red}
hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 56s{color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestDFSClientExcludedNodes |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
|
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11377 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12850459/HDFS-11377.002.patch
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux b53c73150f1c 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 59c5f18 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18307/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18307/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18307/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Balancer hung due to no available mover threads
> -----------------------------------------------
>
>                 Key: HDFS-11377
>                 URL: https://issues.apache.org/jira/browse/HDFS-11377
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.7.3
>            Reporter: yunjiong zhao
>            Assignee: yunjiong zhao
>         Attachments: HDFS-11377.001.patch, HDFS-11377.002.patch
>
>
> When running balancer on large cluster which have more than 3000 Datanodes, it might
be hung due to "No mover threads available".
> The stack trace shows it waiting forever like below.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x00007ff6cc014800 nid=0x6b2c waiting on condition [0x00007ff6d1bad000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.waitForMoveCompletion(Dispatcher.java:1043)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchBlockMoves(Dispatcher.java:1017)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchAndCheckContinue(Dispatcher.java:981)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.runOneIteration(Balancer.java:611)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:663)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:776)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:905)
> {code}
> In the log, there are lots of WARN about "No mover threads available".
> {quote}
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover
threads available: skip moving blk_13700554102_1112815018180 with size=268435456 from 10.115.67.137:50010:DISK
to 10.140.21.55:50010:DISK through 10.115.67.137:50010
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover
threads available: skip moving blk_4009558842_1103118359883 with size=268435456 from 10.115.67.137:50010:DISK
to 10.140.21.55:50010:DISK through 10.115.67.137:50010
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover
threads available: skip moving blk_13881956058_1112996460026 with size=133509566 from 10.115.67.137:50010:DISK
to 10.140.21.55:50010:DISK through 10.115.67.36:50010
> {quote}
> What happened here is, when there are no mover threads available, DDatanode.isPendingQEmpty()
will return false, so Balancer hung.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message