hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6485) MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run
Date Tue, 29 Sep 2015 12:11:05 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935065#comment-14935065 ]

Hadoop QA commented on MAPREDUCE-6485:
--------------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 18s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m  6s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  11m 55s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 27s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 43s | The applied patch generated 1 new checkstyle issues (total was 356, now 355). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 44s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 38s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 18s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests |  10m  5s | Tests passed in hadoop-mapreduce-client-app. |
| | |  55m 18s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764223/MAPREDUCE-6485.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d6fa34e |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6032/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-app.txt |
| whitespace | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6032/artifact/patchprocess/whitespace.txt |
| hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6032/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6032/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6032/console |


This message was automatically generated.

> MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
>            Reporter: Bob
>            Assignee: Xianyin Xin
>            Priority: Critical
>         Attachments: MAPREDUCE-6485.001.patch, MAPREDUCE-6485.004.patch, MAPREDUCE-6845.002.patch, MAPREDUCE-6845.003.patch
>
>
> The scenario is as follows:
> With mapreduce.job.reduce.slowstart.completedmaps=0.8, reducers acquire resources and start running before all maps have finished.
> It can then happen that all resources are taken up by running reducers while one map is still unfinished.
> In this situation, the last map has two task attempts.
> The first attempt was killed due to a timeout (mapreduce.task.timeout), and its state transitioned from RUNNING to FAIL_CONTAINER_CLEANUP and then to FAILED. The failed attempt is not restarted, because a speculative attempt of the same map is still in progress.
> The second attempt, which was started because map task speculation is enabled, is pending in the UNASSIGNED state because no resources are available. Since the request for this second attempt has a lower priority than the reducers, no preemption happens.
> As a result, the reducers cannot finish because one map is still outstanding, and the last map hangs because no resources are available, so the job never finishes.
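
The hang condition described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the ApplicationMaster's actual allocation code: `JobState`, `is_stuck`, and the `preempt_reduces_for_maps` flag are hypothetical names introduced here, and the model assumes reducers hold their containers until every map has finished.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    """Hypothetical snapshot of a job (not the real MRAppMaster data model)."""
    total_containers: int         # containers the job can obtain
    running_reduces: int          # reducers currently holding a container
    unassigned_map_attempts: int  # map attempts waiting in UNASSIGNED


def is_stuck(state: JobState, preempt_reduces_for_maps: bool) -> bool:
    """Return True if a pending map attempt can never get a container.

    Assumption: reducers do not release containers until all maps are done,
    so free capacity can only appear through preemption.
    """
    if state.unassigned_map_attempts == 0:
        return False  # no map is waiting, so reducers can complete
    free = state.total_containers - state.running_reduces
    if free > 0:
        return False  # at least one waiting map attempt can be scheduled
    # No free container: the map runs only if a reducer is preempted for it.
    return not preempt_reduces_for_maps


# Reducers hold all 10 containers and the speculative map attempt is waiting.
print(is_stuck(JobState(10, 10, 1), preempt_reduces_for_maps=False))
print(is_stuck(JobState(10, 10, 1), preempt_reduces_for_maps=True))
```

In this model the job hangs only when the reducers occupy every container and nothing preempts a reducer on behalf of the waiting map attempt, which is the situation the report describes.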



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
