hadoop-mapreduce-issues mailing list archives

From "ccw (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4208) The job is hanging up but never continuing until you kill the child process
Date Fri, 04 May 2012 03:09:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268065#comment-13268065 ]

ccw commented on MAPREDUCE-4208:
--------------------------------

Thank you! I found the cause of the problem. The root cause is HBase data access: the data
for the map task is not served by the local region server, so each data access is very
slow.
However, I think there should be a timeout mechanism to prevent this kind of situation. If
a task is running too slowly, Hadoop should kill it automatically rather than requiring
someone to kill it manually.
Thank you!
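
The timeout idea above can be sketched as a small watchdog that flags a task once its progress has not advanced for a configurable period. This is only an illustration under stated assumptions: the class `ProgressWatchdog`, its methods, and the 600-second value used in the usage note are hypothetical and are not Hadoop code. Hadoop 0.20 does provide a `mapred.task.timeout` setting (in milliseconds) aimed at tasks that stop reading input, writing output, or updating their status, but this thread does not establish whether it would have fired for a task that is merely slow.

```python
import time


class ProgressWatchdog:
    """Hypothetical helper: remembers when each task last advanced."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._last = {}     # task id -> (best progress seen, time it was reached)

    def report(self, task_id, progress):
        """Record a progress report; the timer resets only when progress increases."""
        now = self.clock()
        previous = self._last.get(task_id)
        if previous is None or progress > previous[0]:
            self._last[task_id] = (progress, now)

    def stalled_tasks(self):
        """Return the sorted ids of tasks whose progress has not advanced in time."""
        now = self.clock()
        return sorted(
            task_id
            for task_id, (_, changed_at) in self._last.items()
            if now - changed_at >= self.timeout_seconds
        )
```

A scheduler loop could call `report()` on every status update and periodically kill whatever `stalled_tasks()` returns, for example with `ProgressWatchdog(timeout_seconds=600)`. Repeated reports of the same percentage, like the `0.0%` lines in the TaskTracker log below, would not reset the timer in this sketch.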
                
> The job is hanging up but never continuing until you kill the child process 
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4208
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4208
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Hadoop 0.20.203.0
> Hbase 0.90.3
> Hive 0.80.1
>            Reporter: ccw
>
> I ran a Hive MapReduce query on HBase, but the job never ends.
> The job hangs and does not continue until you kill the child process.
> 2012-04-28 18:22:33,661 Stage-1 map = 0%,  reduce = 0%
> 2012-04-28 18:22:59,760 Stage-1 map = 25%,  reduce = 0%
> 2012-04-28 18:23:04,782 Stage-1 map = 38%,  reduce = 0%
> 2012-04-28 18:23:07,796 Stage-1 map = 50%,  reduce = 0%
> 2012-04-28 18:23:08,801 Stage-1 map = 50%,  reduce = 8%
> 2012-04-28 18:23:17,839 Stage-1 map = 50%,  reduce = 17%
> 2012-04-28 18:23:19,848 Stage-1 map = 63%,  reduce = 17%
> 2012-04-28 18:23:32,909 Stage-1 map = 63%,  reduce = 21%
> 2012-04-28 18:23:57,017 Stage-1 map = 75%,  reduce = 21%
> 2012-04-28 18:24:09,075 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:25:09,397 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:26:09,688 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:27:09,980 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:28:10,262 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:29:10,522 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:30:10,742 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:31:10,985 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:32:11,238 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:33:11,467 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:34:11,731 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:35:11,968 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:36:12,213 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:37:12,508 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:38:12,747 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:39:12,970 Stage-1 map = 75%,  reduce = 25%
> 2012-04-28 18:40:13,205 Stage-1 map = 75%,  reduce = 25%
> I checked the TaskTracker (TT) log:
> 2012-04-28 18:31:53,879 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:31:56,883 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:31:59,887 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:02,892 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:05,897 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:08,902 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:11,906 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:14,910 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:17,915 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:20,920 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:23,924 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:26,929 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:29,934 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:32,938 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:35,943 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:38,948 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:41,953 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:44,957 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:47,961 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:50,966 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:53,970 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:56,974 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:32:59,979 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:02,983 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:05,987 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:08,992 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:11,997 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:15,001 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:18,006 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:21,011 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:24,015 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:27,020 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:30,025 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:33,029 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:36,034 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:39,038 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:42,043 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:45,047 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:48,051 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:51,057 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> 2012-04-28 18:33:54,062 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201204281725_0002_m_000002_0 0.0%
> [hadoop@mem1 logs]$ jps
> 3282 Child
> 31547 QuorumPeerMain
> 1840 TaskTracker
> 3469 Jps
> 31070 HRegionServer
> 30120 DataNode
> [hadoop@mem1 logs]$ kill 3282
> When I killed the child process, the job continued and completed.
> 2012-04-28 18:40:51,324 Stage-1 map = 88%,  reduce = 25%
> 2012-04-28 18:41:04,364 Stage-1 map = 88%,  reduce = 29%
> 2012-04-28 18:41:31,448 Stage-1 map = 100%,  reduce = 29%
> 2012-04-28 18:41:43,485 Stage-1 map = 100%,  reduce = 100%
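
The stall in the quoted Hive progress output can be measured mechanically instead of by eye. The following is a minimal sketch, assuming the console lines keep the `Stage-N map = X%,  reduce = Y%` shape shown above; the function name `longest_stall` and the regular expression are illustrative and not part of Hive or Hadoop.

```python
import re
from datetime import datetime

# Matches lines such as:
#   2012-04-28 18:24:09,075 Stage-1 map = 75%,  reduce = 25%
# Milliseconds are ignored; a leading "> " quote marker is tolerated.
PROGRESS = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ Stage-\d+ "
    r"map = (\d+)%,\s+reduce = (\d+)%"
)


def longest_stall(lines):
    """Return (start, end, seconds) for the longest run of unchanged progress.

    Returns None if no line matches the expected format.
    """
    best = None
    run_start = None
    previous_state = None
    for line in lines:
        match = PROGRESS.search(line)
        if match is None:
            continue
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        state = (int(match.group(2)), int(match.group(3)))
        if state != previous_state:
            # Progress changed, so a new run starts at this line.
            run_start = when
            previous_state = state
        seconds = (when - run_start).total_seconds()
        if best is None or seconds > best[2]:
            best = (run_start, when, seconds)
    return best
```

Applied to the output above, the longest run is the one at map = 75%, reduce = 25%, which spans 18:24:09 to 18:40:13, roughly 16 minutes, and ends only after the child process is killed.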
