hadoop-common-dev mailing list archives

From Thushara Wijeratna <thu...@gmail.com>
Subject debugging task timeouts on 0.19.1
Date Fri, 29 May 2009 16:29:13 GMT
How do I debug a job that is being killed this way?

2009-05-29 01:28:56,672 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200905281652_0006_m_000007_2: Task attempt_200905281652_0006_m_000007_2 failed to report status for 603 seconds. Killing!
2009-05-29 01:28:56,673 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200905281652_0006_m_000007_3' to tip task_200905281652_0006_m_000007, for tracker 'tracker_domU-12-31-38-01-79-93.compute-1.internal:localhost.localdomain/127.0.0.1:33837'
2009-05-29 01:28:56,673 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_200905281652_0006_m_000007
2009-05-29 01:28:56,673 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200905281652_0006_m_000007_2' from 'tracker_domU-12-31-38-01-79-93.compute-1.internal:localhost.localdomain/127.0.0.1:33837'
2009-05-29 01:39:15,008 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200905281652_0006_m_000007_3: Task attempt_200905281652_0006_m_000007_3 failed to report status for 603 seconds. Killing!
2009-05-29 01:39:15,008 INFO org.apache.hadoop.mapred.TaskInProgress: TaskInProgress task_200905281652_0006_m_000007 has failed 4 times.
2009-05-29 01:39:15,009 INFO org.apache.hadoop.mapred.JobInProgress: Aborting job job_200905281652_0006
2009-05-29 01:39:15,009 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_200905281652_0006'

This is a pass-through map/reduce job: the map and reduce code doesn't do
anything except report status via Counters, like:

reporter.incrCounter(Counters.MAP_RECORDS, 1);
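
For context, the map side of such a pass-through job can be sketched roughly as below. This is a self-contained illustration, not the actual job code: the `Reporter` interface here is a hypothetical stand-in that mirrors only the `incrCounter` call shown above (it is not Hadoop's own class), `CountingReporter` and `PassThroughSketch` are made-up names, and records are assumed to be plain strings.

```java
import java.util.EnumMap;
import java.util.Map;

public class PassThroughSketch {
    // Counter enum, named after the one used in the call above.
    enum Counters { MAP_RECORDS }

    // Hypothetical stand-in for the single Reporter call the job makes.
    interface Reporter {
        void incrCounter(Enum<?> key, long amount);
    }

    // In-memory reporter so the sketch runs without a cluster.
    static class CountingReporter implements Reporter {
        final Map<Counters, Long> counts = new EnumMap<>(Counters.class);

        @Override
        public void incrCounter(Enum<?> key, long amount) {
            // Add the amount to the running total for this counter.
            counts.merge((Counters) key, amount, Long::sum);
        }
    }

    // Pass-through map step: count the record, then return it unchanged.
    static String map(String value, Reporter reporter) {
        reporter.incrCounter(Counters.MAP_RECORDS, 1);
        return value;
    }

    public static void main(String[] args) {
        CountingReporter reporter = new CountingReporter();
        for (String record : new String[] {"a", "b", "c"}) {
            map(record, reporter);
        }
        // Prints the number of records seen by map().
        System.out.println(reporter.counts.get(Counters.MAP_RECORDS));
    }
}
```

In this sketch the counter is bumped once per input record, so status is only reported while records are actually flowing through `map`.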

thanks,
thushara
