hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1374) TaskTracker falls into an infinite loop.
Date Sat, 26 May 2007 01:36:16 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499270 ]

Konstantin Shvachko commented on HADOOP-1374:
---------------------------------------------

Debugging a TaskTracker stuck in the loop on Linux.
It falls into the loop once all 8 maps are done. The reduce never finishes.
Both nodes are sending heartbeats every 10 secs; neither is dying.
This is what the WebUI shows for the bad TaskTracker:

Running tasks
Task Attempts	Status 	Progress	Errors
task_0001_r_000000_1	RUNNING	16.66%	

Non-Running Tasks
Task Attempts	Status
task_0001_m_000004_0	SUCCEEDED
task_0001_m_000007_0	SUCCEEDED
task_0001_m_000003_0	SUCCEEDED
task_0001_m_000006_0	SUCCEEDED

I put a breakpoint in org.apache.hadoop.ipc.Server.Handler.run(), where the calls are processed, at
value = call(call.param);             // make the call

I see it is processing only the following 3 calls.

ping(task_0001_r_000000_1) from 66.22.15.15:58122
progress(task_0001_r_000000_1, 0.16666667, reduce > copy (4 of 8 at 0.00 MB/s) > , SHUFFLE, org.apache.hadoop.mapred.Counters@b0f534) from 66.22.15.15:58122
getMapCompletionEvents(job_0001, 8, 50) from 66.22.15.15:58122

Hope this helps.

> TaskTracker falls into an infinite loop.
> ----------------------------------------
>
>                 Key: HADOOP-1374
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1374
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Konstantin Shvachko
>         Assigned To: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: DataNode1.log, DataNode2.log, JobTracker.log, NameNode.log, TaskTracker1.log, TaskTracker2.log, TestDFSIO.log
>
>
> All maps had been completed successfully. I had only one reduce task during which
> TaskTracker infinitely outputs:
> 07/05/15 19:35:41 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:42 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:43 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:44 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >
> 07/05/15 19:35:45 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) >
> JobTracker does not log anything about task task_0001_r_000000_0 except for
> 07/05/15 19:49:01 INFO mapred.JobTracker: Adding task 'task_0001_r_000000_0' to tip tip_0001_r_000000, for tracker 'tracker_my-host.com:50050'
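
The repeated TaskTracker progress line in the quoted description is the visible symptom of the loop. As a rough aid for spotting it in a TaskTracker log, the Java sketch below counts the longest run of consecutive progress lines that report the same task, percentage, and status. The class name StallDetector, the method longestRepeat, and the regular expression are hypothetical: they are inferred only from the log lines quoted above and are not part of Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StallDetector {
    // Assumed line shape, taken from the quoted log:
    // <date> <time> INFO mapred.TaskTracker: <task id> <progress>% <status>
    private static final Pattern PROGRESS = Pattern.compile(
        "INFO mapred\\.TaskTracker: (task_\\S+) (\\S+)% (.*)$");

    /**
     * Returns the length of the longest run of consecutive progress lines
     * whose task id, progress value, and status text are all identical.
     * Lines that do not look like progress reports are ignored.
     */
    static int longestRepeat(List<String> lines) {
        int best = 0;
        int run = 0;
        String previous = null;
        for (String line : lines) {
            Matcher m = PROGRESS.matcher(line);
            if (!m.find()) {
                continue; // not a progress line
            }
            // Timestamps are deliberately left out of the comparison key.
            String key = m.group(1) + "|" + m.group(2) + "|" + m.group(3).trim();
            run = key.equals(previous) ? run + 1 : 1;
            previous = key;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "07/05/15 19:35:41 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > ",
            "07/05/15 19:35:42 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > ",
            "07/05/15 19:35:43 INFO mapred.TaskTracker: task_0001_r_000000_0 0.16666667% reduce > copy (4 of 8 at 0.00 MB/s) > ");
        System.out.println(longestRepeat(log)); // prints 3
    }
}
```

A long run by itself does not prove a hang, since a slow copy would also repeat its status for a while. What threshold counts as stalled is left to the reader.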

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

