hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sreekanth Ramakrishnan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-964) Inaccurate values in jobSummary logs
Date Fri, 18 Sep 2009 06:54:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757008#action_12757008
] 

Sreekanth Ramakrishnan commented on MAPREDUCE-964:
--------------------------------------------------

The reason for the negative values being present in the JobSummary results from the taskStatus
not having the finishTime field not being set in the taskStatus. This happens especially when
the tasks which are being killed when they just have finished, resulting in a race condition.
If look at the attempt which is mentioned has actually finished and same time a kill task
action has been recv for same attempt at the same time.


Had modified the code to put in the debug statement following is example of task which upset
the metering:

{noformat}
2009-09-18 06:18:32,001 INFO org.apache.hadoop.mapred.JobInProgress: TaskDebug attemptId :
attempt_200909180346_0004_m_000575_0 slots : SLOTS_MILLIS_MAPS tip.numslots is: 1 difference
to add : -1253254705577 status start : 1253254705577 status end time : 0 
{noformat}


TT logs :
{noformat}
2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask):
attempt_200909180346_0004_m_000575_0 task's state:UNASSIGNED
2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0
which needs 1 slots
2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current
free slots : 1 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots
2009-09-18 06:18:02,750 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_1883468461
given task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:09,034 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0
0.18553433% xxxx/my_reliability_test_input/part-00032:335544320+67108864
2009-09-18 06:18:12,040 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0
0.25399202% xxx/my_reliability_test_input/part-00032:335544320+67108864
2009-09-18 06:18:15,317 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0
0.3378629% xxx/my_reliability_test_input/part-00032:335544320+67108864
2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction
for task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:20,413 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0
done; removing files.
2009-09-18 06:18:20,415 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0
not found in cache
2009-09-18 06:18:25,511 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0
done; removing files.
2009-09-18 06:18:25,515 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask):
attempt_200909180346_0004_m_000575_0 task's state:FAILED_UNCLEAN
2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0
which needs 1 slots
2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current
free slots : 2 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots
2009-09-18 06:18:26,651 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_-1834354161
given task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction
for task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0
2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0
0.0% 
2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0
Ignoring status-update since runState: FAILED
2009-09-18 06:18:31,989 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0
done; removing files.
2009-09-18 06:18:31,990 WARN org.apache.hadoop.ipc.Server: IPC Server Responder, call statusUpdate(attempt_200909180346_0004_m_000575_0,
org.apache.hadoop.mapred.MapTaskStatus@4e69b84c) from 127.0.0.1:51146: output error
2009-09-18 06:18:31,995 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0
not found in cache
2009-09-18 06:18:34,992 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0
done; removing files.
{noformat}

> Inaccurate values in jobSummary logs
> ------------------------------------
>
>                 Key: MAPREDUCE-964
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-964
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Rajiv Chittajallu
>            Priority: Critical
>
> For some jobs the mapSlotSeconds is incorrect.
> negative value
> 09/09/01 18:31:44 INFOmapred.JobInProgress$JobSummary: jobId=job_200908270718_4568,submitTime=1251823543976,launchTime=1251823554310,finishTime=1251829904565,
           numMaps=7965,numSlotsPerMap=1,numReduces=40,numSlotsPerReduce=1,user=wile,queue=runner,status=SUCCEEDED,
        mapSlotSeconds=-2503133523,reduceSlotsSeconds=186536,clusterMapCapacity=11262,clusterReduceCapacity=3754
> or too high
> 09/09/02 23:59:57 INFO mapred.JobInProgress$JobSummary: jobId=job_200908270718_5861,submitTime=1251935672924,launchTime=1251935687698,finishTime=1251935997949,
           numMaps=1026,numSlotsPerMap=1,numReduces=10,numSlotsPerReduce=1,user=dfsload,queue=gridops,status=SUCCEEDED,
        
> mapSlotSeconds=1251949742,reduceSlotsSeconds=537,clusterMapCapacity=11262,clusterReduceCapacity=3754

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message