hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-2114) user finer grained locks in JT getCounters implementation
Date Thu, 07 Oct 2010 00:55:31 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated MAPREDUCE-2114:
-----------------------------------------

    Description: 
We are bottlenecked on the JobTracker lock on our largest cluster. One pattern I have seen is
the following:

- The JT acquires the JobTracker lock, but is then blocked waiting on the JIP (JobInProgress) lock:

java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1028)
  waiting to lock <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)
at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4403)
at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3444)
  locked <0x00002aab6ebb6640> (a org.apache.hadoop.mapred.JobTracker)

- The JIP lock is typically held by a getCounters call:

  locked <0x00002aaaf88beff8> (a org.apache.hadoop.mapred.Counters$Group)
at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:445)
  locked <0x00002aaaf88bb948> (a org.apache.hadoop.mapred.Counters)
at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1253)
at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:1240)
  locked <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)

The solution seems simple: to summarize the counters for all tasks, we only need to lock
one task's counters at a time. We don't need to lock the entire job.
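
A minimal sketch of the idea, not the actual patch: the classes below (TaskCounters, Job, getCountersCoarse, getCountersFine) are hypothetical stand-ins for Hadoop's Counters and JobInProgress, and the brief job lock used to copy the task list is an assumption of this sketch. It contrasts holding the job lock for the whole aggregation with locking one task's counters at a time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FineGrainedCounters {

    /** Hypothetical stand-in for one task's counters (not Hadoop's Counters class). */
    static final class TaskCounters {
        private final Map<String, Long> values = new HashMap<>();

        synchronized void increment(String name, long delta) {
            values.merge(name, delta, Long::sum);
        }

        /** Adds this task's counters into total while holding only this task's lock. */
        synchronized void addTo(Map<String, Long> total) {
            for (Map.Entry<String, Long> e : values.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
    }

    /** Hypothetical stand-in for JobInProgress. */
    static final class Job {
        private final List<TaskCounters> tasks = new ArrayList<>();

        synchronized TaskCounters addTask() {
            TaskCounters t = new TaskCounters();
            tasks.add(t);
            return t;
        }

        /** Coarse version: the job lock is held for the entire aggregation. */
        synchronized Map<String, Long> getCountersCoarse() {
            Map<String, Long> total = new HashMap<>();
            for (TaskCounters t : tasks) {
                t.addTo(total);
            }
            return total;
        }

        /** Finer-grained version: the job lock is held only to copy the task list. */
        Map<String, Long> getCountersFine() {
            List<TaskCounters> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(tasks);
            }
            Map<String, Long> total = new HashMap<>();
            for (TaskCounters t : snapshot) {
                t.addTo(total); // locks one task's counters at a time
            }
            return total;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        TaskCounters a = job.addTask();
        TaskCounters b = job.addTask();
        a.increment("MAP_INPUT_RECORDS", 5);
        b.increment("MAP_INPUT_RECORDS", 7);
        b.increment("SPILLED_RECORDS", 2);
        System.out.println(job.getCountersFine().get("MAP_INPUT_RECORDS"));
    }
}
```

The trade-off in this sketch is that the fine-grained total is no longer a single atomic snapshot across all tasks: a task may update its counters after it has been summed. For a monitoring call such as getCounters that is likely acceptable, but that judgment is an assumption here.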

  was:
We are bound on the JobTracker lock on our largest cluster. One pattern i have seen is the
following:

- JT acquires JobTracker lock - but blocked on JIP lock:

java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1028)
- waiting to lock <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)
at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4403)
at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3444)
- locked <0x00002aab6ebb6640> (a org.apache.hadoop.mapred.JobTracker)

- the JIP lock is typically held by a getcounters call:

- locked <0x00002aaaf88beff8> (a org.apache.hadoop.mapred.Counters$Group)
at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:445)
- locked <0x00002aaaf88bb948> (a org.apache.hadoop.mapred.Counters)
at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1253)
at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:1240)
- locked <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)

the solution seems simple. in order to summarize the counters for all tasks - we need to only
lock one task's counters at a time. we don't need to lock the entire job. 


> user finer grained locks in JT getCounters implementation
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-2114
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2114
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>            Reporter: Joydeep Sen Sarma
>
> We are bottlenecked on the JobTracker lock on our largest cluster. One pattern I have seen is the following:
> - The JT acquires the JobTracker lock, but is then blocked waiting on the JIP (JobInProgress) lock:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1028)
>   waiting to lock <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)
> at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4403)
> at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3444)
>   locked <0x00002aab6ebb6640> (a org.apache.hadoop.mapred.JobTracker)
> - The JIP lock is typically held by a getCounters call:
>   locked <0x00002aaaf88beff8> (a org.apache.hadoop.mapred.Counters$Group)
> at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:445)
>   locked <0x00002aaaf88bb948> (a org.apache.hadoop.mapred.Counters)
> at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1253)
> at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:1240)
>   locked <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)
> The solution seems simple: to summarize the counters for all tasks, we only need to lock one task's counters at a time. We don't need to lock the entire job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

