hadoop-mapreduce-issues mailing list archives

From "Ravi Prakash (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
Date Mon, 28 Nov 2011 17:15:40 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158577#comment-13158577 ]

Ravi Prakash commented on MAPREDUCE-3476:

Courtesy of [~amar_kamat]: these APIs need to be investigated for optimization:

1. JobClient.getClusterStatus()
2. clusterStatus.getMaxMapTasks()
3. clusterStatus.getMaxReduceTasks()
4. clusterStatus.getTaskTrackers()
5. o.a.h.mapreduce.Job.mapProgress()
6. o.a.h.mapreduce.Job.reduceProgress()

From another quote:
While improving Gridmix we also got a chance to benchmark a few YARN APIs. Here is the summary:
1. APIs to get map and reduce slot capacity cost ~0 secs.
2. API to get the job's map task progress takes 115 secs in the worst case. Around 8 calls took more than 10 secs. Around 26 calls took more than 5 secs. Around 144 calls took more than 1 sec. There were ~43,883 calls made to this API.
3. API to get the job's reduce task progress takes 16 secs in the worst case. Around 3 calls took more than 10 secs. Around 4 calls took more than 5 secs. Around 34 calls took more than 1 sec. Around 22,446 calls were made to this API.
4. API to get the number of trackers also takes ~0 secs.

The fact that getting the map progress of a single job can take ~115 secs in the worst case is surprising! I guess optimizing the map progress and reduce progress APIs can be the first step.
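
For anyone wanting to reproduce a summary of this shape (worst case, plus counts of calls slower than 1, 5 and 10 secs), here is a minimal sketch of a timing wrapper. It is not the code Gridmix used for the numbers above; the class LatencyBuckets and all of its method names are hypothetical, and it assumes the thresholds are cumulative, so a call slower than 10 secs is also counted as slower than 5 secs and 1 sec.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Hadoop or Gridmix): tracks call latencies. */
final class LatencyBuckets {
    private final long[] thresholdsMillis; // e.g. 1000, 5000, 10000
    private final long[] countsOver;       // countsOver[i] = calls slower than thresholdsMillis[i]
    private long calls;
    private long worstMillis;

    LatencyBuckets(long... thresholdsMillis) {
        this.thresholdsMillis = thresholdsMillis.clone();
        this.countsOver = new long[thresholdsMillis.length];
    }

    /** Records one observed latency in milliseconds. */
    void record(long elapsedMillis) {
        calls++;
        worstMillis = Math.max(worstMillis, elapsedMillis);
        for (int i = 0; i < thresholdsMillis.length; i++) {
            if (elapsedMillis > thresholdsMillis[i]) {
                countsOver[i]++;
            }
        }
    }

    /** Runs the call, records its wall-clock time (even on failure), and returns its result. */
    <T> T time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            record((System.nanoTime() - start) / 1_000_000L);
        }
    }

    long calls() { return calls; }
    long worstMillis() { return worstMillis; }
    long countOver(int thresholdIndex) { return countsOver[thresholdIndex]; }
}
```

Assuming a Job handle named job is available, a progress call could be wrapped as buckets.time(() -> job.mapProgress()) and the counters read after the run. Measuring wall-clock time on the client side includes RPC and network time, which is what a caller actually experiences.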

> Optimize YARN API calls
> -----------------------
>                 Key: MAPREDUCE-3476
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>            Priority: Critical
> Several YARN API calls are taking inordinately long. This might be a performance blocker.


