hadoop-mapreduce-issues mailing list archives

From "Vinod Kumar Vavilapalli (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3511) Counters occupy a good part of AM heap
Date Sat, 07 Jan 2012 20:15:39 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182081#comment-13182081 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3511:

Sid and I ran some more experiments with the AMScalability benchmark (100K 1-second maps) and
found that counters are one of the biggest culprits in the job slowdown. Counters occupy
a lot of heap, which causes very frequent full GCs and slows down the AM.

I have a raw patch that I am cleaning up now which moves the counters storage to the MRV1

Regarding Robert's concern above about the doubling of heap on an incoming RPC for all counters:
I checked and there is only one API which tries to obtain all TaskReports - {{getTaskReports()}}.
This call is *supposed to* be rare and the blowup of heap on that rare event is unavoidable
if we want to optimize for the general case. The best we can do is lock that call so that
only one call can go through to the AM at any time.
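
A rough sketch of what locking that call could look like. This is only an illustration: {{TaskReportGate}} and the string "reports" are hypothetical stand-ins, not the actual AM classes or the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the AM-side handler; not a real Hadoop class. */
class TaskReportGate {
    // One lock guards the expensive call, so at most one caller at a time
    // can trigger the temporary heap blowup from materializing all reports.
    private final ReentrantLock reportLock = new ReentrantLock();
    private final List<String> taskIds;

    TaskReportGate(List<String> taskIds) {
        this.taskIds = new ArrayList<>(taskIds);
    }

    /** Builds a report for every task; concurrent callers wait their turn. */
    List<String> getTaskReports() {
        reportLock.lock();
        try {
            List<String> reports = new ArrayList<>(taskIds.size());
            for (String id : taskIds) {
                // Placeholder for the real per-task report (with counters).
                reports.add("report-for-" + id);
            }
            return reports;
        } finally {
            // Always release, even if building a report throws.
            reportLock.unlock();
        }
    }
}
```

The lock does not make a single call any cheaper; it only bounds the worst case to one full set of reports in flight at a time.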

As for my *supposed to* be rare comment: unfortunately, this API sneaked into the "job -list"
path via MAPREDUCE-2789. None of the commands like "job -list" should go to each AM anyway,
for performance reasons. I will fix it as part of MAPREDUCE-3476.

> Counters occupy a good part of AM heap
> --------------------------------------
>                 Key: MAPREDUCE-3511
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3511
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Blocker
> Per task counters seem to be occupying a good part of an AMs heap. Looks like more than
> 50% of what's used by a TaskAttemptImpl object.
> This could be optimized by interning strings or possibly using mrv1 counters which are
> optimized. Currently counters are converted from mrv1 to mrv2 format for in memory storage.
> The conversion could be delayed till it's actually required for RPC transfers.
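
To make the two ideas in the description concrete (interning counter names, and deferring conversion until an RPC needs it), here is a minimal sketch. {{CompactCounters}}, its pool, and {{toWireFormat()}} are hypothetical names for illustration; they are not the mrv1 or mrv2 counter classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical compact per-task counter store; not a real Hadoop class. */
class CompactCounters {
    // Shared pool: every task attempt reuses one String instance per counter
    // name instead of holding its own copy.
    private static final ConcurrentHashMap<String, String> POOL =
            new ConcurrentHashMap<>();

    /** Returns the canonical instance for this counter name. */
    static String intern(String name) {
        String prior = POOL.putIfAbsent(name, name);
        return prior != null ? prior : name;
    }

    // In-memory form: interned name -> value, nothing else.
    private final Map<String, Long> values = new HashMap<>();

    void increment(String name, long delta) {
        values.merge(intern(name), delta, Long::sum);
    }

    /** Builds the (assumed) wire representation only when an RPC asks for it. */
    Map<String, Long> toWireFormat() {
        return new HashMap<>(values);
    }
}
```

With 100K tasks sharing the same few dozen counter names, the name strings are stored once in the pool rather than once per task, and the heavier wire objects exist only for the duration of a request.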
