hadoop-mapreduce-issues mailing list archives

From "Vinod Kumar Vavilapalli (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-3511) Counters occupy a good part of AM heap
Date Mon, 09 Jan 2012 20:01:40 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3511:
-----------------------------------------------

    Attachment: MAPREDUCE-3511-20120109.txt

Thanks for looking into it, Robert!

bq. If the old counter API is going to be the long term fix then perhaps we should not mark
it as deprecated any more.
The counters that I am now using are from {{mapreduce.Counters}} which aren't deprecated.
Irrespective of this, it makes sense to undeprecate the other old v1 stuff (mapred.*); I'll propose
and merge MAPREDUCE-1735 into trunk/23 also.

bq. [...] a minor performance improvement [...] I would prefer to see the datum start out
as null, and only have its fields set if it is not null, inside getDatum.
Not sure if {{getDatum()}} will be called multiple times as each event will be logged only
once. Makes sense to implement your proposal anyways just to be sure. Done.
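
A minimal sketch of the lazy-datum idea as I read it, not the code from the patch: the datum starts out as null and its fields are populated only on the first {{getDatum()}} call. The class names {{Datum}} and {{LazyEvent}} and their fields are hypothetical stand-ins, not Hadoop classes.

```java
// Hypothetical stand-in for a serializable record; not a real Hadoop class.
final class Datum {
    String taskId;
    long finishTime;
}

// Hypothetical event type. Values are held as plain fields and copied into
// the datum only when getDatum() is first called.
final class LazyEvent {
    private final String taskId;
    private final long finishTime;
    private Datum datum; // starts out as null

    LazyEvent(String taskId, long finishTime) {
        this.taskId = taskId;
        this.finishTime = finishTime;
    }

    // Builds the datum on first use and returns the same instance afterwards.
    Datum getDatum() {
        if (datum == null) {
            datum = new Datum();
            datum.taskId = taskId;
            datum.finishTime = finishTime;
        }
        return datum;
    }

    // Exposed only so the sketch can show when the datum gets built.
    boolean isDatumBuilt() {
        return datum != null;
    }
}
```

If {{getDatum()}} is called only once per event, this saves nothing in allocations, but it also costs nothing, and events that are never logged never build a datum.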

bq. Why was TokenCache.java modified at all? It does not seem to be related to this JIRA.
It isn't. But without that, due to HADOOP-7963/MAPREDUCE-3639 the patch couldn't be tested
on a cluster. I'll revert those changes from the patch.

bq. You added a TODO in CompletedTask.java and CompletedJob.java
Done, added that to remind myself to avoid any clones :) Removing the clones now.

bq. good catch on TestHsWebServicesTasks.java, TestAMWebServicesAttempts.java, TestHsWebServicesTasks.java

Am surprised Jenkins didn't catch these.

bq. I also don't think we need any more tests, because all we are doing is reducing memory
usage, which is very hard to write a unit test for.
Yes, +1 :)

bq. Inside JobHistoryEventHandler.java you added in // TODO: Only job-counters is enough?
How about the myriad clones in this code-path? Is this TODO still needed?
Avoided that extra clone for a finished Job.
                
> Counters occupy a good part of AM heap
> --------------------------------------
>
>                 Key: MAPREDUCE-3511
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3511
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Blocker
>             Fix For: 0.23.1
>
>         Attachments: MAPREDUCE-3511-20120107.1.txt, MAPREDUCE-3511-20120109.txt
>
>
> Per-task counters seem to be occupying a good part of an AM's heap. Looks like more than
> 50% of what's used by a TaskAttemptImpl object.
> This could be optimized by interning strings or possibly using mrv1 counters which are
> optimized. Currently counters are converted from mrv1 to mrv2 format for in-memory storage.
> The conversion could be delayed till it's actually required for RPC transfers.
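
For illustration only, a small sketch of the string-interning option mentioned in the description. It relies on the standard {{String.intern()}} method; the class {{CounterNameInterning}} and the counter name used below are hypothetical, and this is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class CounterNameInterning {
    // Creates a distinct String object at runtime, the way deserializing a
    // counter name for each task attempt would.
    static String freshCopy(String s) {
        return new String(s.toCharArray());
    }

    // Replaces each name with its canonical interned instance, so equal
    // names share one String object instead of one per task attempt.
    static List<String> internAll(List<String> names) {
        List<String> out = new ArrayList<>(names.size());
        for (String n : names) {
            out.add(n.intern());
        }
        return out;
    }
}
```

With thousands of task attempts each holding the same few dozen counter names, sharing one instance per name removes the duplicate copies, though the per-counter objects themselves remain.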

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
