hadoop-mapreduce-issues mailing list archives

From "Thomas Graves (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4752) Reduce MR AM memory usage through String Interning
Date Wed, 31 Oct 2012 15:05:12 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-4752:

          Resolution: Fixed
       Fix Version/s: 0.23.5
    Target Version/s: 2.0.3-alpha, 0.23.5  (was: 0.23.5)
              Status: Resolved  (was: Patch Available)
> Reduce MR AM memory usage through String Interning
> --------------------------------------------------
>                 Key: MAPREDUCE-4752
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4752
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>             Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>         Attachments: MR-4752-branch-0.23.txt, MR-4752-branch-0.23.txt, MR-4752-trunk.txt,
> There are a lot of duplicate strings in the AM.  They come from all of the PB events
> that come across the wire and from tasks heart-beating in through the umbilical.  There
> are even several duplicates from Configuration.  By "interning" all of these strings on
> the heap I have been able to reduce the resting memory usage of the AM to about 5KB per
> task attempt, with about half of this coming from counters.  This results in a 5MB heap
> for a typical 1,000 task job, or a 500MB heap for a 100,000 task attempt job.  I think I
> could cut the size of the counters in half by completely rewriting how counters work in
> the AM and History Server, but I don't think it is worth it at this point.
> I am still investigating what the memory usage of the AM is like when running very large
> jobs, and I will probably have a follow-up JIRA for reducing that memory usage as well.
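
To illustrate the interning idea described above, here is a minimal Java sketch. It is not the code from the attached patches: the class name InternSketch, the pool implementation, and the sample string "RUNNING" are all assumptions made for illustration. It shows that equal strings decoded separately occupy separate heap objects until they are mapped to one canonical instance, and it reproduces the heap arithmetic from the description, taking 1KB as 1000 bytes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of string interning; not the MAPREDUCE-4752 patch. */
public class InternSketch {
    // One canonical copy of each distinct string seen so far.
    // Assumption: strong references, so entries are never released.
    private static final ConcurrentMap<String, String> POOL = new ConcurrentHashMap<>();

    /** Returns one shared instance for every string equal to s. */
    public static String intern(String s) {
        if (s == null) {
            return null;
        }
        String prior = POOL.putIfAbsent(s, s);
        return prior != null ? prior : s;
    }

    /** Resting heap estimate in bytes: task attempts times bytes per attempt. */
    public static long estimateHeapBytes(long taskAttempts, long bytesPerAttempt) {
        return taskAttempts * bytesPerAttempt;
    }

    public static void main(String[] args) {
        // Two equal strings built separately, as if decoded from two messages.
        String a = new String("RUNNING");
        String b = new String("RUNNING");
        System.out.println(a == b);                  // false: two heap objects
        System.out.println(intern(a) == intern(b));  // true: one shared object

        // Figures from the description: about 5KB per task attempt.
        System.out.println(estimateHeapBytes(1_000, 5_000));    // 5000000
        System.out.println(estimateHeapBytes(100_000, 5_000));  // 500000000
    }
}
```

Because the pool in this sketch holds strong references, it keeps every string alive for the life of the process. A long-running service would more likely want a pool that lets unused strings be garbage collected.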

