hadoop-mapreduce-dev mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4752) Reduce MR AM memory usage through String Interning
Date Fri, 26 Oct 2012 14:49:11 GMT
Robert Joseph Evans created MAPREDUCE-4752:

             Summary: Reduce MR AM memory usage through String Interning
                 Key: MAPREDUCE-4752
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4752
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv2
            Reporter: Robert Joseph Evans
            Assignee: Robert Joseph Evans

The AM holds a lot of strings that are duplicates of one another.  They come from
all of the PB events that come across the wire and from tasks heart-beating in through the
umbilical.  There are even several duplicates from Configuration.  By "interning" all of these
strings on the heap, I have been able to reduce the resting memory usage of the AM to about
5KB per task attempt, with about half of this coming from counters.  This results in a 5MB
heap for a typical 1,000-task job, or a 500MB heap for a 100,000-task-attempt job.  I think
I could cut the size of the counters in half by completely rewriting how counters work in
the AM and History Server, but I don't think it is worth it at this point.
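
To illustrate what interning strings on the heap can look like, here is a minimal sketch.
It is not the code from the patch: the class name HeapStringInterner and its methods are
hypothetical, and the sketch assumes a simple strong-reference pool backed by a
ConcurrentHashMap, so that equal strings arriving from different events share one instance.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not the class used in the actual patch.
final class HeapStringInterner {
    // Maps each distinct string value to its single canonical instance.
    // Assumption: strong references are acceptable, so entries are never evicted.
    private static final ConcurrentHashMap<String, String> POOL = new ConcurrentHashMap<>();

    private HeapStringInterner() {}

    // Returns the canonical instance equal to s, registering s if none exists yet.
    static String intern(String s) {
        if (s == null) {
            return null;
        }
        String existing = POOL.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct strings currently held in the pool.
    static int size() {
        return POOL.size();
    }

    public static void main(String[] args) {
        // Two equal strings deserialized separately are distinct objects.
        String a = new String("FILE_BYTES_READ");
        String b = new String("FILE_BYTES_READ");
        System.out.println(a == b);                 // false: two copies on the heap
        System.out.println(intern(a) == intern(b)); // true: one shared copy
        System.out.println(size());                 // 1
    }
}
```

A strong pool like this never releases its entries, so in a long-running process a
weak-reference interner may be a better fit.  The per-job figures above follow directly
from the per-attempt cost: 5KB x 1,000 is about 5MB, and 5KB x 100,000 is about 500MB.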

I am still investigating what the memory usage of the AM is like when running very large jobs,
and I will probably have a follow-up JIRA for reducing that memory usage as well.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
