hadoop-mapreduce-issues mailing list archives

From "Mayank Bansal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
Date Tue, 02 Jul 2013 17:59:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698042#comment-13698042 ]

Mayank Bansal commented on MAPREDUCE-5368:
------------------------------------------

The default load factor is already 0.75.
The default concurrency level is 16, which I think is reasonable for these jobs.
The default initial capacity is also 16, which is reasonable as well.
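
For reference, here is a minimal sketch of what those defaults look like when spelled out through the three-argument ConcurrentHashMap constructor. The class name, the method names and the String key/value types are stand-ins chosen for illustration (the real maps use TaskAttemptID keys); only the three numbers come from the comment above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: class and method names are hypothetical, not from the patch.
final class ChmDefaultsSketch {
    // Default values as stated in the comment above.
    static final int INITIAL_CAPACITY = 16;
    static final float LOAD_FACTOR = 0.75f;
    static final int CONCURRENCY_LEVEL = 16;

    // No-arg constructor: relies on the library defaults.
    static Map<String, String> implicitDefaults() {
        return new ConcurrentHashMap<String, String>();
    }

    // Three-arg constructor with the same values written out explicitly.
    static Map<String, String> explicitDefaults() {
        return new ConcurrentHashMap<String, String>(
                INITIAL_CAPACITY, LOAD_FACTOR, CONCURRENCY_LEVEL);
    }

    public static void main(String[] args) {
        Map<String, String> a = implicitDefaults();
        Map<String, String> b = explicitDefaults();
        // String stand-ins for a TaskAttemptID key and a Locality value.
        a.put("attempt_0", "NODE_LOCAL");
        b.put("attempt_0", "NODE_LOCAL");
        // Both maps hold the same mappings; only the sizing hints differ.
        System.out.println(a.equals(b));
    }
}
```

Any memory saving would therefore have to come from passing smaller values than these to the same constructor.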

I am not sure how we are saving memory here. Can you please explain a bit?

Moreover, I really don't think we should set the concurrency level so low, as it will
significantly increase contention between threads.

Thoughts?

Thanks,
Mayank

                
> Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
> -------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5368
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1
>    Affects Versions: 1.2.0
>            Reporter: zhaoyunjiong
>             Fix For: 1.2.1
>
>         Attachments: MAPREDUCE-5368.patch
>
>
> Below is a heap histogram from our JobTracker:
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:     136048824    11347237456  [C
>    2:     124156992     5959535616  java.util.concurrent.locks.ReentrantLock$NonfairSync
>    3:     124156973     5959534704  java.util.concurrent.ConcurrentHashMap$Segment
>    4:     135887753     5435510120  java.lang.String
>    5:     124213692     3975044400  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
>    6:      63777311     3061310928  java.util.HashMap$Entry
>    7:      35038252     2803060160  java.util.TreeMap
>    8:      16921110     2712480072  [Ljava.util.HashMap$Entry;
>    9:       4803617     2420449192  [Ljava.lang.Object;
>   10:      50392816     2015712640  org.apache.hadoop.mapred.Counters$Counter
>   11:       7775438     1181866576  [Ljava.util.concurrent.ConcurrentHashMap$Segment;
>   12:       3882847     1118259936  org.apache.hadoop.mapred.TaskInProgress
> ConcurrentHashMap takes more than 14 GB (5959535616 + 5959534704 + 3975044400 bytes).
> The culprits are the following lines in TaskInProgress.java:
>   Map<TaskAttemptID, Locality> taskLocality = 
>       new ConcurrentHashMap<TaskAttemptID, Locality>();
>   Map<TaskAttemptID, Avataar> taskAvataar = 
>       new ConcurrentHashMap<TaskAttemptID, Avataar>();
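
The arithmetic behind the figures quoted above can be checked with a short sketch. All numbers are copied from the histogram; the class and method names are hypothetical. The ratios assume that each ConcurrentHashMap$Segment[] array belongs to exactly one ConcurrentHashMap, which matches the segment-based implementation the histogram shows but is an assumption here, not something stated in the issue.

```java
// Illustrative arithmetic only; names are hypothetical.
final class HistoArithmetic {
    // Byte counts from the histogram rows 2, 3 and 5.
    static final long SYNC_BYTES = 5959535616L;        // ReentrantLock$NonfairSync
    static final long SEGMENT_BYTES = 5959534704L;     // ConcurrentHashMap$Segment
    static final long ENTRY_ARRAY_BYTES = 3975044400L; // ConcurrentHashMap$HashEntry[]

    // Instance counts from the histogram rows 3, 11 and 12.
    static final long SEGMENT_INSTANCES = 124156973L;
    static final long SEGMENT_ARRAY_INSTANCES = 7775438L;
    static final long TASK_IN_PROGRESS_INSTANCES = 3882847L;

    // Sum of the three ConcurrentHashMap-related rows, in bytes.
    static long totalBytes() {
        return SYNC_BYTES + SEGMENT_BYTES + ENTRY_ARRAY_BYTES;
    }

    // The same total expressed in GiB (2^30 bytes).
    static double totalGiB() {
        return totalBytes() / (1024.0 * 1024.0 * 1024.0);
    }

    // Average number of segments per segment array, i.e. per map.
    static double segmentsPerMap() {
        return (double) SEGMENT_INSTANCES / SEGMENT_ARRAY_INSTANCES;
    }

    // Average number of maps per TaskInProgress instance.
    static double mapsPerTaskInProgress() {
        return (double) SEGMENT_ARRAY_INSTANCES / TASK_IN_PROGRESS_INSTANCES;
    }

    public static void main(String[] args) {
        System.out.println("total bytes:       " + totalBytes());
        System.out.println("total GiB:         " + totalGiB());
        System.out.println("segments per map:  " + segmentsPerMap());
        System.out.println("maps per TIP:      " + mapsPerTaskInProgress());
    }
}
```

The ratios come out at roughly 16 segments per map and roughly 2 maps per TaskInProgress, which is consistent with the default concurrency level of 16 and with the two map fields shown above.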

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
