TrackerDistributedCacheManager takes a blocking lock for a loop that executes 10K times
--------------------------------------------------------------------------------------
Key: MAPREDUCE-1909
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1909
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Dick King
Assignee: Dick King
In {{TrackerDistributedCacheManager.java}}, in the portion where the cache is cleaned up, the
lock is taken on the main hash table and then every entry is scanned to see whether it can
be deleted. That lock is held for a long time; the table is likely to have 10K entries.
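For reference, a minimal sketch of the current pattern, with hypothetical names (only {{cachedArchives}}, {{refcount}}, and {{CacheStatus}} come from the issue; the class and method names here are illustrative):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: the cleanup scan runs entirely inside the lock,
// so with ~10K entries, other threads that need cachedArchives block
// for the whole pass.
class LongLockCleanup {
  static class CacheStatus { int refcount; }

  private final Map<String, CacheStatus> cachedArchives = new HashMap<>();

  void deleteCache() {
    synchronized (cachedArchives) {
      Iterator<Map.Entry<String, CacheStatus>> it =
          cachedArchives.entrySet().iterator();
      while (it.hasNext()) {            // O(table size) under the lock
        Map.Entry<String, CacheStatus> e = it.next();
        if (e.getValue().refcount == 0) {
          it.remove();                  // plus the actual file deletion
        }
      }
    }
  }

  // Test helpers, not part of the pattern being illustrated.
  void put(String key, int refcount) {
    CacheStatus s = new CacheStatus();
    s.refcount = refcount;
    synchronized (cachedArchives) { cachedArchives.put(key, s); }
  }

  int size() {
    synchronized (cachedArchives) { return cachedArchives.size(); }
  }
}
```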
I would like to reduce the longest lock duration by maintaining the set of {{CacheStatus}}es
to delete incrementally:
1: Let there be a new {{HashSet}}, {{deleteSet}}, that's protected under {{synchronized(cachedArchives)}}
2: When {{refcount}} is decreased to 0, move the {{CacheStatus}} from {{cachedArchives}} to
{{deleteSet}}
3: When seeking an existing {{CacheStatus}}, look in {{deleteSet}} if it isn't in {{cachedArchives}}
4: When {{refcount}} is increased from 0 to 1 in a pre-existing {{CacheStatus}} [see 3, above],
move the {{CacheStatus}} from {{deleteSet}} to {{cachedArchives}}
5: When we clean the cache, under {{synchronized(cachedArchives)}} , move {{deleteSet}} to
a local variable and create a new empty {{HashSet}}. This is constant time.
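Steps 1 through 5 can be sketched as follows. This is a hypothetical outline, not the actual patch; only {{cachedArchives}}, {{deleteSet}}, {{refcount}}, and {{CacheStatus}} come from the issue, and {{deleteSet}} is keyed like {{cachedArchives}} here (the issue proposes a {{HashSet}}) so that the step-3 lookup stays constant-time:

```java
import java.util.HashMap;
import java.util.Map;

class DistributedCacheSketch {
  static class CacheStatus {
    int refcount;
    CacheStatus(int refcount) { this.refcount = refcount; }
  }

  // Both structures are guarded by synchronized (cachedArchives)  [step 1].
  private final Map<String, CacheStatus> cachedArchives = new HashMap<>();
  private Map<String, CacheStatus> deleteSet = new HashMap<>();

  // Step 2: a refcount that drops to 0 moves the entry to deleteSet.
  void release(String key) {
    synchronized (cachedArchives) {
      CacheStatus status = cachedArchives.get(key);
      if (status != null && --status.refcount == 0) {
        cachedArchives.remove(key);
        deleteSet.put(key, status);
      }
    }
  }

  // Steps 3 and 4: look in deleteSet when the entry is not in
  // cachedArchives, and revive it on a 0 -> 1 refcount transition.
  CacheStatus acquire(String key) {
    synchronized (cachedArchives) {
      CacheStatus status = cachedArchives.get(key);
      if (status == null) {
        status = deleteSet.remove(key);
        if (status == null) {
          status = new CacheStatus(0);   // newly cached entry
        }
        cachedArchives.put(key, status);
      }
      status.refcount++;
      return status;
    }
  }

  // Step 5: swap deleteSet out under the lock in constant time; the
  // caller then deletes the victims without holding the lock.
  Map<String, CacheStatus> drainDeleteSet() {
    synchronized (cachedArchives) {
      Map<String, CacheStatus> victims = deleteSet;
      deleteSet = new HashMap<>();
      return victims;
    }
  }
}
```

The longest critical section is now a single-entry map operation or a pointer swap, instead of a scan over the whole table.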