hadoop-mapreduce-dev mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1909) TrackerDistributedCacheManager takes a blocking lock for a loop that executes 10K times
Date Thu, 01 Jul 2010 21:50:52 GMT
TrackerDistributedCacheManager takes a blocking lock for a loop that executes 10K times
---------------------------------------------------------------------------------------

                 Key: MAPREDUCE-1909
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1909
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Dick King
            Assignee: Dick King


In {{TrackerDistributedCacheManager.java}}, in the portion where the cache is cleaned up, the
lock is taken on the main hash table and then every entry is scanned to see whether it can
be deleted.  That is a long time to hold the lock: the table is likely to have 10K entries.
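To illustrate, the cleanup currently has roughly this shape (a simplified sketch, not the actual
Hadoop source; the class and method names here are made up, and {{CacheStatus}} is reduced to just
its {{refcount}}):

{code:java}
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class CurrentCleanupSketch {

  static class CacheStatus {
    int refcount;
    // localized path, size on disk, etc. elided
  }

  private final Map<String, CacheStatus> cachedArchives =
      new HashMap<String, CacheStatus>();

  // Cache cleanup as it stands: the lock is held for the whole scan.
  void cleanCache() {
    synchronized (cachedArchives) {
      Iterator<CacheStatus> it = cachedArchives.values().iterator();
      while (it.hasNext()) {               // ~10K iterations under the lock
        CacheStatus status = it.next();
        if (status.refcount == 0) {
          it.remove();
          // delete this entry's local files, still under the lock
        }
      }
    }
  }
}
{code}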

I would like to reduce the longest lock duration by maintaining the set of {{CacheStatus}}es
to delete incrementally, as follows:

1: Let there be a new {{HashSet}}, {{deleteSet}}, that's protected under {{synchronized(cachedArchives)}}

2: When {{refcount}} is decreased to 0, move the {{CacheStatus}} from {{cachedArchives}} to
{{deleteSet}}

3: When seeking an existing {{CacheStatus}}, look in {{deleteSet}} if it isn't in {{cachedArchives}}

4: When {{refcount}} is increased from 0 to 1 in a pre-existing {{CacheStatus}} [see 3, above],
move the {{CacheStatus}} from {{deleteSet}} to {{cachedArchives}}

5: When we clean the cache, under {{synchronized(cachedArchives)}} , move {{deleteSet}} to
a local variable and create a new empty {{HashSet}}.  This is constant time.
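Putting steps 1 through 5 together, a rough sketch of the scheme might look like the following.
This is hypothetical, not a patch: the method names are made up, and {{deleteSet}} is shown as a
map keyed like {{cachedArchives}} so that the lookup in step 3 works (a {{HashSet}} plus a key
stored inside {{CacheStatus}} would do as well).

{code:java}
import java.util.HashMap;
import java.util.Map;

class ProposedCleanupSketch {

  static class CacheStatus {
    int refcount;
    // localized path, size on disk, etc. elided
  }

  private final Map<String, CacheStatus> cachedArchives =
      new HashMap<String, CacheStatus>();
  // 1: new collection for zero-refcount entries, guarded by the same lock
  private Map<String, CacheStatus> deleteSet =
      new HashMap<String, CacheStatus>();

  // Called when a task stops using an entry.
  void releaseCache(String key) {
    synchronized (cachedArchives) {
      CacheStatus status = cachedArchives.get(key);
      if (status != null && --status.refcount == 0) {
        // 2: refcount dropped to 0 -- move the entry into deleteSet
        cachedArchives.remove(key);
        deleteSet.put(key, status);
      }
    }
  }

  // Called when a task wants an entry; returns null if it is not cached.
  CacheStatus getCacheStatus(String key) {
    synchronized (cachedArchives) {
      CacheStatus status = cachedArchives.get(key);
      if (status == null) {
        // 3: not in cachedArchives, so look in deleteSet
        status = deleteSet.remove(key);
        if (status != null) {
          // 4: refcount goes from 0 back to 1 -- move the entry back
          cachedArchives.put(key, status);
        }
      }
      if (status != null) {
        status.refcount++;
      }
      return status;
    }
  }

  // Cache cleanup: the lock is now held only for a constant-time swap.
  void cleanCache() {
    Map<String, CacheStatus> toDelete;
    synchronized (cachedArchives) {
      // 5: move deleteSet to a local variable and start a new empty one
      toDelete = deleteSet;
      deleteSet = new HashMap<String, CacheStatus>();
    }
    for (CacheStatus status : toDelete.values()) {
      // delete this entry's local files, outside the lock
    }
  }
}
{code}

The file deletions themselves then run outside the lock, so the longest lock duration is the
constant-time swap plus whatever individual get/release calls cost.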

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

