hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1017) Optimization: Reduce Overhead from ReflectionUtils.newInstance
Date Fri, 16 Feb 2007 19:01:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473790 ]

Doug Cutting commented on HADOOP-1017:
--------------------------------------

I just realized that this cache is not thread-safe: access to the cache should be synchronized.
 I can see a few ways of doing this:

a. wrap 'synchronized (constructorCache) { ... }' around the newInstance method body

b. switch to using a synchronized map (Collections.synchronizedMap)

c. keep a separate constructorCache per thread, via a ThreadLocal

(a) is the simplest, but could result in more contention, since the cache stays locked while
constructors are created.  (b) would unlock the cache while constructors are created, but
might sometimes create a given constructor twice.  (c) would be fastest, but would always
create a separate constructor for each thread.  I'd probably opt for (b).  What do others think?
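
To make the comparison concrete, here is a rough sketch of the three options side by side. It is not the code from cacheCtor.patch: the class name ConstructorCacheSketch, the method names, and the shared lookupAndCreate helper are all made up for illustration, and it assumes the cached classes have a no-argument constructor.

```java
import java.lang.reflect.Constructor;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch only; names are illustrative, not taken from the patch.
final class ConstructorCacheSketch {

    // (a) plain map; callers hold the map's monitor for the whole lookup-and-create.
    private static final Map<Class<?>, Constructor<?>> LOCKED_CACHE =
        new WeakHashMap<Class<?>, Constructor<?>>();

    // (b) the wrapper synchronizes each get/put call individually.
    private static final Map<Class<?>, Constructor<?>> SYNC_CACHE =
        Collections.synchronizedMap(new WeakHashMap<Class<?>, Constructor<?>>());

    // (c) one private, unsynchronized map per thread.
    private static final ThreadLocal<Map<Class<?>, Constructor<?>>> LOCAL_CACHE =
        ThreadLocal.withInitial(() -> new WeakHashMap<Class<?>, Constructor<?>>());

    // (a): the cache stays locked while the constructor is looked up and used.
    static <T> T newInstanceLocked(Class<T> cls) {
        synchronized (LOCKED_CACHE) {
            return lookupAndCreate(LOCKED_CACHE, cls);
        }
    }

    // (b): no lock is held between get() and put(), so two threads that both
    // miss may each look up the same constructor; the later put() simply wins.
    static <T> T newInstanceSyncMap(Class<T> cls) {
        return lookupAndCreate(SYNC_CACHE, cls);
    }

    // (c): no locking at all, but every thread fills its own cache.
    static <T> T newInstanceThreadLocal(Class<T> cls) {
        return lookupAndCreate(LOCAL_CACHE.get(), cls);
    }

    // Shared check-then-create logic; thread safety depends on the map passed in.
    private static <T> T lookupAndCreate(Map<Class<?>, Constructor<?>> cache, Class<T> cls) {
        try {
            Constructor<?> ctor = cache.get(cls);
            if (ctor == null) {
                ctor = cls.getDeclaredConstructor();
                cache.put(cls, ctor);
            }
            return cls.cast(ctor.newInstance());
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder a = newInstanceLocked(StringBuilder.class);
        StringBuilder b = newInstanceSyncMap(StringBuilder.class);
        StringBuilder c = newInstanceThreadLocal(StringBuilder.class);
        System.out.println(a.getClass() + " " + b.getClass() + " " + c.getClass());
    }
}
```

In the (b) variant the duplicate lookup is harmless because either Constructor object works equally well; the cost is only the occasional repeated reflection call.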

> Optimization: Reduce Overhead from ReflectionUtils.newInstance
> --------------------------------------------------------------
>
>                 Key: HADOOP-1017
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1017
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ron Bodkin
>         Attachments: cacheCtor.patch, ReflectionUtils.patch.txt, TestReflectionUtils.java
>
>
> I found that a significant amount of time on my project was being spent in creating constructors
> for each row of data. I dramatically optimized this performance by creating a simple WeakHashMap
> to cache constructors by class. For example, in a sample job I find that ReflectionUtils.newInstance
> takes 200 ms (2% of total) with the cache enabled, but it uses 900 ms (6% of total) without
> the cache.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

