Message-ID: <1172730.1171652474656.JavaMail.jira@brutus>
Date: Fri, 16 Feb 2007 11:01:14 -0800 (PST)
From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1017) Optimization: Reduce Overhead from ReflectionUtils.newInstance
In-Reply-To: <8341615.1171393805599.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HADOOP-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473790 ]

Doug Cutting commented on HADOOP-1017:
--------------------------------------

I just realized that this cache is not thread-safe: access to the cache should be synchronized.  I can see a few ways of doing this:

a. wrap 'synchronized (constructorCache) { ... }' around the newInstance method body
b. switch to using a synchronized map (Collections.synchronizedMap)
c. keep a separate constructorCache per thread, via a threadLocal

(a) is the simplest, but could result in more contention, since the cache stays locked while constructors are created.  (b) would unlock the cache while constructors are created, but might sometimes create a given constructor twice.  (c) would be fastest, but would always create a new constructor per thread.

I'd probably opt for (b).  What do others think?

> Optimization: Reduce Overhead from ReflectionUtils.newInstance
> --------------------------------------------------------------
>
>                 Key: HADOOP-1017
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1017
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ron Bodkin
>         Attachments: cacheCtor.patch, ReflectionUtils.patch.txt, TestReflectionUtils.java
>
>
> I found that a significant amount of time on my project was being spent in creating constructors for each row of data. I dramatically optimized this performance by creating a simple WeakHashMap to cache constructors by class. For example, in a sample job I find that ReflectionUtils.newInstance takes 200 ms (2% of total) with the cache enabled, but it uses 900 ms (6% of total) without the cache.
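
For comparison, options (a), (b), and (c) from the comment above might look roughly like the following. This is only a sketch under assumptions: the class name CtorCacheSketch, the field names, and the three method names are hypothetical and are not taken from the attached patches; the only detail borrowed from the issue is that constructors are cached by class in a WeakHashMap. Reflection failures are wrapped in RuntimeException to keep the sketch short.

```java
import java.lang.reflect.Constructor;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch; all names here are illustrative, not the patch's code.
class CtorCacheSketch {
    // (a) plain WeakHashMap; every access happens inside synchronized (LOCKED)
    private static final Map<Class<?>, Constructor<?>> LOCKED = new WeakHashMap<>();

    // (b) the wrapper synchronizes each individual get/put call
    private static final Map<Class<?>, Constructor<?>> SYNC =
        Collections.synchronizedMap(new WeakHashMap<Class<?>, Constructor<?>>());

    // (c) one private cache per thread, so no locking is needed
    private static final ThreadLocal<Map<Class<?>, Constructor<?>>> PER_THREAD =
        ThreadLocal.withInitial(() -> new WeakHashMap<Class<?>, Constructor<?>>());

    // The reflective lookup that the cache is meant to avoid repeating.
    private static Constructor<?> createCtor(Class<?> cls) {
        try {
            return cls.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }

    private static <T> T instantiate(Constructor<?> ctor, Class<T> cls) {
        try {
            return cls.cast(ctor.newInstance());
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // (a) The whole method body holds the lock, including constructor creation.
    static <T> T newInstanceLocked(Class<T> cls) {
        synchronized (LOCKED) {
            Constructor<?> ctor = LOCKED.get(cls);
            if (ctor == null) {
                ctor = createCtor(cls);
                LOCKED.put(cls, ctor);
            }
            return instantiate(ctor, cls);
        }
    }

    // (b) The lock is released between get and put, so two threads may both
    // miss and each create a constructor for the same class; the later put wins.
    static <T> T newInstanceSyncMap(Class<T> cls) {
        Constructor<?> ctor = SYNC.get(cls);
        if (ctor == null) {
            ctor = createCtor(cls);
            SYNC.put(cls, ctor);
        }
        return instantiate(ctor, cls);
    }

    // (c) No contention, but each thread creates its own constructor per class.
    static <T> T newInstancePerThread(Class<T> cls) {
        Map<Class<?>, Constructor<?>> cache = PER_THREAD.get();
        Constructor<?> ctor = cache.get(cls);
        if (ctor == null) {
            ctor = createCtor(cls);
            cache.put(cls, ctor);
        }
        return instantiate(ctor, cls);
    }
}
```

All three variants return a fresh instance on every call; they differ only in how the cached Constructor is shared between threads, which is the trade-off the comment describes.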