From: "Michael Busch (JIRA)"
To: java-dev@lucene.apache.org
Date: Wed, 27 Feb 2008 13:03:51 -0800 (PST)
Subject: [jira] Commented: (LUCENE-1195) Performance improvement for TermInfosReader

[ https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573065#action_12573065 ]

Michael Busch commented on LUCENE-1195:
---------------------------------------

{quote}
Unfortunately, it needs to be... no getting around it.
{quote}

You're right, and I'm stupid :)

Actually, what I meant was that the get() and put() methods don't need to be synchronized if the underlying data structure, i.e. the LinkedHashMap that I'm using, is thread-safe; otherwise it might return inconsistent data. But the LinkedHashMap is not thread-safe unless I decorate it with Collections.synchronizedMap().

Do you know what's faster: using the synchronized map, or making get() and put() synchronized? Probably there's not really a difference, because the decorator that Collections.synchronizedMap() returns does essentially the same thing?

> Performance improvement for TermInfosReader
> -------------------------------------------
>
>                 Key: LUCENE-1195
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1195
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: lucene-1195.patch
>
>
> Currently we have a bottleneck for multi-term queries: the dictionary lookup is being done
> twice for each term. The first time in Similarity.idf(), where searcher.docFreq() is called.
> The second time when the posting list is opened (TermDocs or TermPositions).
> The dictionary lookup is not cheap, which is why a significant performance improvement is
> possible here if we avoid the second lookup. An easy way to do this is to add a small LRU
> cache to TermInfosReader.
> I ran some performance experiments with an LRU cache size of 20 and a mid-size index of
> 500,000 documents from Wikipedia. Here are some test results:
> 50,000 AND queries with 3 terms each:
> old: 152 secs
> new (with LRU cache): 112 secs (26% faster)
> 50,000 OR queries with 3 terms each:
> old: 175 secs
> new (with LRU cache): 133 secs (24% faster)
> For bigger indexes this patch will probably have less impact; for smaller ones, more.
> I will attach a patch soon.
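
To make the two options in the comment concrete, here is a minimal sketch of a small LRU cache built on LinkedHashMap, once with synchronized get() and put() and once wrapped in Collections.synchronizedMap(). The names LruCacheSketch, SynchronizedLruCache, newLruMap and newSynchronizedMapCache are hypothetical and are not taken from lucene-1195.patch; only the cache size of 20 comes from the issue description. The sketch assumes an access-ordered map, in which case even get() reorders entries, so reads have to be guarded as well. It makes no claim about which variant is faster.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the code from the attached patch.
final class LruCacheSketch {

    // Builds an access-ordered LinkedHashMap that drops its least recently
    // used entry once it holds more than maxSize entries. Not thread-safe.
    static <K, V> Map<K, V> newLruMap(final int maxSize) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Option 1: keep the plain map private and synchronize get() and put().
    static final class SynchronizedLruCache<K, V> {
        private final Map<K, V> map;

        SynchronizedLruCache(int maxSize) {
            this.map = newLruMap(maxSize);
        }

        // In access order, get() moves the entry to the end of the list,
        // so it modifies the map and must hold the lock too.
        synchronized V get(K key) {
            return map.get(key);
        }

        synchronized void put(K key, V value) {
            map.put(key, value);
        }

        synchronized int size() {
            return map.size();
        }
    }

    // Option 2: let the Collections.synchronizedMap() decorator do the locking.
    static <K, V> Map<K, V> newSynchronizedMapCache(int maxSize) {
        return Collections.synchronizedMap(LruCacheSketch.<K, V>newLruMap(maxSize));
    }

    public static void main(String[] args) {
        // Cache size 20 as in the experiments described in the issue.
        SynchronizedLruCache<String, Integer> cache = new SynchronizedLruCache<String, Integer>(20);
        cache.put("lucene", 42);
        System.out.println(cache.get("lucene"));

        Map<String, Integer> wrapped = LruCacheSketch.<String, Integer>newSynchronizedMapCache(20);
        wrapped.put("lucene", 42);
        System.out.println(wrapped.get("lucene"));
    }
}
```

Both variants take a lock on every call: the decorator returned by Collections.synchronizedMap() synchronizes each method on an internal mutex, which matches the guess in the comment that the two approaches do essentially the same thing. Any actual difference would have to be measured.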