Message-ID: <30882588.1191418790825.JavaMail.jira@brutus>
Date: Wed, 3 Oct 2007 06:39:50 -0700 (PDT)
From: "Grant Ingersoll (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1017) BoostingTermQuery performance
In-Reply-To: <4909739.1191369950741.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/LUCENE-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532107 ]

Grant Ingersoll commented on LUCENE-1017:
-----------------------------------------

You will have to look at what
setFreqCurrentDoc() does. I have a feeling, though, that there really isn't any way around what the current version does, and that the performance difference is due to it not checking all the positions on a document. At any rate, the Span stuff needs more scrutiny performance-wise, so it is worth another look. There should be a unit test in the code that checks multiple payloads per document, etc. Have a look at that and try it out.

> BoostingTermQuery performance
> -----------------------------
>
>                 Key: LUCENE-1017
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1017
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.2
>         Environment: all
>            Reporter: Peter Keegan
>         Attachments: BoostingTermQuery.java, termquery.patch
>
>
> I have been experimenting with payloads and BoostingTermQuery, which I think are excellent additions to Lucene core. Currently, BoostingTermQuery extends SpanQuery. I would suggest changing this class to extend TermQuery and refactoring the current version into something like 'BoostingSpanQuery'.
> The reason is rooted in performance. In my testing, I compared query throughput using TermQuery against 2 versions of BoostingTermQuery - the current one that extends SpanQuery and one that extends TermQuery (which I've included below). Here are the results (qps = queries per second):
> TermQuery: 200 qps
> BoostingTermQuery (extends SpanQuery): 97 qps
> BoostingTermQuery (extends TermQuery): 130 qps
> Here is a version of BoostingTermQuery that extends TermQuery. I had to modify TermQuery and TermScorer to make them public. A code review would be in order, and I would appreciate your comments on this suggestion.
> Peter

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
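The comment above suggests that the TermQuery-based version may be faster because it does not check every position of the term within a document. The following is only a toy sketch of why that could also change the score. It does not use any Lucene classes, and it assumes (this thread does not confirm it) that the payload contribution is an average over the positions visited. The class name, the method names, and the sample boost values are all hypothetical.

```java
// Toy model only: no Lucene classes are used, and every name here is hypothetical.
class PayloadPositionSketch {

    // Visits every position of the term in one document and averages the
    // payload boosts. ASSUMPTION: averaging stands in for whatever the real
    // scorer does per position; the cost grows with the number of positions.
    static float averageOverAllPositions(float[] payloadBoosts) {
        if (payloadBoosts.length == 0) {
            return 1.0f; // no payloads seen: neutral boost
        }
        float sum = 0f;
        for (float boost : payloadBoosts) {
            sum += boost;
        }
        return sum / payloadBoosts.length;
    }

    // Looks only at the first position: constant work per document, but
    // payloads on later positions are ignored.
    static float firstPositionOnly(float[] payloadBoosts) {
        return payloadBoosts.length == 0 ? 1.0f : payloadBoosts[0];
    }

    public static void main(String[] args) {
        // One document in which the term occurs three times, each with a payload boost.
        float[] boosts = {2.0f, 4.0f, 6.0f};
        System.out.println(averageOverAllPositions(boosts)); // prints 4.0
        System.out.println(firstPositionOnly(boosts));       // prints 2.0
    }
}
```

If the two approaches diverge like this on documents with several payloads, the existing unit test for multiple payloads per document mentioned above would be the place to see it.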