From: "Yonik Seeley (JIRA)"
To: java-dev@lucene.apache.org
Date: Tue, 24 Oct 2006 07:26:17 -0700 (PDT)
Subject: [jira] Commented: (LUCENE-693) ConjunctionScorer - more tuneup
Message-ID: <26971239.1161699977676.JavaMail.root@brutus>
In-Reply-To: <7545754.1161643096551.JavaMail.root@brutus>

    [ http://issues.apache.org/jira/browse/LUCENE-693?page=comments#action_12444320 ]

Yonik Seeley commented on LUCENE-693:
-------------------------------------

Ah, I see the problem...
in the constructor I have
    boolean more = scorers[i].next();
for each scorer... but note that the local "more" is masking the member "more". Doh!

You can just remove "boolean" from "boolean more" in the ConjunctionScorer constructor, and in the meantime I'll try to see why this was never reproduced by any test cases.

> ConjunctionScorer - more tuneup
> -------------------------------
>
>                 Key: LUCENE-693
>                 URL: http://issues.apache.org/jira/browse/LUCENE-693
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.1
>         Environment: Windows Server 2003 x64, Java 1.6, pretty large index
>            Reporter: Peter Keegan
>         Attachments: conjunction.patch
>
>
> (See also: #LUCENE-443)
> I did some profile testing with the new ConjunctionScorer in 2.1 and discovered a new bottleneck in ConjunctionScorer.sortScorers. The java.util.Arrays.sort method is cloning the Scorers array on every sort, which is quite expensive on large indexes because of the size of the 'norms' array within, and isn't necessary.
> Here is one possible solution:
>   private void sortScorers() {
>     // squeeze the array down for the sort
>     //    if (length != scorers.length) {
>     //      Scorer[] temps = new Scorer[length];
>     //      System.arraycopy(scorers, 0, temps, 0, length);
>     //      scorers = temps;
>     //    }
>     insertionSort(scorers, length);
>     // note that this comparator is not consistent with equals!
>     //    Arrays.sort(scorers, new Comparator() {         // sort the array
>     //        public int compare(Object o1, Object o2) {
>     //          return ((Scorer)o1).doc() - ((Scorer)o2).doc();
>     //        }
>     //      });
>
>     first = 0;
>     last = length - 1;
>   }
>   private void insertionSort(Scorer[] scores, int len)
>   {
>     for (int i = 0; i < len; i++) {
>       for (int j = i; j > 0 && scores[j-1].doc() > scores[j].doc(); j--) {
>         swap(scores, j, j-1);
>       }
>     }
>     return;
>   }
>   private void swap(Object[] x, int a, int b) {
>     Object t = x[a];
>     x[a] = x[b];
>     x[b] = t;
>   }
>
> The squeezing of the array is no longer needed.
> We also initialized the Scorers array to 8 (instead of 2) to avoid having to grow the array for common queries, although this probably has less performance impact.
> This change added about 3% to query throughput in my testing.
> Peter
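
For anyone following along, the masking described in the comment can be reproduced in isolation. The sketch below is not the Lucene source: the Advancer interface and the Buggy and Fixed classes are hypothetical stand-ins, and only the "boolean more = scorers[i].next();" line is taken from the comment. It assumes that next() returns true when a scorer has a document available.

```java
public class ShadowingDemo {
    /** Hypothetical stand-in for a scorer; next() returns true if a document is available. */
    interface Advancer {
        boolean next();
    }

    /** Mirrors the bug: the local declaration hides the member of the same name. */
    static class Buggy {
        boolean more; // member, defaults to false

        Buggy(Advancer[] scorers) {
            for (int i = 0; i < scorers.length; i++) {
                // Declares a new local "more"; the member is never assigned.
                boolean more = scorers[i].next();
            }
        }
    }

    /** Same constructor with "boolean" removed, so the member is assigned. */
    static class Fixed {
        boolean more;

        Fixed(Advancer[] scorers) {
            for (int i = 0; i < scorers.length; i++) {
                more = scorers[i].next();
            }
        }
    }

    public static void main(String[] args) {
        Advancer[] scorers = { () -> true, () -> true };
        System.out.println(new Buggy(scorers).more); // prints false
        System.out.println(new Fixed(scorers).more); // prints true
    }
}
```

The compiler accepts both versions without complaint, which may be part of why a slip like this goes unnoticed until something reads the member.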