Subject: Method to speed up caching for faceted navigation
Date: Wed, 26 Jul 2006 11:50:58 +0200
From: "Johan Stuyts"

Hi,

I am
working on faceted navigation. This is nothing new, but I am anticipating an index that changes very frequently (every couple of seconds). After the index has been updated, I need to cache the bit sets of the facet values so I can do counting during searches later on. Because I need to get a lot of bit sets often, this needs to be as fast as possible. I did the following:

    IndexReader ir = ...;
    TermDocs td = ir.termDocs(new Term("facet name", "facet value"));
    while (td.next()) {
      bitSet.set(td.doc());
    }

The problem with this code is that it gets the document IDs one by one. I tried to optimize the loop by reading blocks of IDs using 'read(int[], int[])', but this did not have a noticeable effect.

I looked at the implementation of 'read(int[], int[])' in 'SegmentTermDocs' and saw that it does the following things:
- check if the document has a frequency higher than 1, and if so read it;
- check if the document has been deleted, and if so don't add it to the result;
- store the document IDs, counts and frequencies in attributes instead of local variables.

The following preconditions hold in my situation:
- all documents have a frequency of 1 for the term;
- I never delete documents using the 'IndexReader' from which I get the 'TermDocs' object;
- I am only interested in the document IDs.

Given these preconditions, I made 'SegmentTermDocs' a public class and added the following method. This method eliminates the overhead in the 'read(int[], int[])' method:

    public void readDocsWithoutFreqsAssumingNoDeletions(final BitSet destination)
        throws IOException {
      int count = this.count;
      final int df = this.df;
      int doc = this.doc;
      while (count < df) {
        // Drop the low bit (the frequency flag) and add the document ID delta
        doc += freqStream.readVInt() >>> 1;
        count++;
        destination.set(doc);
      }
      // Leave a consistent state
      this.doc = doc;
      freq = 1;
      this.count = df;
    }

By using the method above I gained a speed improvement of over 20%. Will this method always work correctly given the preconditions?
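To make the first precondition concrete, here is a minimal, self-contained sketch of the decoding that the shortcut relies on. It is not Lucene code: the class 'PostingsSketch' and its methods are hypothetical names, and the byte layout is an assumption based on the documented frequency file format (each entry is a VInt holding the document ID delta shifted left by one, with the low bit set when the frequency is 1, and otherwise followed by a separate VInt with the frequency).

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;

// Hypothetical stand-alone model of the postings layout assumed above.
public class PostingsSketch {

    // VInt: 7 bits per byte, low-order bits first, high bit set on all but the last byte.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encode ascending document IDs with their frequencies.
    static byte[] encode(int[] sortedDocs, int[] freqs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int i = 0; i < sortedDocs.length; i++) {
            int delta = sortedDocs[i] - previous;
            if (freqs[i] == 1) {
                writeVInt(out, (delta << 1) | 1); // low bit set: frequency is 1
            } else {
                writeVInt(out, delta << 1);       // low bit clear: frequency follows
                writeVInt(out, freqs[i]);
            }
            previous = sortedDocs[i];
        }
        return out.toByteArray();
    }

    // The shortcut: treat every VInt as a document code and never read a frequency.
    static BitSet decodeAssumingFreqOne(byte[] data) {
        BitSet result = new BitSet();
        int pos = 0;
        int doc = 0;
        while (pos < data.length) {
            int b = data[pos++];
            int code = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = data[pos++];
                code |= (b & 0x7F) << shift;
            }
            doc += code >>> 1; // drop the frequency flag, add the delta
            result.set(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        // Precondition holds: every frequency is 1.
        byte[] ok = encode(new int[] {3, 4, 130, 1000}, new int[] {1, 1, 1, 1});
        System.out.println(decodeAssumingFreqOne(ok));     // prints {3, 4, 130, 1000}

        // Precondition violated: document 5 has frequency 2.
        byte[] broken = encode(new int[] {5}, new int[] {2});
        System.out.println(decodeAssumingFreqOne(broken)); // prints {5, 6}
    }
}
```

In this model the shortcut is exact as long as every frequency is 1. As soon as one document has a higher frequency, the frequency VInt is misread as a document code and a wrong bit is set, so under the assumed layout the first precondition is essential.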
Kind regards,

Johan Stuyts
Hippo