From: "Aigner, Thomas"
Subject: RE: Optimization
Date: Fri, 7 Oct 2005 09:17:04 -0400

Thanks Erik, I tried the reverse index and it worked like a charm.
While I was doing this, we figured out a way to handle "contains" (within-term) searches and wildcard searches at the beginning of a term. I thought I would share it with the community (and realized it handles the reverse index as well).

Word: ABCDEFG
Tokens created:

> Have a question. Are there any obvious things that can be done
> to help speed up query lookups, especially wildcard searches (i.e.
> *lamps)?

Obvious? Sort of. *lamps needs to scan through _every_ single term in the index (for the specified field only, of course) because terms are lexicographically ordered.

If you reverse terms during analysis and lay them in the same position (increment 0) as the original token, you'd end up with "spmal..." terms. Now pre-process the query string, and if there is a prefixed wildcard query, reverse it so that "*lamps" turns into "spmal*", and you will likely achieve a dramatic speed-up.

This is just one technique for dealing with prefixed wildcard queries. There is more fun to be had with queries like *lamps*. A technique I learned from the book Managing Gigabytes is to rotate terms through all their possible variations and index all of those, which also requires cleverness on the querying side of things.

Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
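
For anyone reading this thread in the archive, here is a rough sketch of the two ideas Erik describes: reversed terms for queries like *lamps, and rotated terms for queries like *lamps*. It is plain Java and does not use any Lucene API. A sorted TreeSet/TreeMap stands in for the lexicographically ordered term dictionary. The class name, the method names, and the '$' end-of-term marker are all made up for illustration, and the marker is assumed never to occur inside a real term. Treat it as one possible reading of the technique, not as how Lucene or the book implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: sorted collections play the role of the term dictionary.
class WildcardTermSketch {

    // Assumed end-of-term marker; must not appear inside any indexed term.
    static final char END = '$';

    // "lamps" -> "spmal"
    static String reverse(String term) {
        return new StringBuilder(term).reverse().toString();
    }

    // Index side: store every term in reversed form, kept in sorted order.
    static TreeSet<String> buildReversedTerms(List<String> terms) {
        TreeSet<String> reversed = new TreeSet<>();
        for (String t : terms) {
            reversed.add(reverse(t));
        }
        return reversed;
    }

    // Query side: "*lamps" becomes the prefix "spmal" over the reversed terms.
    // tailSet() seeks to the first candidate, so only matching terms are visited.
    static List<String> leadingWildcard(TreeSet<String> reversedTerms, String query) {
        if (!query.startsWith("*")) {
            throw new IllegalArgumentException("expected a query of the form *abc");
        }
        String prefix = reverse(query.substring(1));
        List<String> matches = new ArrayList<>();
        for (String r : reversedTerms.tailSet(prefix)) {
            if (!r.startsWith(prefix)) {
                break; // sorted order: nothing after this can match
            }
            matches.add(reverse(r));
        }
        return matches;
    }

    // Index side: store every rotation of term + END, mapped back to the term.
    // "lamps" -> "lamps$", "amps$l", "mps$la", "ps$lam", "s$lamp", "$lamps"
    static TreeMap<String, String> buildRotations(List<String> terms) {
        TreeMap<String, String> rotations = new TreeMap<>();
        for (String t : terms) {
            String marked = t + END;
            for (int i = 0; i < marked.length(); i++) {
                rotations.put(marked.substring(i) + marked.substring(0, i), t);
            }
        }
        return rotations;
    }

    // Query side: "*lamps*" becomes the prefix "lamps" over the rotated terms.
    static List<String> containsWildcard(TreeMap<String, String> rotations, String query) {
        if (query.length() < 3 || !query.startsWith("*") || !query.endsWith("*")) {
            throw new IllegalArgumentException("expected a query of the form *abc*");
        }
        String infix = query.substring(1, query.length() - 1);
        // A term may contain the infix more than once, so collect into a set.
        TreeSet<String> matches = new TreeSet<>();
        for (String rotation : rotations.tailMap(infix).keySet()) {
            if (!rotation.startsWith(infix)) {
                break;
            }
            matches.add(rotations.get(rotation));
        }
        return new ArrayList<>(matches);
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lamps", "clamps", "lamp", "lampshade", "stamps");
        System.out.println(leadingWildcard(buildReversedTerms(terms), "*lamps"));
        System.out.println(containsWildcard(buildRotations(terms), "*lamps*"));
    }
}
```

The trade-off is index size: reversing adds one extra term per token, while rotating adds one extra term per character of every token, so the rotated form is only worth it if *abc* style queries really matter.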