From: "Christoph Kiehl"
To: lucene-dev@jakarta.apache.org
Subject: Re: [PATCH] Refactoring QueryParser.jj, setLowercaseWildcardTerms()
Date: Thu, 13 Feb 2003 13:35:44 +0100

maurits van wijland wrote:

> Hi all,
>
> Maybe we should start using stemming in a different manner. Look at
> it from the perspective of query expansion. In case we store stems in
> a different table, we will not have this problem!
>
> So, each token is stored in the index as a term.
> Each term is stemmed with the appropriate stemmer.
> Store each stem and unstemmed term in a separate index.
>
> We could then search using the terms entered, and first find all the
> terms that match the WildcardQuery. Next, you could use the terms
> found, and then stem them.
> From there, you retrieve all the terms related to that stem!
> Finally, search for documents with all terms retrieved.

This might be an idea. But it would slow down everything by a factor of 3,
if I understand you correctly. This problem is more complicated than I
first thought.

Hm, are we really the first people on earth facing this problem? There must
be a common way of solving this ;)
I tried to find out how Google handles wildcards, but they seem to be ignored.

Thoughtfully
Christoph
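
For illustration only, the expansion steps quoted above (match the wildcard against the stored terms, stem each match, collect every term sharing that stem) might look roughly like the sketch below. It uses plain Java collections rather than the Lucene API. The class name StemExpansionSketch, its methods, and the suffix-stripping stem() function are hypothetical stand-ins, not anything proposed in this thread or present in Lucene.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names exist in Lucene.
final class StemExpansionSketch {
    // All unstemmed terms, as they would appear in the main index.
    private final SortedSet<String> terms = new TreeSet<>();
    // Separate lookup: stem -> every unstemmed term that reduces to it.
    private final Map<String, SortedSet<String>> termsByStem = new HashMap<>();

    // Toy stemmer (assumption): strips a few English suffixes.
    // A real setup would plug in the stemmer for the field's language.
    static String stem(String term) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (term.endsWith(suffix) && term.length() > suffix.length() + 2) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Store the term and register it under its stem.
    void addTerm(String term) {
        terms.add(term);
        termsByStem.computeIfAbsent(stem(term), k -> new TreeSet<>()).add(term);
    }

    // Translate '*' (any run of characters) and '?' (one character) to a regex.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Step 1: find terms matching the wildcard.
    // Step 2: stem each match.
    // Step 3: collect all terms that share one of those stems.
    SortedSet<String> expand(String wildcard) {
        Pattern pattern = toPattern(wildcard);
        SortedSet<String> expanded = new TreeSet<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                // Every stored term is registered under its own stem in addTerm().
                expanded.addAll(termsByStem.get(stem(term)));
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        StemExpansionSketch index = new StemExpansionSketch();
        for (String t : new String[] {"walk", "walks", "walked", "walking", "wall", "talking"}) {
            index.addTerm(t);
        }
        // The final search would then OR together the expanded terms.
        System.out.println(index.expand("walki*"));
    }
}
```

The sketch scans every stored term for each wildcard and keeps a second structure keyed by stem, which is the extra lookup work the reply above is worried about.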