From: Mathieu Lecarme
Subject: Re: How to implement AJAX search~Lucene Search part?
Date: Sat, 9 Jun 2007 17:08:23 +0200
To: java-user@lucene.apache.org

You can approach this the way Lucene's spell checker does: build a
dedicated index in which each word is a Document, boosted by something
proportional to its number of occurrences (with a log and some math magic).

The magic is n fields holding the word's leading n-grams, not stored and
not tokenized.
For example, if you want to index the word "carott", you will index
the fields carott, carot, caro, car, ca, c.

With this huge index, you can search quickly, ordered by whatever you want.

Two improvements, to reduce index size: you can limit the number of
letters (min and max), and pick out the right matches from the small
result set after a request.

What about bad words?

This index could be extended to suggest two-word phrases, as Google
and co. do.

M.

On 9 June 2007, at 00:09, Chris Lu wrote:

> Thanks to all who answered with their experience and insights!
>
> LUCENE-625 is very interesting, but I am not sure about its scalability.
> "Begin completion only with 3 letters or more" is reasonable for
> special cases, but not ideal. What I want to implement is fairly
> general software.
>
> WildcardTermEnum seems closest to what I planned: searching an existing
> Lucene index, possibly a pretty large one. I can use it to list, say,
> the top 10 matching terms, and then use another search to find all
> matching docs. That is actually 2 searches.
>
> Sounds pretty good?
>
> --
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes:
> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>
>
> On 6/8/07, Erick Erickson wrote:
>> You can get the information pretty quickly by using a
>> WildcardTermEnum (NOT a query), especially if you
>> terminate after some number of characters....
>>
>> Erick
>>
>> On 6/7/07, Chris Lu wrote:
>> >
>> > Hi,
>> >
>> > I would like to implement an AJAX search. Basically, when the user
>> > types in several characters, I will search the Lucene index and find
>> > all possible matching items.
>> >
>> > It seems I need to use a wildcard query like "test*" to match anything.
>> > Is this the only way to do it? It doesn't seem very efficient,
>> > especially when you have typed in just the first character.
>> >
>> > I guess the "good" way is to go through the terms, and return as soon
>> > as, for example, 10 terms are found.
>> >
>> > I am wondering, is there anything like this already built?
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
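
As a rough illustration of the prefix-field index suggested in the reply at the top of this thread, here is a minimal sketch in plain Java. It does not use Lucene at all: a HashMap stands in for the index, and the class name PrefixSuggester, its method names, and the boost formula 1 + ln(occurrences) are assumptions made for the example (the reply only says "log and math magic"). It shows the leading n-grams generated for a word, the min/max length limit, and the post-filtering of the small candidate set when the typed text is longer than the longest indexed prefix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the prefix-field index; not Lucene code.
final class PrefixSuggester {
    private final int minLen; // shortest prefix that gets indexed
    private final int maxLen; // longest prefix that gets indexed
    private final Map<String, List<String>> wordsByPrefix = new HashMap<>();
    private final Map<String, Double> boostByWord = new HashMap<>();

    PrefixSuggester(int minLen, int maxLen) {
        if (minLen < 1 || maxLen < minLen) {
            throw new IllegalArgumentException("need 1 <= minLen <= maxLen");
        }
        this.minLen = minLen;
        this.maxLen = maxLen;
    }

    // Leading n-grams of the word, longest first, with lengths in [minLen, maxLen].
    static List<String> prefixes(String word, int minLen, int maxLen) {
        List<String> out = new ArrayList<>();
        for (int n = Math.min(word.length(), maxLen); n >= Math.max(1, minLen); n--) {
            out.add(word.substring(0, n));
        }
        return out;
    }

    // Index one word under each of its prefixes.
    void add(String word, int occurrences) {
        // Assumed boost: 1 + ln(count). The original message does not give a formula.
        double boost = 1.0 + Math.log(Math.max(1, occurrences));
        if (boostByWord.put(word, boost) != null) {
            return; // word was already indexed; only its boost is refreshed
        }
        for (String p : prefixes(word, minLen, maxLen)) {
            wordsByPrefix.computeIfAbsent(p, k -> new ArrayList<>()).add(word);
        }
    }

    // Up to 'limit' words starting with 'typed', highest boost first.
    List<String> suggest(String typed, int limit) {
        if (typed.length() < minLen) {
            return List.of(); // prefixes this short are not indexed
        }
        // Longer inputs are looked up by their truncated prefix, then filtered.
        String key = typed.length() > maxLen ? typed.substring(0, maxLen) : typed;
        List<String> hits = new ArrayList<>();
        for (String w : wordsByPrefix.getOrDefault(key, List.of())) {
            if (w.startsWith(typed)) {
                hits.add(w);
            }
        }
        hits.sort(Comparator.comparingDouble((String w) -> boostByWord.get(w)).reversed());
        return new ArrayList<>(hits.subList(0, Math.min(limit, hits.size())));
    }

    public static void main(String[] args) {
        PrefixSuggester s = new PrefixSuggester(1, 3);
        s.add("car", 50);
        s.add("carott", 10);
        s.add("card", 3);
        s.add("cat", 20);
        System.out.println(s.suggest("ca", 10));   // [car, cat, carott, card]
        System.out.println(s.suggest("caro", 10)); // [carott]
    }
}
```

In a real Lucene index the map lookup would presumably become an exact-match query on the untokenized prefix field, with the boost set at indexing time; the trade-off is the same either way, a larger index in exchange for a cheap lookup per keystroke.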