Date: Wed, 24 May 2006 12:08:21 +0100 (BST)
From: mark harwood
Subject: RE: Can I do "Google Suggest" Like Search? - - - from - - -vikas
To: java-user@lucene.apache.org

>> What will happen if I send PrefixQuery

A search returns a list of docs, but you want a list of words, which is
why I suggested using the IndexReader "terms" APIs that PrefixQuery uses
internally.

If you are not in a position to try the more complex solution I outlined
earlier (which bases suggestions on multiple terms in the query string)
then you *could* use PrefixQuery as follows:

    import java.util.HashSet;
    import java.util.Iterator;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.PrefixQuery;

    IndexReader reader = IndexReader.open("/indexes/nasa");
    String incompleteWord = "Am";
    // Lowercase the prefix, assuming the field was indexed with a lowercasing analyzer.
    PrefixQuery pq = new PrefixQuery(new Term("contents", incompleteWord.toLowerCase()));
    HashSet suggestedTerms = new HashSet();
    // rewrite() expands the prefix into one clause per matching index term.
    pq.rewrite(reader).extractTerms(suggestedTerms);
    for (Iterator iter = suggestedTerms.iterator(); iter.hasNext();) {
        Term term = (Term) iter.next();
        System.out.println(term.text());
    }
    reader.close();

Using this technique on a completely free-text field (i.e. something
other than the "country" field in your example) will probably make
poorly informed suggestions, and I would refer you to my previous post
for a better solution.

Cheers
Mark

--- Vikas Khengare wrote:

> Hi Mark
>
> You are right; I want suggestions from doc content only, not general
> words. What will happen if I send a PrefixQuery on each character the
> user types? I would then get the results using AJAX [the number of
> hits to show the user is not a problem]. So when the user types "a",
> on keyup I will send a query through AJAX to the search engine with a
> PrefixQuery and get the results.
>
> e.g.
> Field("Country","America")
> Field("Country","Africa")
> Field("Country","Argentina")
>
> So if I search in "Country" for "a*" it will return all values
> starting with "a", so I will get the results I want.
>
> Is this right?
>
> Or what is another way to do it?
>
> -----Original Message-----
> From: mark harwood [mailto:markharw00d@yahoo.co.uk]
> Sent: Wednesday, May 24, 2006 3:37 PM
> To: java-user@lucene.apache.org
> Subject: Re: Can I do "Google Suggest" Like Search? - - - from - - -vikas
>
> Tips:
>
> 1) Don't send to 3 mail lists when 1 will do. Please continue this
> conversation on java-user only.
>
> 2) Most "suggest" tools work off an index of previous searches (not
> documents). Do you have a large set of searches? If not, making
> sensible suggestions based on document content can be much more
> compute intensive. My assumption here is you are having to work with
> doc content.
>
> 3) You don't need to go to the expense of running a query and ranking
> and scoring documents. Look at the lower-level APIs terms() and
> termDocs() and use them to find the matching terms.
>
> 4) Word suggestions ideally shouldn't be independent of each other.
> Look at completed words in the query string and use them to inform
> the selection of suggestions for the incomplete term being typed. The
> termDocs()/termPositions() APIs give you all the data you need to
> establish what docs/positions exist for completed terms, and these
> can be cross-referenced with the list of docs/positions for the
> "alternative" terms under consideration. A high proximity between
> completed term occurrences and a suggested term's occurrences makes a
> strong candidate.
> A fast way to do proximity tests might be to compare sorted arrays of
> numbers, where each number represents a term occurrence, using a
> function like:
>
>   termspaceNumber = [DocNumber * maxNumTermsPerDoc] + termPositionInDoc
>
> You could then compare long[] completedTermOccurences with
> long[] suggestedAlternativeTermOccurences, looking for matches where
> the numbers differ by 1 or 2.
>
> A faster (rougher) comparison which ignores word proximity would be
> just to compare bitsets of doc ids, looking for high levels of
> overlap (intersection/union).
>
> You can use TermEnum.docFreq() to quickly rule out very rare words
> from your calculations.
>
> Cheers,
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
> with best regards
> from
> vikas r. khengare
> Veritas Software India Private Ltd.
> Symantec Corporation
> Pune, India
>
> [ Enjoy your life today.... because
=== message truncated ===
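
The proximity test and the rougher doc-id overlap described in the quoted
tips could be sketched in plain Java roughly as follows. This is only an
illustration under stated assumptions: the class SuggestionProximity and
its method names are hypothetical and not part of Lucene; the occurrence
arrays are assumed to have been collected already (for example via
termPositions()) and sorted ascending; and a "match" is assumed to mean
that the suggested term follows the completed term by 1 to maxGap
positions.

```java
import java.util.BitSet;

/** Hypothetical helper for ranking suggestions; not part of Lucene. */
class SuggestionProximity {

    /** Maps a (document, position) pair to one number in "term space". */
    static long termspaceNumber(int docNumber, int maxNumTermsPerDoc, int termPositionInDoc) {
        return (long) docNumber * maxNumTermsPerDoc + termPositionInDoc;
    }

    /**
     * Counts candidate occurrences that come 1..maxGap positions after some
     * completed-term occurrence. Both arrays must be sorted ascending.
     * Caveat: positions must stay well below maxNumTermsPerDoc, otherwise the
     * end of one document can appear adjacent to the start of the next.
     */
    static int countNearbyOccurrences(long[] completed, long[] candidate, int maxGap) {
        int hits = 0;
        int i = 0;
        for (long c : candidate) {
            // Skip completed occurrences that are too far before this candidate.
            while (i < completed.length && completed[i] < c - maxGap) {
                i++;
            }
            // completed[i] is now >= c - maxGap; it matches if it precedes c.
            if (i < completed.length && completed[i] < c) {
                hits++;
            }
        }
        return hits;
    }

    /** Rougher test: overlap of doc-id sets as intersection / union. */
    static double docOverlap(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionSize = union.cardinality();
        return unionSize == 0 ? 0.0 : (double) intersection.cardinality() / unionSize;
    }

    public static void main(String[] args) {
        int maxTerms = 1000; // assumed upper bound on term positions per document
        // Completed term occurs at doc 0 position 4 and doc 2 position 10.
        long[] completed = {
            termspaceNumber(0, maxTerms, 4), termspaceNumber(2, maxTerms, 10)
        };
        // Candidate term occurs at doc 0 position 5 and doc 3 position 7.
        long[] candidate = {
            termspaceNumber(0, maxTerms, 5), termspaceNumber(3, maxTerms, 7)
        };
        System.out.println("nearby occurrences: "
                + countNearbyOccurrences(completed, candidate, 2));

        BitSet completedDocs = new BitSet();
        completedDocs.set(0);
        completedDocs.set(2);
        BitSet candidateDocs = new BitSet();
        candidateDocs.set(0);
        candidateDocs.set(3);
        System.out.println("doc overlap: " + docOverlap(completedDocs, candidateDocs));
    }
}
```

A candidate with a higher count (or a higher overlap ratio) would rank
above one with a lower value. Because both arrays are sorted, the
proximity count needs only a single pass over each of them.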