Hi Munavalli,

For (1), (2), and (3), it seems only proximity can solve the problem. For (4), Lucene already accounts for it through the coordination factor (coord) of a document, which favors documents that match more of the query terms.

In my opinion you are partially right about proximity search: proximity also takes the order of the terms into account.

On 7/14/05, Rajesh Munavalli wrote:
> What if my intention was to find all three words in a document not
> necessarily in one sentence? Here is my goal
>
> (1) All three words appearing together should be given Rank 1
> (2) Three words appearing somewhere in the sentence given Rank 2
> (3) Documents containing words in different sentences should be given
> Rank 3
> (4) Documents missing one or more of query terms should be given Rank 4
>
> Correct me if I am wrong... Proximity search is concerned about query
> terms appearing closer to one another within a certain distance in the
> document.
>
> Thanks,
>
> Rajesh Munavalli
>
> -----Original Message-----
> From: Chen Wei Zhu [mailto:moonshotter@gmail.com]
> Sent: Thursday, July 14, 2005 10:40 AM
> To: general@lucene.apache.org
> Subject: Re: n-gram and multiword query
>
> i remember lucene doesn't do anything for proximity.
>
> On 7/14/05, Rajesh Munavalli wrote:
> > Consider a document with the following contents "Levenshtein distance
> > is named after the Russian scientist Vladimir Levenshtein and is also
> > called edit distance"
> >
> > Possible bi-grams are (after removing the stop words in the beginning
> > and end) "Levenshtein distance", "named after", "Russian scientist",
> > "scientist Vladimir", "Vladimir Levenshtein", "called edit", "edit
> > distance"
> >
> > If my query term is "Vladimir levenshtein distance", how does Lucene
> > compute the similarity to the indexed terms? Are query terms appearing
> > together given more importance? How does it account for gaps (caused
> > by stop word removal) while matching multiword query?
> >
> > thanks,
> >
> > Rajesh Munavalli
>
> --
> Thanks!
> yours, WeiZhu Chen

--
Thanks!
yours, WeiZhu Chen
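
P.S. To make the four ranks in the quoted message concrete, here is a minimal sketch in plain Python. It does not use Lucene and is not how Lucene scores documents. The name `tier_rank` is hypothetical, sentences are split naively on ".", "!" and "?", matching is case-insensitive on whole words, and "appearing together" is assumed to mean adjacent and in query order.

```python
import re


def tier_rank(text, query_terms):
    """Return 1-4 following the rank tiers described in the thread.

    Assumptions (not from Lucene): naive sentence splitting, whole-word
    case-insensitive matching, and "together" meaning adjacent in query order.
    """
    terms = [t.lower() for t in query_terms]

    # Naive sentence split, then tokenize each sentence into lowercase words.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    all_tokens = {tok for sent in tokenized for tok in sent}

    # Rank 4: at least one query term does not occur anywhere in the document.
    if not all(t in all_tokens for t in terms):
        return 4

    # Rank 1: the terms occur as one contiguous run inside a sentence.
    n = len(terms)
    for sent in tokenized:
        for i in range(len(sent) - n + 1):
            if sent[i:i + n] == terms:
                return 1

    # Rank 2: every term occurs somewhere within a single sentence.
    for sent in tokenized:
        if all(t in sent for t in terms):
            return 2

    # Rank 3: every term occurs, but never all in the same sentence.
    return 3


doc = ("Levenshtein distance is named after the Russian scientist "
       "Vladimir Levenshtein and is also called edit distance")
print(tier_rank(doc, ["Vladimir", "levenshtein", "distance"]))
```

For the example document and the query "Vladimir levenshtein distance" this prints 2: all three words are in the one sentence, but "Vladimir Levenshtein" is followed by "and", not "distance", so they never appear as one adjacent run.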