lucene-general mailing list archives

From Chen Wei Zhu <moonshot...@gmail.com>
Subject Re: n-gram and multiword query
Date Thu, 14 Jul 2005 15:39:30 GMT
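As far as I remember, Lucene's default scoring doesn't do anything for proximity: each query term is matched independently unless you use a phrase query.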
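
To make the question concrete, here is a minimal sketch in plain Python (not Lucene code, and not how Lucene scores anything) that rebuilds the bi-grams from the example quoted below and checks which bi-grams of the query are found in the index. The stop word list and the rule that a stop word breaks the bi-gram chain are assumptions, chosen only because they reproduce the seven bi-grams listed in the quoted message; the names bigrams, DOC and QUERY are made up for the illustration.

```python
import re

# Assumed stop word list: just enough to reproduce the quoted example.
STOP_WORDS = {"is", "the", "and", "also"}

def bigrams(text):
    """Return bi-grams of adjacent non-stop words; a stop word breaks the chain."""
    tokens = re.findall(r"[a-z]+", text.lower())
    result = []
    prev = None
    for tok in tokens:
        if tok in STOP_WORDS:
            prev = None          # the gap left by a stop word is not bridged
            continue
        if prev is not None:
            result.append(prev + " " + tok)
        prev = tok
    return result

DOC = ("Levenshtein distance is named after the Russian scientist Vladimir "
       "Levenshtein and is also called edit distance")
QUERY = "Vladimir levenshtein distance"

doc_bigrams = bigrams(DOC)
query_bigrams = bigrams(QUERY)

# Bag-of-terms matching: a query bi-gram matches if it occurs anywhere in the document.
matched = [bg for bg in query_bigrams if bg in set(doc_bigrams)]

# Where each matched bi-gram sits in the document's bi-gram sequence.
positions = {bg: doc_bigrams.index(bg) for bg in matched}

print(doc_bigrams)
print(matched)
print(positions)
```

Both query bi-grams match, but "levenshtein distance" is the first bi-gram in the document and "vladimir levenshtein" is the fifth, so the three query words never appear together. Matching that ignores positions cannot tell this apart from a document containing the exact phrase.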

On 7/14/05, Rajesh Munavalli <rajeshm@dessci.com> wrote:
> Consider a document with the following contents
> "Levenshtein distance is named after the Russian scientist Vladimir
> Levenshtein and is also called edit distance"
> 
> Possible bi-grams are (after removing the stop words in the beginning
> and end)
> "Levenshtein distance", "named after", "Russian scientist", "scientist
> Vladimir", "Vladimir Levenshtein", "called edit", "edit distance"
> 
> If my query term is "Vladimir levenshtein distance", how does Lucene
> compute the similarity to the indexed terms? Are query terms appearing
> together given more importance? How does it account for gaps (caused by
> stop word removal) when matching a multiword query?
> 
> thanks,
> 
> Rajesh Munavalli
> 
> 


-- 
Thanks!
yours, WeiZhu Chen
