lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: scoring adjacent terms without proximity search
Date Fri, 30 Oct 2009 15:28:56 GMT

On Oct 30, 2009, at 5:49 AM, Joel Halbert wrote:

> Hi,
> Without using a proximity search, i.e. "cheese sandwich"~5,
> what's the best way of up-scoring results in which the search terms
> are closer to each other?

I'm not aware of any query technique to score based on proximity that  
doesn't, itself, use proximity information.

I suppose you could precompute the proximity associations by indexing  
word n-grams (Lucene calls them shingles), such that there is a single  
token in your index containing cheese_sandwich.  Documents in which the  
two words are adjacent would then match on that token as well.
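
To make the shingle idea concrete, here is a minimal sketch in plain Java.  
It does not use Lucene (Lucene has a ShingleFilter for this); the class  
ShingleSketch, its methods, and the match count used as a stand-in for  
scoring are all hypothetical, and the underscore separator simply follows  
the cheese_sandwich example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: not Lucene's analyzer or scoring.
public class ShingleSketch {

    // Lowercase and split on non-alphanumerics (a stand-in for an analyzer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Emit every single word plus every adjacent word pair joined by '_'.
    static List<String> shingles(String text) {
        List<String> words = tokenize(text);
        List<String> out = new ArrayList<>(words);
        for (int i = 0; i + 1 < words.size(); i++) {
            out.add(words.get(i) + "_" + words.get(i + 1));
        }
        return out;
    }

    // Count how many query tokens (words and shingles) occur in the document.
    // A crude proxy for score: more matching tokens means a higher rank.
    static int matches(String query, String doc) {
        List<String> docTokens = shingles(doc);
        int n = 0;
        for (String q : shingles(query)) {
            if (docTokens.contains(q)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String query = "cheese sandwich";
        System.out.println(matches(query, "Toasted Cheese Sandwich"));
        System.out.println(matches(query, "Cheese and Potato, Tuna sandwich"));
    }
}
```

With two-word shingles, the first document matches cheese, sandwich and  
cheese_sandwich, while the second matches only the two single words.  
Note that this rewards adjacency only; terms that are merely near each  
other get no extra credit unless you index longer shingles.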

BTW, what's your concern about using a Phrase Query?  What requirement  
do you have that would prevent that particular query?  Or is there  
something in the way it is implemented that doesn't work for your  
needs (assuming your example here is for discussion purposes)?

> E.g. so if I search for:
> content:cheese  content:sandwich
> How do you ensure that a document with content:
> "Toasted Cheese Sandwich"
> scores higher than:
> "Cheese and Potato, Tuna sandwich"
> Joel

Grant Ingersoll
