lucene-solr-user mailing list archives

From Chris Hostetter
Subject Re: getting answers starting with a requested string first
Date Tue, 27 Sep 2011 22:20:42 GMT

: 1) giving NAME_ANALYZED a type where omitNorms=false: I thought this would
: give answers with shorter NAME_ANALYZED field a higher score. I've tested
: that solution, but it's not working. I guess this is because there is no
: score for fq parameter (all my answers have same score)

both of those statements are correct.  omitNorms=false will cause length 
normalization to apply, so with the default similarity, shorter field 
values will generally score higher, but norms are very coarse, so it 
won't be very precise; and "fq" queries filter the results, 
but do not affect the score.
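
To make the "fq" point concrete, here is a minimal sketch of the two request shapes. It assumes the original request used a match-all main query with the name clause only in "fq" (the thread does not show the actual request), and the helper names are made up for illustration. The user text is not escaped here.

```python
from urllib.parse import urlencode

def filter_only_params(user_text):
    # Assumed setup: match-all main query, name clause only in fq.
    # fq restricts the result set but contributes nothing to the score,
    # so every hit ends up with the same score.
    return {"q": "*:*", "fq": f"NAME_ANALYZED:({user_text})"}

def scored_params(user_text):
    # Same clause moved into q, so it takes part in scoring
    # (including length normalization when norms are enabled).
    return {"q": f"NAME_ANALYZED:({user_text})"}

print(urlencode(filter_only_params("tour eiffel")))
print(urlencode(scored_params("tour eiffel")))
```

Only the second form gives the norms a chance to influence the ordering.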

: 2) sorting my answers by length desc, and I guess in this case I would need
: to store the length of NAME_ANALYZED field to avoid having to compute it on
: the fly. at this point, this is the only solution I can think of.

that will also be a good way to sort on the length of the field, and will 
give you a lot of precise control.
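
A rough sketch of what storing the length could look like on the indexing side. The field name NAME_LENGTH and the helper are hypothetical, and a plain whitespace split stands in for whatever your real analyzer does, so the counts may differ from the indexed token count.

```python
def with_name_length(doc, source_field="NAME_ANALYZED", length_field="NAME_LENGTH"):
    # Copy the document and add a precomputed length field (hypothetical name).
    # Whitespace splitting is only an approximation of the real analyzer.
    enriched = dict(doc)
    enriched[length_field] = len(doc[source_field].split())
    return enriched

docs = [
    {"NAME_ANALYZED": "Restaurant la tour Eiffel"},
    {"NAME_ANALYZED": "Hotel la tour Eiffel"},
    {"NAME_ANALYZED": "Tour Eiffel"},
]
enriched = [with_name_length(d) for d in docs]

# On the Solr side this ordering would come from a sort parameter on the
# stored length field (for example sort=NAME_LENGTH asc); sorted() mimics it.
by_length = sorted(enriched, key=lambda d: d["NAME_LENGTH"])
for d in by_length:
    print(d["NAME_LENGTH"], d["NAME_ANALYZED"])
```

The sort direction is your choice; ascending puts the shortest names first.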

but sorting on length isn't what you asked about...

: > and I have different answers like
: >
: > Restaurant la tour Eiffel
: > Hotel la tour Eiffel
: > Tour Eiffel
: > Is there a way to get answers with NAME_ANALYZED beginning with "tour
: > Eiffel" first?

If you want to score documents higher because the match appears at the 
beginning of the field value, that is a different problem than scoring 
documents higher because they are shorter -- ie: "Tour Eiffel Tower By 
Helicopter" is longer than "Hotel la tour Eiffel", which one do you want 
to come first?

If you want documents to score higher when the match appears "early" in the 
field value, you can either index a "marker" token at the beginning of the 
field (ie: "S_T_A_R_T Tour Eiffel") and then do all queries on that field as 
phrase queries including that token (shorter matches score higher in 
phrase queries); or you can look into using the "surround" QParser that 
was recently committed to the trunk.  the surround parser has special 
syntax for generating "Span" Queries, which support a "SpanFirst" query 
that scores documents higher based on how close to the beginning of a field 
value the match is.
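
The marker token idea could be sketched roughly like this. The helper names and the slop value of 10 are assumptions for illustration, and the marker has to survive your analysis chain as a single token (some tokenizers and filters would split or alter "S_T_A_R_T"). The last helper is only a toy that counts how many tokens sit between the marker and the match, which is the distance a sloppy phrase query has to bridge.

```python
START_MARKER = "S_T_A_R_T"

def index_value(name):
    # Value to index: the marker token followed by the original name.
    return f"{START_MARKER} {name}"

def start_anchored_query(field, user_text, slop=10):
    # Phrase query that includes the marker. The slop (an assumed value)
    # lets names with extra leading words still match, just less tightly.
    escaped = user_text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{START_MARKER} {escaped}"~{slop}'

def tokens_before_match(name, user_text):
    # Toy helper: number of tokens between the marker and the matched words,
    # using lowercase whitespace tokens in place of the real analyzer.
    name_tokens = name.lower().split()
    query_tokens = user_text.lower().split()
    for i in range(len(name_tokens) - len(query_tokens) + 1):
        if name_tokens[i:i + len(query_tokens)] == query_tokens:
            return i
    return None

print(start_anchored_query("NAME_ANALYZED", "tour eiffel"))
for name in ["Tour Eiffel", "Tour Eiffel Tower By Helicopter", "Hotel la tour Eiffel"]:
    print(tokens_before_match(name, "tour eiffel"), index_value(name))
```

Note that "Tour Eiffel Tower By Helicopter" has the same distance from the marker as "Tour Eiffel", even though it is the longer value, which is the difference between "starts with" and "is shorter" described above.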

