lucene-java-user mailing list archives

From Max Lynch
Subject Re: Term Boost Threshold
Date Sat, 14 Nov 2009 00:02:54 GMT
> > Now, I would like to know exactly what term was found.  For example, if a
> > result comes back from the query above, how do I know whether John Smith
> > was found, or both John Smith and his company, or just John Smith
> > Manufacturing was found?
>
> In general, this is actually very hard.  Lucene does not even keep track
> itself of which terms in a given query matched a given document, but you
> really just need to know which terms matched in the final "top hits" you're
> showing to the user, right?  What is this information used for / why do you
> want to know which term hit?

Well, I treat results with a name match as more important than ones with a
company match, and ones with both as the most important.  I was hoping term
boosting would let me detect these cases from the score alone (for example, a
firstname + company match would have a detectably higher score) without having
to use a highlighter for what is clearly not its purpose.  I'm also not
using a traditional search display, so every result I find is important and
there is no pagination (it's a background search).

Is it possible to do this with term boosting?  Otherwise, my highlighter
solution works for the time being; it's just slow.
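
To make the ranking I'm after concrete, here is a minimal sketch in plain Java.
It assumes the name clause and the company clause are each run as their own
search, and that the matching document IDs are collected into two sets; it
makes no Lucene calls at all.  The names MatchClassifier, MatchKind, classify,
nameHits and companyHits are hypothetical, and the document IDs are made up for
illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: classifies documents by which clause(s) matched them.
final class MatchClassifier {

    // Declared from least to most important, so ordinal() can serve as a rank.
    enum MatchKind { COMPANY_ONLY, NAME_ONLY, NAME_AND_COMPANY }

    // nameHits / companyHits: document IDs matched by each clause when it is
    // searched on its own (assumed to be gathered elsewhere).
    static Map<Integer, MatchKind> classify(Set<Integer> nameHits,
                                            Set<Integer> companyHits) {
        // Union of both hit sets, sorted so the result order is stable.
        Set<Integer> allDocs = new TreeSet<>(nameHits);
        allDocs.addAll(companyHits);

        Map<Integer, MatchKind> kinds = new LinkedHashMap<>();
        for (int doc : allDocs) {
            boolean name = nameHits.contains(doc);
            boolean company = companyHits.contains(doc);
            MatchKind kind;
            if (name && company) {
                kind = MatchKind.NAME_AND_COMPANY;
            } else if (name) {
                kind = MatchKind.NAME_ONLY;
            } else {
                kind = MatchKind.COMPANY_ONLY;
            }
            kinds.put(doc, kind);
        }
        return kinds;
    }

    public static void main(String[] args) {
        // Made-up document IDs: doc 2 matches both clauses.
        Set<Integer> nameHits = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> companyHits = new HashSet<>(Arrays.asList(2, 3));
        System.out.println(classify(nameHits, companyHits));
    }
}
```

This trades one combined query for one search per clause, so whether it beats
the highlighter would depend on the index and the number of clauses; I have not
measured it.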

