lucene-java-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: Term Boost Threshold
Date Sat, 14 Nov 2009 00:17:12 GMT
On Fri, Nov 13, 2009 at 4:02 PM, Max Lynch <ihasmax@gmail.com> wrote:

> > > Now, I would like to know exactly what term was found.  For example, if a
> > > result comes back from the query above, how do I know whether John Smith
> > > was found, or both John Smith and his company, or just John Smith
> > > Manufacturing was found?
> >
> > In general, this is actually very hard.  Lucene does not even keep track
> > itself of which terms in a given query matched a given document, but you
> > really just need to know which terms matched in the final "top hits" you're
> > showing to the user, right?  What is this information used for / why do you
> > want to know which term hit?
>
> Well I use results that have a name match as more important than ones with a
> company match, and ones with both are the most important.  I was hoping term
> boosting would help me mathematically detect these cases (for example, a
> firstname + company match would have detectably higher score) without having
> to use a highlighter for what is clearly not its purpose.  I also am not
> using a traditional search display, so every result I find is important and
> there is no pagination (it's a background search).
>

Well, even without any boosting, documents matching more of the terms in
your query will already score higher.  If you really want to make this
effect more pronounced, yes, you can boost the more important query terms
higher.
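
To make the idea concrete, here is a rough sketch in plain Java.  It is a toy
model, not Lucene's scoring: the BoostSketch class, its score() method and the
boost values 2.0 and 1.0 are all made up for illustration.  It just adds up
the boost of every clause that matches, so a document matching both clauses
outranks one matching only the name.  Real Lucene scores also fold in term
frequency, idf, length norms and the coord factor, so they will not come out
as clean sums like these.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: this is NOT Lucene's scoring formula.
class BoostSketch {

    // Hypothetical scorer: sums the boost of every clause whose phrase
    // occurs in the text (case-insensitive substring match).
    static float score(String text, Map<String, Float> boostedClauses) {
        String haystack = text.toLowerCase();
        float total = 0f;
        for (Map.Entry<String, Float> clause : boostedClauses.entrySet()) {
            if (haystack.contains(clause.getKey().toLowerCase())) {
                total += clause.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed boosts: the name clause counts twice as much as the company clause.
        Map<String, Float> clauses = new LinkedHashMap<>();
        clauses.put("John Smith", 2.0f);
        clauses.put("John Smith Manufacturing", 1.0f);

        String[] docs = {
            "John Smith Manufacturing shipped an order", // matches both clauses: 3.0
            "John Smith spoke at the meeting",           // matches the name only: 2.0
            "Nothing relevant here"                      // matches neither: 0.0
        };
        for (String doc : docs) {
            System.out.println(score(doc, clauses) + "  " + doc);
        }
    }
}
```

Note that any document containing "John Smith Manufacturing" also contains
"John Smith", so in this sketch a company match always brings the name boost
along with it.  In Lucene itself the boost would go on the query, for example
with the query parser's ^ syntax or Query.setBoost().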

  -jake
