lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Custom Similarity
Date Sat, 08 Oct 2011 13:11:33 GMT
On Sat, Oct 8, 2011 at 3:37 AM, Joel Halbert <joel@su3analytics.com> wrote:
> Hi,
>
> Does anyone have a modified scoring (Similarity) function they would
> care to share?
>
> I'm searching web page documents and find the default Similarity seems
> to assign too much weight to documents with frequent occurrence of a
> single term from the query and not enough weight to documents that
> contain a greater overlap of the search query terms.
>
> I've been playing around with overriding the default but wondering if
> anyone has an implementation they have found to work well that they
> would care to share.
>

Have a look at coord(), the score factor based on how many of the
query terms a document matches. You might want to further punish
documents that don't contain all the query terms.

Something like:

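// Goes in a subclass of DefaultSimilarity, since super.coord() needs a concrete base.
@Override
public float coord(int overlap, int maxOverlap) {
  // Full factor only when every query term matches; otherwise halve the default.
  return (overlap == maxOverlap)
      ? 1f
      : 0.5f * super.coord(overlap, maxOverlap);
}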
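
To get a feel for what that does to the numbers, here is a rough standalone sketch that runs without Lucene on the classpath. It assumes the default coord is overlap / maxOverlap, which is what DefaultSimilarity computes; the class and method names (CoordSketch, defaultCoord, punishingCoord) are made up for illustration and are not part of any Lucene API.

```java
/** Standalone illustration only; does not depend on Lucene. */
final class CoordSketch {

  // Assumption: mirrors DefaultSimilarity.coord(), i.e. the fraction of
  // query terms that the document matched.
  static float defaultCoord(int overlap, int maxOverlap) {
    return overlap / (float) maxOverlap;
  }

  // Same logic as the override suggested above: documents matching every
  // query term keep a factor of 1, all others get half the default factor.
  static float punishingCoord(int overlap, int maxOverlap) {
    return (overlap == maxOverlap)
        ? 1f
        : 0.5f * defaultCoord(overlap, maxOverlap);
  }

  public static void main(String[] args) {
    int maxOverlap = 3; // hypothetical three-term query
    for (int overlap = 1; overlap <= maxOverlap; overlap++) {
      System.out.printf("matched %d of %d: default=%.3f punishing=%.3f%n",
          overlap, maxOverlap,
          defaultCoord(overlap, maxOverlap),
          punishingCoord(overlap, maxOverlap));
    }
  }
}
```

With a three-term query, a document matching two terms drops from a factor of 2/3 to 1/3, while a document matching all three stays at 1, so the gap between full and partial matches widens. The 0.5f is just a starting point to tune. To try the real override, you would presumably pass an instance of your DefaultSimilarity subclass to the searcher's setSimilarity() before running the query.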


-- 
lucidimagination.com


