lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Diviacco <>
Subject Re: Lucene nightly build: similarity score per field
Date Fri, 04 Mar 2011 18:18:03 GMT
All right.

So it is still not clear how to exactly implement it.

I have SimilarityA and SimilarityB subclasses.
So far, I know I can customize the similarity class for the searcher:
searcher.setSimilarity(new BoostingSimilarity());

When/how should I use get method ?
Similarity get(String field)


On 3 March 2011 16:34, Robert Muir <> wrote:

> On Thu, Mar 3, 2011 at 10:25 AM, Patrick Diviacco
> <> wrote:
> > I've downloaded Lucene nightly build because I need to customize the
> > similarity *per field*.
> >
> > However I don't see the field parameter passed to the methods to compute
> the
> > score such as "tf" and "idf"...
> >
> > how can I implement different similarities score per document field then
> ?
> >
> Hi, the way you set this up is to use SimilarityProvider to configure
> Similarities per-field: for example maybe field A, B, and C use
> Similarity1 and field D use Similarity2.
> So you just set your SimilarityProvider on the IndexWriter and
> IndexSearcher, and it must implement this factory method:
>  Similarity get(String field)
> Here are the reasons for this factory design (versus simply adding
> field to every method):
> 1. performance: up-front we ask the SimilarityProvider for the
> per-field Similarity. So you probably use a hashmap or something here
> to return the correct one. If you had to do this on every single call
> to tf(), this would slow down queries significantly.
> 2. flexibility: we are working to generalize Similarity, and maybe the
> existing stuff you see becomes TFIDFSimilarity. So in the future you
> might have field1 that uses TFIDF and field2 that uses something else
> (e.g. BM25), with a totally different API and scoring system.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message