lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Muir <rcm...@gmail.com>
Subject Re: Lucene nightly build: similarity score per field
Date Thu, 03 Mar 2011 15:34:57 GMT
On Thu, Mar 3, 2011 at 10:25 AM, Patrick Diviacco
<patrick.diviacco@gmail.com> wrote:
> I've downloaded Lucene nightly build because I need to customize the
> similarity *per field*.
>
> However I don't see the field parameter passed to the methods to compute the
> score such as "tf" and "idf"...
>
> how can I implement different similarities score per document field then ?
>

Hi, the way you set this up is to use SimilarityProvider to configure
Similarities per-field: for example maybe field A, B, and C use
Similarity1 and field D use Similarity2.
So you just set your SimilarityProvider on the IndexWriter and
IndexSearcher, and it must implement this factory method:

  Similarity get(String field)

Here are the reasons for this factory design (versus simply adding
field to every method):
1. performance: up-front we ask the SimilarityProvider for the
per-field Similarity. So you probably use a hashmap or something here
to return the correct one. If you had to do this on every single call
to tf(), this would slow down queries significantly.
2. flexibility: we are working to generalize Similarity, and maybe the
existing stuff you see becomes TFIDFSimilarity. So in the future you
might have field1 that uses TFIDF and field2 that uses something else
(e.g. BM25), with a totally different API and scoring system.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message