lucene-dev mailing list archives

From "Neil Hooey (Commented) (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3749) javadocs and simplifications for 4.0
Date Sun, 04 Mar 2012 23:16:58 GMT


Neil Hooey commented on LUCENE-3749:

This change breaks per-field similarity configuration in Solr. Specifically with this commit:

commit 5d371928263d8d78d0e52781340ae95506bd9bf6
Author: Robert Muir <>
Date:   Mon Feb 6 12:48:01 2012 +0000

    LUCENE-3749: replace SimilarityProvider with PerFieldSimilarityWrapper
    git-svn-id: 13f79535-47bb-0310-9956-ffa450edef68

I have the following configuration in my schema.xml:

<fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField" >
    <tokenizer class=""/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <similarity class="" />
</fieldtype>

But when I build against and use a version of Solr that includes the commit mentioned above, my
similarity class is no longer executed. I've confirmed this by putting print statements in the
scorePayload(), tf() and idf() functions: they print before that commit is included and do not
print after.

It seems this is intentional, based on Robert Muir's comments, but how can you get per-field
similarity to work in Solr with this new code?
> javadocs and simplifications for 4.0
> ----------------------------------------------------
>                 Key: LUCENE-3749
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>         Attachments: LUCENE-3749.patch, LUCENE-3749_part2.patch
> As part of adding additional scoring systems to lucene, we made a lower-level Similarity
> and the existing stuff became e.g. TFIDFSimilarity which extends it.
> However, I always feel bad about the complexity introduced here (though I do feel there
> are some "excuses", that it's a difficult challenge).
> In order to try to mitigate this, we also exposed an easier API (SimilarityBase) on top of
> it that makes some assumptions (and trades off some performance) to try to provide something
> consumable for e.g. experiments.
> Still, we can clean up a few things with the low-level API: fix outdated documentation,
> shoot for better/clearer naming, etc.


