lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: questions on PerFieldSimilarityWrapper
Date Wed, 07 Nov 2012 22:33:47 GMT
coord() and queryNorm() work on the query as a whole, which may span
multiple fields, so there is no single field name that could be passed
to get() for them.
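
A minimal plain-Java sketch of why a query-level factor has no single field to be attributed to. It does not use Lucene's classes: `QueryLevelNormSketch` and `Clause` are hypothetical names, and the normalization shown (one over the square root of the summed squared clause weights) is an assumed TF-IDF-style formula used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical and are not part of Lucene.
final class QueryLevelNormSketch {

    // One term clause of a query: the field it targets and its weight.
    static final class Clause {
        final String field;
        final float weight;

        Clause(String field, float weight) {
            this.field = field;
            this.weight = weight;
        }
    }

    // A single normalization factor for the entire query. Every clause
    // contributes to the sum regardless of its field, so the result cannot
    // be attributed to any one field.
    static float queryNorm(List<Clause> clauses) {
        float sumOfSquaredWeights = 0f;
        for (Clause c : clauses) {
            sumOfSquaredWeights += c.weight * c.weight;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        // A query spanning two fields: which field's similarity should own the norm?
        List<Clause> query = Arrays.asList(
                new Clause("title", 2.0f),
                new Clause("body", 1.5f));
        System.out.println(queryNorm(query)); // 1 / sqrt(4 + 2.25) = 0.4
    }
}
```

In this model the same factor would multiply both the "title" and the "body" clause, which is the sense in which a per-field lookup has nothing to key on.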

On Wed, Nov 7, 2012 at 5:23 PM, Joel Barry <jmb236@gmail.com> wrote:
> Hi folks,
>
> I have a question on PerFieldSimilarityWrapper.  It seems that it is
> not possible to get per-field behavior on queryNorm() and coord()...
>
> The documentation for PerFieldSimilarityWrapper (Lucene 4.0) says:
>
>   Subclasses should implement get(String) to return an appropriate
>   Similarity (for example, using field-specific parameter values) for
>   the field.
>
> This leads the user to believe that *only* get() needs to be
> overridden. However, I've found that I must override queryNorm() as
> well; otherwise Similarity.queryNorm() will be called (because
> PerFieldSimilarityWrapper extends Similarity), not the user-supplied
> version.
>
> The test cases in Lucene always seem to override queryNorm() (and
> coord() too), but I don't see tests for the per-field behavior of
> these. Indeed, there seems to be no way to get the field name from
> these methods, and that's the problem: I'd like to have per-field
> behavior for queryNorm() and coord().
>
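> Below is some code to illustrate the issue:
>
> import java.io.IOException;
>
> import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field.Store;
> import org.apache.lucene.document.TextField;
> import org.apache.lucene.index.DirectoryReader;
> import org.apache.lucene.index.IndexReader;
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.index.IndexWriterConfig;
> import org.apache.lucene.index.Term;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.TermQuery;
> import org.apache.lucene.search.TopDocs;
> import org.apache.lucene.search.similarities.DefaultSimilarity;
> import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
> import org.apache.lucene.search.similarities.Similarity;
> import org.apache.lucene.store.Directory;
> import org.apache.lucene.store.RAMDirectory;
> import org.apache.lucene.util.Version;
>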
> class MyPerFieldSimilarity1 extends PerFieldSimilarityWrapper {
>     @Override
>     public Similarity get(String name) {
>         return new DefaultSimilarity();
>     }
> }
>
> class MyPerFieldSimilarity2 extends PerFieldSimilarityWrapper {
>     @Override
>     public Similarity get(String name) {
>         return new DefaultSimilarity();
>     }
>
>     @Override
>     public float queryNorm(float valueForNormalization) {
>         // Notice that I don't have access to the real field name here...
>         return get("dummy").queryNorm(valueForNormalization);
>     }
> }
>
> public class PerFieldSimilarityWrapperTest {
>     private float runTest(Similarity similarity) throws IOException {
>         IndexWriterConfig config = new IndexWriterConfig(
>                 Version.LUCENE_40, new WhitespaceAnalyzer(Version.LUCENE_40));
>         config.setSimilarity(similarity);
>         Directory dir = new RAMDirectory();
>         IndexWriter writer = new IndexWriter(dir, config);
>         Document doc = new Document();
>         String fieldName = "some_field";
>         doc.add(new TextField(fieldName, "some text", Store.YES));
>         writer.addDocument(doc);
>         writer.commit();
>
>         IndexReader reader = DirectoryReader.open(dir);
>         IndexSearcher searcher = new IndexSearcher(reader);
>         searcher.setSimilarity(similarity);
>         TermQuery query = new TermQuery(new Term(fieldName, "text"));
>         TopDocs topDocs = searcher.search(query, 1);
>         float score = topDocs.scoreDocs[0].score;
>         return score;
>     }
>
>     public static void main(String[] args) throws IOException {
>         PerFieldSimilarityWrapperTest that =
>                 new PerFieldSimilarityWrapperTest();
>         System.out.println(that.runTest(new DefaultSimilarity()));
>         System.out.println(that.runTest(new MyPerFieldSimilarity1()));
>         System.out.println(that.runTest(new MyPerFieldSimilarity2()));
>     }
> }
>
> Running this produces:
>
> 0.19178301
> 0.058849156
> 0.19178301
>
> Am I overlooking something here or is this a bug?
>
> Thanks,
> - Joel
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
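
The three scores in the quoted message are consistent with a query norm of 1 being applied in the second run. The plain-Java sketch below reproduces them arithmetically under stated assumptions: it uses no Lucene classes, `QueryNormSketch` and its methods are hypothetical names, the formulas are a simplified TF-IDF model for a single-term query with boost 1, and the field norm of 0.625 is an assumed value for a two-term field after lossy one-byte norm encoding.

```java
// Plain-Java arithmetic sketch; nothing here calls Lucene.
final class QueryNormSketch {

    // Assumed TF-IDF inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Query norm for a one-term query with boost 1: 1 / sqrt(idf^2).
    static float tfIdfQueryNorm(float idf) {
        return (float) (1.0 / Math.sqrt(idf * idf));
    }

    // score = tf * (idf * queryNorm) * idf * fieldNorm
    static float score(float tf, float idf, float queryNorm, float fieldNorm) {
        float queryWeight = idf * queryNorm;
        return tf * queryWeight * idf * fieldNorm;
    }

    public static void main(String[] args) {
        float idf = idf(1, 1);    // one document in the index, and it contains the term
        float tf = 1.0f;          // sqrt(term frequency) with frequency 1
        float fieldNorm = 0.625f; // assumption: encoded norm of a two-term field

        // With the TF-IDF query norm: roughly 0.1918 (first and third run).
        System.out.println(score(tf, idf, tfIdfQueryNorm(idf), fieldNorm));
        // With a query norm of 1: roughly 0.0588 (second run).
        System.out.println(score(tf, idf, 1.0f, fieldNorm));
    }
}
```

Under these assumptions the ratio between the two scores is exactly the missing query norm, 1 / idf, which is about 3.26 for this one-document index.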


