lucene-java-commits mailing list archives

From dor...@apache.org
Subject svn commit: r544546 [2/2] - in /lucene/java/trunk: ./ src/java/org/apache/lucene/search/ src/java/org/apache/lucene/search/function/ src/java/org/apache/lucene/util/ src/test/org/apache/lucene/search/function/
Date Tue, 05 Jun 2007 16:29:42 GMT
Added: lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html?view=auto&rev=544546
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html (added)
+++ lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html Tue Jun  5 09:29:35 2007
@@ -0,0 +1,197 @@
+<HTML>
+ <!--
+/**
+ * Copyright 2005 The Apache Software Foundation
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+ -->
+<HEAD>
+  <TITLE>org.apache.lucene.search.function</TITLE>
+</HEAD>
+<BODY>
+<DIV>
+  Programmatic control over document scores.
+</DIV>
+<DIV>
+  The <code>function</code> package provides tight control over document scores.
+</DIV>
+<DIV>
+<font color="#FF0000">
+WARNING: The status of the <b>search.function</b> package is experimental. The APIs
+introduced here might change in the future, in which case the old APIs will no longer
+be supported.
+</font>
+</DIV>
+<DIV>
+  Two types of queries are available in this package:
+</DIV>
+<DIV>
+  <ol>
+     <li>
+        <b>Custom Score queries</b> - these set the score
+        of a matching document to a mathematical expression over the scores
+        that the contained (sub) queries assign to that document.
+     </li>
+     <li>
+        <b>Field score queries</b> - these base the score of a
+        document on <b>numeric values</b> of <b>indexed fields</b>.
+     </li>
+  </ol>
+</DIV>
+<DIV>&nbsp;</DIV>
+<DIV>
+  <b>Some possible uses of these queries:</b>
+</DIV>
+<DIV>
+  <ol>
+     <li>
+        Normalizing the document scores by values indexed in a special field -
+        for instance, experimenting with a different doc length normalization.
+     </li>
+     <li>
+        Introducing a static scoring element into the score of a document -
+        for instance, using some topological attribute of the links to/from a document.
+     </li>
+     <li>
+        Computing the score of a matching document as an arbitrary (even unusual) function of
+        its score by a certain query.
+     </li>
+  </ol>
+</DIV>
+<DIV>
+  <b>Performance and Quality Considerations:</b>
+</DIV>
+<DIV>
+  <ol>
+     <li>
+       When scoring by values of indexed fields,
+       these values are loaded into memory.
+       Unlike the regular scoring, where the required information is read from
+       disk as necessary, here field values are loaded once and cached by Lucene in memory
+       for further use, anticipating reuse by further queries. While all this is carefully
+       cached with performance in mind, it is recommended to
+       use these features only when the default Lucene scoring does
+       not match your "special" application needs.
+     </li>
+     <li>
+        Use only with carefully selected fields, because in most cases,
+        search quality with regular Lucene scoring
+        would outperform that of scoring by field values.
+     </li>
+     <li>
+        Values of fields used for scoring should match the type used to score them.
+        Do not apply this scoring to a field containing arbitrary (long) text.
+        Do not mix value types in the same field if that field is used for scoring.
+     </li>
+     <li>
+        Smaller (shorter) field tokens mean less RAM (something always desired).
+        When using <a href=FieldScoreQuery.html>FieldScoreQuery</a>,
+        select the shortest <a href=FieldScoreQuery.html#Type>FieldScoreQuery.Type</a>
+        that is sufficient for the used field values.
+     </li>
+     <li>
+        Reusing IndexReaders/IndexSearchers is essential, because the caching of field tokens
+        is based on an IndexReader. Whenever a new IndexReader is used, values currently in the cache
+        cannot be used and new values must be loaded from disk. So replace/refresh readers/searchers in
+        a controlled manner.
+     </li>
+  </ol>
+</DIV>
+<DIV>
+  <b>History and Credits:</b>
+  <ul>
+    <li>
+       A large part of the code of this package originated from Yonik's FunctionQuery code that was
+       imported from <a href="http://lucene.apache.org/solr">Solr</a>
+       (see <a href="http://issues.apache.org/jira/browse/LUCENE-446">LUCENE-446</a>).
+    </li>
+    <li>
+       The idea behind CustomScoreQuery is borrowed from
+       the "Easily create queries that transform sub-query scores arbitrarily" contribution by Mike Klaas
+       (see <a href="http://issues.apache.org/jira/browse/LUCENE-850">LUCENE-850</a>)
+       though the implementation and API here are different.
+    </li>
+  </ul>
+</DIV>
+<DIV>
+ <b>Code sample:</b>
+ <P>
+ Note: code snippets here should work, but they were never actually compiled, so the
+ test sources under TestCustomScoreQuery, TestFieldScoreQuery and TestOrdValues
+ may also be useful.
+ <ol>
+  <li>
+    Using field (byte) values as scores:
+    <p>
+    Indexing:
+    <pre>
+      f = new Field("score", "7", Field.Store.NO, Field.Index.UN_TOKENIZED);
+      f.setOmitNorms(true);
+      d1.add(f);
+    </pre>
+    <p>
+    Search:
+    <pre>
+      Query q = new FieldScoreQuery("score", FieldScoreQuery.Type.BYTE);
+    </pre>
+    Document d1 above would get a score of 7.
+  </li>
+  <p>
+  <li>
+    Manipulating scores
+    <p>
+    Dividing the original score of each document by a square root of its docid
+    (just to demonstrate what it takes to manipulate scores this way)
+    <pre>
+      Query q = queryParser.parse("my query text");
+      CustomScoreQuery customQ = new CustomScoreQuery(q) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) (subQueryScore / Math.sqrt(doc));
+        }
+      };
+    </pre>
+        <p>
+        For more informative debug info on the custom query, also override the name() method:
+        <pre>
+      CustomScoreQuery customQ = new CustomScoreQuery(q) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) (subQueryScore / Math.sqrt(doc));
+        }
+        public String name() {
+          return "1/sqrt(docid)";
+        }
+      };
+    </pre>
+        <p>
+        Taking the square root of the original score and multiplying it by a "short field driven score", i.e., the
+        short value that was indexed for the scored doc in a certain field:
+        <pre>
+      Query q = queryParser.parse("my query text");
+      FieldScoreQuery qf = new FieldScoreQuery("shortScore", FieldScoreQuery.Type.SHORT);
+      CustomScoreQuery customQ = new CustomScoreQuery(q,qf) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) Math.sqrt(subQueryScore) * valSrcScore;
+        }
+        public String name() {
+          return "shortVal*sqrt(score)";
+        }
+      };
+    </pre>
+
+  </li>
+ </ol>
+</DIV>
+</BODY>
+</HTML>
\ No newline at end of file

Propchange: lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/java/trunk/src/java/org/apache/lucene/search/function/package.html
------------------------------------------------------------------------------
    svn:executable = *
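
As a rough illustration of the arithmetic behind the customScore() samples in
package.html above, the sketch below reimplements the two formulas as plain
static methods with no Lucene dependency. The class name ScoreFormulaSketch and
its method names are made up for this example; only the formulas come from the
samples. Math.sqrt returns a double, so the result is narrowed back to float
explicitly.

```java
// Hypothetical stand-alone sketch; not part of the commit and not Lucene API.
class ScoreFormulaSketch {

  // Mirrors "subQueryScore / sqrt(doc)". Doc 0 divides by zero and yields
  // Infinity, so a real override would probably special-case it.
  static float divideBySqrtOfDoc(int doc, float subQueryScore) {
    return (float) (subQueryScore / Math.sqrt(doc));
  }

  // Mirrors "sqrt(subQueryScore) * valSrcScore".
  static float sqrtTimesFieldValue(float subQueryScore, float valSrcScore) {
    return (float) Math.sqrt(subQueryScore) * valSrcScore;
  }

  public static void main(String[] args) {
    System.out.println(divideBySqrtOfDoc(4, 3.0f));      // 3 / sqrt(4) = 1.5
    System.out.println(sqrtTimesFieldValue(9.0f, 7.0f)); // sqrt(9) * 7 = 21.0
  }
}
```

In the real package these expressions would live inside a customScore()
override; the static methods here only make the numeric behaviour easy to try
in isolation.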

Modified: lucene/java/trunk/src/java/org/apache/lucene/util/ToStringUtils.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/util/ToStringUtils.java?view=diff&rev=544546&r1=544545&r2=544546
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/util/ToStringUtils.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/util/ToStringUtils.java Tue Jun  5 09:29:35 2007
@@ -18,9 +18,11 @@
  */
 
 public class ToStringUtils {
+  /** for printing boost only if not 1.0 */ 
   public static String boost(float boost) {
     if (boost != 1.0f) {
       return "^" + Float.toString(boost);
     } else return "";
   }
+
 }
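
The boost() helper above is easiest to understand from the strings it produces.
A minimal sketch, assuming a stand-alone copy of the method (the class name
BoostSuffixSketch and the sample query text "text:lucene" are made up for
illustration):

```java
// Stand-alone copy of the helper for experimentation; the class name is hypothetical.
class BoostSuffixSketch {

  // Returns "^<boost>" unless the boost is exactly 1.0, in which case it returns "".
  static String boost(float boost) {
    return boost != 1.0f ? "^" + Float.toString(boost) : "";
  }

  public static void main(String[] args) {
    System.out.println("text:lucene" + boost(1.0f)); // prints "text:lucene"
    System.out.println("text:lucene" + boost(2.5f)); // prints "text:lucene^2.5"
  }
}
```

The default boost of 1.0 therefore leaves a query's string form unchanged, and
any other boost is appended as a suffix.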

Added: lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java?view=auto&rev=544546
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java (added)
+++ lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java Tue Jun  5 09:29:35 2007
@@ -0,0 +1,152 @@
+package org.apache.lucene.search.function;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.standard.StandardAnalyzer;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.document.Field;
+import org.apache.lucene.document.Fieldable;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.RAMDirectory;
+
+import junit.framework.TestCase;
+
+/**
+ * Setup for function tests
+ */
+public abstract class FunctionTestSetup extends TestCase {
+
+  /**
+   * The actual order of score computation differs slightly from what the tests assume;
+   * this delta allows for a small amount of variation.
+   */
+  public static float TEST_SCORE_TOLERANCE_DELTA = 0.00005f;
+  
+  protected static final boolean DBG = false; // change to true for logging to print
+
+  protected static final int N_DOCS = 17; // select a prime number > 2
+
+  protected static final String ID_FIELD = "id";
+  protected static final String TEXT_FIELD = "text";
+  protected static final String INT_FIELD = "iii";
+  protected static final String FLOAT_FIELD = "fff";
+  
+  private static final String DOC_TEXT_LINES[] = {
+    // from a public first aid info at http://firstaid.ie.eu.org 
+    "Well it may be a little dramatic but sometimes it true. ",
+    "If you call the emergency medical services to an incident, ",
+    "your actions have started the chain of survival. ",
+    "You have acted to help someone you may not even know. ",
+    "First aid is helping, first aid is making that call, ",
+    "putting a Band-Aid on a small wound, controlling bleeding in large ",
+    "wounds or providing CPR for a collapsed person whose not breathing ",
+    "and heart has stopped beating. You can help yourself, your loved ",
+    "ones and the stranger whose life may depend on you being in the ",
+    "right place at the right time with the right knowledge.",
+  };
+  
+  protected Directory dir;
+  protected Analyzer anlzr;
+  
+  /* @override constructor */
+  public FunctionTestSetup(String name) {
+    super(name);
+  }
+
+  /* @override */
+  protected void tearDown() throws Exception {
+    super.tearDown();
+    dir = null;
+    anlzr = null;
+  }
+
+  /* @override */
+  protected void setUp() throws Exception {
+    // prepare a small index with just a few documents.  
+    super.setUp();
+    dir = new RAMDirectory();
+    anlzr = new StandardAnalyzer();
+    IndexWriter iw = new IndexWriter(dir,anlzr);
+    // add docs not exactly in natural ID order, to verify we do check the order of docs by scores
+    int remaining = N_DOCS;
+    boolean done[] = new boolean[N_DOCS];
+    int i = 0;
+    while (remaining>0) {
+      if (done[i]) {
+        throw new Exception("to set this test correctly N_DOCS="+N_DOCS+" must be prime and greater than 2!");
+      }
+      addDoc(iw,i);
+      done[i] = true;
+      i = (i+4)%N_DOCS;
+      remaining --;
+    }
+    iw.close();
+  }
+
+  private void addDoc(IndexWriter iw, int i) throws Exception {
+    Document d = new Document();
+    Fieldable f;
+    int scoreAndID = i+1;
+    
+    f = new Field(ID_FIELD,id2String(scoreAndID),Field.Store.YES,Field.Index.UN_TOKENIZED); // for debug purposes
+    f.setOmitNorms(true);
+    d.add(f);
+    
+    f = new Field(TEXT_FIELD,"text of doc"+scoreAndID+textLine(i),Field.Store.NO,Field.Index.TOKENIZED); // for regular search
+    f.setOmitNorms(true);
+    d.add(f);
+    
+    f = new Field(INT_FIELD,""+scoreAndID,Field.Store.NO,Field.Index.UN_TOKENIZED); // for function scoring
+    f.setOmitNorms(true);
+    d.add(f);
+    
+    f = new Field(FLOAT_FIELD,scoreAndID+".000",Field.Store.NO,Field.Index.UN_TOKENIZED); // for function scoring
+    f.setOmitNorms(true);
+    d.add(f);
+
+    iw.addDocument(d);
+    log("added: "+d);
+  }
+
+  // 17 --> ID00017
+  protected String id2String(int scoreAndID) {
+    String s = "000000000"+scoreAndID;
+    int n = (""+N_DOCS).length() + 3;
+    int k = s.length() - n; 
+    return "ID"+s.substring(k);
+  }
+  
+  // some text line for regular search
+  private String textLine(int docNum) {
+    return DOC_TEXT_LINES[docNum % DOC_TEXT_LINES.length];
+  }
+
+  // extract expected doc score from its ID Field: "ID00007" --> 7.0
+  protected float expectedFieldScore(String docIDFieldVal) {
+    return Float.parseFloat(docIDFieldVal.substring(2)); 
+  }
+  
+  // debug messages (change DBG to true for anything to print) 
+  protected void log (Object o) {
+    if (DBG) {
+      System.out.println(o.toString());
+    }
+  }
+}

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/FunctionTestSetup.java
------------------------------------------------------------------------------
    svn:executable = *
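
The setUp() loop in FunctionTestSetup adds documents in steps of 4 modulo
N_DOCS, which only reaches every document when 4 and N_DOCS share no common
factor; choosing a prime greater than 2 guarantees that. The sketch below, with
no Lucene dependency, shows the visiting order together with the ID padding and
parsing round trip. The class InsertionOrderSketch and the visitOrder() helper
are hypothetical names for this illustration; id2String() and
expectedFieldScore() follow the logic of the methods above, with N_DOCS passed
in as a parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not part of the commit.
class InsertionOrderSketch {

  // Visits indices 0, step, 2*step, ... modulo n, stopping at the first repeat.
  static List<Integer> visitOrder(int n, int step) {
    List<Integer> order = new ArrayList<>();
    boolean[] done = new boolean[n];
    int i = 0;
    while (!done[i]) {
      done[i] = true;
      order.add(i);
      i = (i + step) % n;
    }
    return order;
  }

  // Same padding scheme as id2String(): number of digits in n, plus 3, zero-padded.
  static String id2String(int id, int n) {
    String s = "000000000" + id;
    int width = String.valueOf(n).length() + 3;
    return "ID" + s.substring(s.length() - width);
  }

  // Same parsing as expectedFieldScore(): drop the "ID" prefix, parse the rest.
  static float expectedFieldScore(String idFieldValue) {
    return Float.parseFloat(idFieldValue.substring(2));
  }

  public static void main(String[] args) {
    System.out.println(visitOrder(17, 4).size());  // 17: every index is reached once
    System.out.println(visitOrder(16, 4).size());  // 4: the walk repeats early
    System.out.println(id2String(17, 17));         // ID00017
    System.out.println(expectedFieldScore(id2String(7, 17))); // 7.0
  }
}
```

With N_DOCS = 16 the walk would return to index 0 after four steps, which is
the situation the exception in setUp() guards against.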

Added: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java?view=auto&rev=544546
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java (added)
+++ lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java Tue Jun  5 09:29:35 2007
@@ -0,0 +1,240 @@
+package org.apache.lucene.search.function;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Iterator;
+
+import org.apache.lucene.index.CorruptIndexException;
+import org.apache.lucene.queryParser.QueryParser;
+import org.apache.lucene.search.Explanation;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryUtils;
+import org.apache.lucene.search.TopDocs;
+
+/**
+ * Test CustomScoreQuery search.
+ */
+public class TestCustomScoreQuery extends FunctionTestSetup {
+
+  /* @override constructor */
+  public TestCustomScoreQuery(String name) {
+    super(name);
+  }
+
+  /* @override */
+  protected void tearDown() throws Exception {
+    super.tearDown();
+  }
+
+  /* @override */
+  protected void setUp() throws Exception {
+    // prepare a small index with just a few documents.  
+    super.setUp();
+  }
+
+  /** Test that CustomScoreQuery of Type.BYTE returns the expected scores. */
+  public void testCustomScoreByte () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as byte
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.BYTE,1.0);
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.BYTE,2.0);
+  }
+
+  /** Test that CustomScoreQuery of Type.SHORT returns the expected scores. */
+  public void testCustomScoreShort () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as short
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.SHORT,1.0);
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.SHORT,3.0);
+  }
+
+  /** Test that CustomScoreQuery of Type.INT returns the expected scores. */
+  public void testCustomScoreInt () throws CorruptIndexException, Exception {
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.INT,1.0);
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.INT,4.0);
+  }
+
+  /** Test that CustomScoreQuery of Type.FLOAT returns the expected scores. */
+  public void testCustomScoreFloat () throws CorruptIndexException, Exception {
+    // INT field can be parsed as float
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.FLOAT,1.0);
+    doTestCustomScore(INT_FIELD,FieldScoreQuery.Type.FLOAT,5.0);
+    // same values, but in float format
+    doTestCustomScore(FLOAT_FIELD,FieldScoreQuery.Type.FLOAT,1.0);
+    doTestCustomScore(FLOAT_FIELD,FieldScoreQuery.Type.FLOAT,6.0);
+  }
+
+  // Test that CustomScoreQuery returns docs with expected score.
+  private void doTestCustomScore (String field, FieldScoreQuery.Type tp, double dboost) throws CorruptIndexException, Exception {
+    float boost = (float) dboost;
+    IndexSearcher s = new IndexSearcher(dir);
+    FieldScoreQuery qValSrc = new FieldScoreQuery(field,tp); // a query that would score by the field
+    QueryParser qp = new QueryParser(TEXT_FIELD,anlzr); 
+    String qtxt = "bleeding person chain knowledge"; // from the doc texts in FunctionTestSetup.
+    
+    // regular (boolean) query.
+    Query q1 = qp.parse(qtxt); 
+    log(q1);
+    
+    // custom query, that should score the same as q1.
+    CustomScoreQuery q2CustomNeutral = new CustomScoreQuery(q1);
+    q2CustomNeutral.setBoost(boost);
+    log(q2CustomNeutral);
+    
+    // custom query, that should (by default) multiply the scores of q1 by that of the field
+    CustomScoreQuery q3CustomMul = new CustomScoreQuery(q1,qValSrc);
+    q3CustomMul.setStrict(true);
+    q3CustomMul.setBoost(boost);
+    log(q3CustomMul);
+    
+    // custom query, that should add the scores of q1 to that of the field
+    CustomScoreQuery q4CustomAdd = new CustomScoreQuery(q1,qValSrc) {
+      /*(non-Javadoc) @see org.apache.lucene.search.function.CustomScoreQuery#name() */
+      public String name() {
+        return "customAdd";
+      }
+      /*(non-Javadoc) @see org.apache.lucene.search.function.CustomScoreQuery#customScore(int, float, float) */
+      public float customScore(int doc, float subQueryScore, float valSrcScore) {
+        return subQueryScore + valSrcScore;
+      }
+      /* (non-Javadoc)@see org.apache.lucene.search.function.CustomScoreQuery#customExplain(int, org.apache.lucene.search.Explanation, org.apache.lucene.search.Explanation)*/
+      public Explanation customExplain(int doc, Explanation subQueryExpl, Explanation valSrcExpl) {
+        float valSrcScore = valSrcExpl==null ? 0 : valSrcExpl.getValue();
+        Explanation exp = new Explanation( valSrcScore + subQueryExpl.getValue(), "custom score: sum of:");
+        exp.addDetail(subQueryExpl);
+        if (valSrcExpl != null) {
+          exp.addDetail(valSrcExpl);
+        }
+        return exp;      
+      } 
+    };
+    q4CustomAdd.setStrict(true);
+    q4CustomAdd.setBoost(boost);
+    log(q4CustomAdd);
+
+    // custom query, that multiplies and adds the field score to that of q1
+    CustomScoreQuery q5CustomMulAdd = new CustomScoreQuery(q1,qValSrc) {
+      /*(non-Javadoc) @see org.apache.lucene.search.function.CustomScoreQuery#name() */
+      public String name() {
+        return "customMulAdd";
+      }
+      /*(non-Javadoc) @see org.apache.lucene.search.function.CustomScoreQuery#customScore(int, float, float) */
+      public float customScore(int doc, float subQueryScore, float valSrcScore) {
+        return (1 + subQueryScore) * valSrcScore;
+      } 
+      /* (non-Javadoc)@see org.apache.lucene.search.function.CustomScoreQuery#customExplain(int, org.apache.lucene.search.Explanation, org.apache.lucene.search.Explanation)*/
+      public Explanation customExplain(int doc, Explanation subQueryExpl, Explanation valSrcExpl) {
+        Explanation exp = new Explanation(1 + subQueryExpl.getValue(), "sum of:");
+        exp.addDetail(subQueryExpl);
+        exp.addDetail(new Explanation(1,"const 1"));
+        if (valSrcExpl == null) {
+          exp.setDescription("CustomMulAdd, sum of:");
+          return exp;
+        }
+        Explanation exp2 = new Explanation(valSrcExpl.getValue() * exp.getValue(), "custom score: product of:");
+        exp2.addDetail(valSrcExpl);
+        exp2.addDetail(exp);
+        return exp2;      
+      } 
+    };
+    q5CustomMulAdd.setStrict(true);
+    q5CustomMulAdd.setBoost(boost);
+    log(q5CustomMulAdd);
+
+    // do all the searches 
+    TopDocs td1 = s.search(q1,null,1000);
+    TopDocs td2CustomNeutral = s.search(q2CustomNeutral,null,1000);
+    TopDocs td3CustomMul = s.search(q3CustomMul,null,1000);
+    TopDocs td4CustomAdd = s.search(q4CustomAdd,null,1000);
+    TopDocs td5CustomMulAdd = s.search(q5CustomMulAdd,null,1000);
+    
+    // put results in map so we can verify the scores although they have changed
+    HashMap h1 = topDocsToMap(td1);
+    HashMap h2CustomNeutral = topDocsToMap(td2CustomNeutral);
+    HashMap h3CustomMul = topDocsToMap(td3CustomMul);
+    HashMap h4CustomAdd = topDocsToMap(td4CustomAdd);
+    HashMap h5CustomMulAdd = topDocsToMap(td5CustomMulAdd);
+    
+    verifyResults(boost, s, 
+        h1, h2CustomNeutral, h3CustomMul, h4CustomAdd, h5CustomMulAdd,
+        q1, q2CustomNeutral, q3CustomMul, q4CustomAdd, q5CustomMulAdd);
+  }
+  
+  // verify results are as expected.
+  private void verifyResults(float boost, IndexSearcher s, 
+      HashMap h1, HashMap h2customNeutral, HashMap h3CustomMul, HashMap h4CustomAdd, HashMap h5CustomMulAdd,
+      Query q1, Query q2, Query q3, Query q4, Query q5) throws Exception {
+    
+    // verify numbers of matches
+    log("#hits = "+h1.size());
+    assertEquals("queries should have same #hits",h1.size(),h2customNeutral.size());
+    assertEquals("queries should have same #hits",h1.size(),h3CustomMul.size());
+    assertEquals("queries should have same #hits",h1.size(),h4CustomAdd.size());
+    assertEquals("queries should have same #hits",h1.size(),h5CustomMulAdd.size());
+    
+    // verify scores ratios
+    for (Iterator it = h1.keySet().iterator(); it.hasNext();) {
+      Integer x = (Integer) it.next();
+
+      int doc =  x.intValue();
+      log("doc = "+doc);
+
+      float fieldScore = expectedFieldScore(s.getIndexReader().document(doc).get(ID_FIELD));
+      log("fieldScore = "+fieldScore);
+      assertTrue("fieldScore should not be 0",fieldScore>0);
+
+      float score1 = ((Float)h1.get(x)).floatValue();
+      logResult("score1=", s, q1, doc, score1);
+      
+      float score2 = ((Float)h2customNeutral.get(x)).floatValue();
+      logResult("score2=", s, q2, doc, score2);
+      assertEquals("same score (just boosted) for neutral", boost * score1, score2, TEST_SCORE_TOLERANCE_DELTA);
+
+      float score3 = ((Float)h3CustomMul.get(x)).floatValue();
+      logResult("score3=", s, q3, doc, score3);
+      assertEquals("new score for custom mul", boost * fieldScore * score1, score3, TEST_SCORE_TOLERANCE_DELTA);
+      
+      float score4 = ((Float)h4CustomAdd.get(x)).floatValue();
+      logResult("score4=", s, q4, doc, score4);
+      assertEquals("new score for custom add", boost * (fieldScore + score1), score4, TEST_SCORE_TOLERANCE_DELTA);
+      
+      float score5 = ((Float)h5CustomMulAdd.get(x)).floatValue();
+      logResult("score5=", s, q5, doc, score5);
+      assertEquals("new score for custom mul add", boost * fieldScore * (score1 + 1), score5, TEST_SCORE_TOLERANCE_DELTA);
+    }
+  }
+
+  private void logResult(String msg, IndexSearcher s, Query q, int doc, float score1) throws IOException {
+    QueryUtils.check(q,s);
+    log(msg+" "+score1);
+    log("Explain by: "+q);
+    log(s.explain(q,doc));
+  }
+
+  // since custom scoring modifies the order of docs, map results 
+  // by doc ids so that we can later compare/verify them 
+  private HashMap topDocsToMap(TopDocs td) {
+    HashMap h = new HashMap(); 
+    for (int i=0; i<td.scoreDocs.length; i++) { // scoreDocs may hold fewer entries than totalHits
+      h.put(new Integer(td.scoreDocs[i].doc), new Float(td.scoreDocs[i].score));
+    }
+    return h;
+  }
+
+}

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestCustomScoreQuery.java
------------------------------------------------------------------------------
    svn:executable = *
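
The assertions in verifyResults() encode four relations between the plain query
score (score1), the field value (fieldScore) and the query boost. A minimal
sketch of those relations in plain Java, assuming made-up sample values; the
class name ExpectedScoresSketch and its helper names are hypothetical, and the
tolerance repeats the value of TEST_SCORE_TOLERANCE_DELTA:

```java
// Hypothetical stand-alone sketch of the expected-score arithmetic; not Lucene API.
class ExpectedScoresSketch {
  static final float DELTA = 0.00005f; // same value as TEST_SCORE_TOLERANCE_DELTA

  // Neutral custom query: the sub-query score, only boosted.
  static float neutral(float boost, float score1) {
    return boost * score1;
  }

  // Default combination: sub-query score multiplied by the field score.
  static float mul(float boost, float fieldScore, float score1) {
    return boost * fieldScore * score1;
  }

  // "customAdd": sub-query score plus the field score.
  static float add(float boost, float fieldScore, float score1) {
    return boost * (fieldScore + score1);
  }

  // "customMulAdd": (1 + sub-query score) times the field score.
  static float mulAdd(float boost, float fieldScore, float score1) {
    return boost * fieldScore * (score1 + 1);
  }

  // Tolerant comparison, as assertEquals(expected, actual, delta) would do.
  static boolean close(float expected, float actual) {
    return Math.abs(expected - actual) <= DELTA;
  }

  public static void main(String[] args) {
    float boost = 2.0f, fieldScore = 4.0f, score1 = 0.5f; // made-up sample values
    System.out.println(neutral(boost, score1));            // 1.0
    System.out.println(mul(boost, fieldScore, score1));    // 4.0
    System.out.println(add(boost, fieldScore, score1));    // 9.0
    System.out.println(mulAdd(boost, fieldScore, score1)); // 12.0
  }
}
```

Comparing with a small delta rather than for exact equality matters because
the scorer may multiply the factors in a different order than the test does.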

Added: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java?view=auto&rev=544546
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java (added)
+++ lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java Tue Jun  5 09:29:35 2007
@@ -0,0 +1,203 @@
+package org.apache.lucene.search.function;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.util.HashMap;
+
+import org.apache.lucene.index.CorruptIndexException;
+import org.apache.lucene.search.Hits;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryUtils;
+import org.apache.lucene.search.ScoreDoc;
+import org.apache.lucene.search.TopDocs;
+
+/**
+ * Test FieldScoreQuery search.
+ * <p>
+ * Tests here create an index with a few documents, each having
+ * an int-valued indexed field and a float-valued indexed field.
+ * The values of these fields are later used for scoring.
+ * <p>
+ * The rank tests use Hits to verify that docs are ordered (by score) as expected.
+ * <p>
+ * The exact score tests use TopDocs to verify the exact score.  
+ */
+public class TestFieldScoreQuery extends FunctionTestSetup {
+
+  /* @override constructor */
+  public TestFieldScoreQuery(String name) {
+    super(name);
+  }
+
+  /* @override */
+  protected void tearDown() throws Exception {
+    super.tearDown();
+  }
+
+  /* @override */
+  protected void setUp() throws Exception {
+    // prepare a small index with just a few documents.  
+    super.setUp();
+  }
+
+  /** Test that FieldScoreQuery of Type.BYTE returns docs in expected order. */
+  public void testRankByte () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as byte
+    doTestRank(INT_FIELD,FieldScoreQuery.Type.BYTE);
+  }
+
+  /** Test that FieldScoreQuery of Type.SHORT returns docs in expected order. */
+  public void testRankShort () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as short
+    doTestRank(INT_FIELD,FieldScoreQuery.Type.SHORT);
+  }
+
+  /** Test that FieldScoreQuery of Type.INT returns docs in expected order. */
+  public void testRankInt () throws CorruptIndexException, Exception {
+    doTestRank(INT_FIELD,FieldScoreQuery.Type.INT);
+  }
+
+  /** Test that FieldScoreQuery of Type.FLOAT returns docs in expected order. */
+  public void testRankFloat () throws CorruptIndexException, Exception {
+    // INT field can be parsed as float
+    doTestRank(INT_FIELD,FieldScoreQuery.Type.FLOAT);
+    // same values, but in float format
+    doTestRank(FLOAT_FIELD,FieldScoreQuery.Type.FLOAT);
+  }
+
+  // Test that FieldScoreQuery returns docs in expected order.
+  private void doTestRank (String field, FieldScoreQuery.Type tp) throws CorruptIndexException, Exception {
+    IndexSearcher s = new IndexSearcher(dir);
+    Query q = new FieldScoreQuery(field,tp);
+    log("test: "+q);
+    QueryUtils.check(q,s);
+    Hits h = s.search(q);
+    assertEquals("All docs should be matched!",N_DOCS,h.length());
+    String prevID = "ID"+(N_DOCS+1); // greater than all ids of docs in this test
+    for (int i=0; i<h.length(); i++) {
+      String resID = h.doc(i).get(ID_FIELD);
+      log(i+".   score="+h.score(i)+"  -  "+resID);
+      log(s.explain(q,h.id(i)));
+      assertTrue("res id "+resID+" should be < prev res id "+prevID, resID.compareTo(prevID)<0);
+      prevID = resID;
+    }
+  }
+
+  /** Test that FieldScoreQuery of Type.BYTE returns the expected scores. */
+  public void testExactScoreByte () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as byte
+    doTestExactScore(INT_FIELD,FieldScoreQuery.Type.BYTE);
+  }
+
+  /** Test that FieldScoreQuery of Type.SHORT returns the expected scores. */
+  public void testExactScoreShort () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as short
+    doTestExactScore(INT_FIELD,FieldScoreQuery.Type.SHORT);
+  }
+
+  /** Test that FieldScoreQuery of Type.INT returns the expected scores. */
+  public void testExactScoreInt () throws CorruptIndexException, Exception {
+    doTestExactScore(INT_FIELD,FieldScoreQuery.Type.INT);
+  }
+
+  /** Test that FieldScoreQuery of Type.FLOAT returns the expected scores. */
+  public void testExactScoreFloat () throws CorruptIndexException, Exception {
+    // INT field can be parsed as float
+    doTestExactScore(INT_FIELD,FieldScoreQuery.Type.FLOAT);
+    // same values, but in float format
+    doTestExactScore(FLOAT_FIELD,FieldScoreQuery.Type.FLOAT);
+  }
+
+  // Test that FieldScoreQuery returns docs with expected score.
+  private void doTestExactScore (String field, FieldScoreQuery.Type tp) throws CorruptIndexException, Exception {
+    IndexSearcher s = new IndexSearcher(dir);
+    Query q = new FieldScoreQuery(field,tp);
+    TopDocs td = s.search(q,null,1000);
+    assertEquals("All docs should be matched!",N_DOCS,td.totalHits);
+    ScoreDoc sd[] = td.scoreDocs;
+    for (int i=0; i<sd.length; i++) {
+      float score = sd[i].score;
+      log(s.explain(q,sd[i].doc));
+      String id = s.getIndexReader().document(sd[i].doc).get(ID_FIELD);
+      float expectedScore = expectedFieldScore(id); // "ID7" --> 7.0
+      assertEquals("score of "+id+" should be "+expectedScore+" != "+score, expectedScore, score, TEST_SCORE_TOLERANCE_DELTA);
+    }
+  }
+
+  /** Test that FieldScoreQuery of Type.BYTE caches/reuses loaded values and consumes the proper RAM resources. */
+  public void testCachingByte () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as byte
+    doTestCaching(INT_FIELD,FieldScoreQuery.Type.BYTE);
+  }
+
+  /** Test that FieldScoreQuery of Type.SHORT caches/reuses loaded values and consumes the proper RAM resources. */
+  public void testCachingShort () throws CorruptIndexException, Exception {
+    // INT field values are small enough to be parsed as short
+    doTestCaching(INT_FIELD,FieldScoreQuery.Type.SHORT);
+  }
+
+  /** Test that FieldScoreQuery of Type.INT caches/reuses loaded values and consumes the proper RAM resources. */
+  public void testCachingInt () throws CorruptIndexException, Exception {
+    doTestCaching(INT_FIELD,FieldScoreQuery.Type.INT);
+  }
+
+  /** Test that FieldScoreQuery of Type.FLOAT caches/reuses loaded values and consumes the proper RAM resources. */
+  public void testCachingFloat () throws CorruptIndexException, Exception {
+    // INT field values can be parsed as float
+    doTestCaching(INT_FIELD,FieldScoreQuery.Type.FLOAT);
+    // same values, but in float format
+    doTestCaching(FLOAT_FIELD,FieldScoreQuery.Type.FLOAT);
+  }
+
+  // Test that values loaded for FieldScoreQuery are cached properly and consume the proper RAM resources.
+  private void doTestCaching (String field, FieldScoreQuery.Type tp) throws CorruptIndexException, Exception {
+    // prepare expected array types for comparison
+    HashMap expectedArrayTypes = new HashMap();
+    expectedArrayTypes.put(FieldScoreQuery.Type.BYTE, new byte[0]);
+    expectedArrayTypes.put(FieldScoreQuery.Type.SHORT, new short[0]);
+    expectedArrayTypes.put(FieldScoreQuery.Type.INT, new int[0]);
+    expectedArrayTypes.put(FieldScoreQuery.Type.FLOAT, new float[0]);
+    
+    IndexSearcher s = new IndexSearcher(dir);
+    Object innerArray = null;
+
+    for (int i=0; i<10; i++) {
+      FieldScoreQuery q = new FieldScoreQuery(field,tp);
+      Hits h = s.search(q);
+      assertEquals("All docs should be matched!",N_DOCS,h.length());
+      if (i==0) {
+        innerArray = q.valSrc.getValues(s.getIndexReader()).getInnerArray();
+        log(i+".  compare: "+innerArray.getClass()+" to "+expectedArrayTypes.get(tp).getClass());
+        assertEquals("field values should be cached in the correct array type!", expectedArrayTypes.get(tp).getClass(), innerArray.getClass());
+      } else {
+        log(i+".  compare: "+innerArray+" to "+q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+        assertSame("field values should be cached and reused!", innerArray, q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+      }
+    }
+    
+    // verify new values are reloaded (not reused) for a new reader
+    s = new IndexSearcher(dir);
+    FieldScoreQuery q = new FieldScoreQuery(field,tp);
+    Hits h = s.search(q);
+    assertEquals("All docs should be matched!",N_DOCS,h.length());
+    log("compare: "+innerArray+" to "+q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+    assertNotSame("cached field values should not be reused if reader has changed!", innerArray, q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+  }
+
+}

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java
------------------------------------------------------------------------------
    svn:eol-style = native

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java
------------------------------------------------------------------------------
    svn:executable = *
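
The caching tests above assert two properties: repeated queries against the same reader see the same inner array, and a query against a new reader sees a different one. The following is a minimal, self-contained sketch of a cache with that behaviour. It is an illustration only, not Lucene's FieldCache; the names PerReaderCacheSketch and Loader are hypothetical, and a plain Object stands in for the reader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical per-reader value cache; an illustration, not Lucene's FieldCache. */
public class PerReaderCacheSketch {

  /** Hypothetical loader that builds the value array for one field. */
  public interface Loader {
    Object load(String field);
  }

  // Keyed weakly by reader, so entries can be collected once a reader is unreachable.
  private final Map<Object, Map<String, Object>> cache =
      new WeakHashMap<Object, Map<String, Object>>();
  private final Loader loader;

  public PerReaderCacheSketch(Loader loader) {
    this.loader = loader;
  }

  /** Returns the cached array for (reader, field), loading it on first use. */
  public synchronized Object getValues(Object reader, String field) {
    Map<String, Object> perReader = cache.get(reader);
    if (perReader == null) {
      perReader = new HashMap<String, Object>();
      cache.put(reader, perReader);
    }
    Object values = perReader.get(field);
    if (values == null) {
      values = loader.load(field);
      perReader.put(field, values);
    }
    return values;
  }

  public static void main(String[] args) {
    PerReaderCacheSketch c = new PerReaderCacheSketch(new Loader() {
      public Object load(String field) {
        return new int[4]; // a fresh array per load
      }
    });
    Object reader1 = new Object();
    Object reader2 = new Object();
    Object first = c.getValues(reader1, "id");
    System.out.println(first == c.getValues(reader1, "id"));    // same reader, same field: reused
    System.out.println(first != c.getValues(reader1, "other")); // different field: separate array
    System.out.println(first != c.getValues(reader2, "id"));    // new reader: reloaded
  }
}
```

Each line printed by main is true. The three comparisons correspond to the assertSame and assertNotSame checks in the tests.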

Added: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestOrdValues.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/search/function/TestOrdValues.java?view=auto&rev=544546
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/search/function/TestOrdValues.java (added)
+++ lucene/java/trunk/src/test/org/apache/lucene/search/function/TestOrdValues.java Tue Jun  5 09:29:35 2007
@@ -0,0 +1,202 @@
+package org.apache.lucene.search.function;
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.lucene.index.CorruptIndexException;
+import org.apache.lucene.search.Hits;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryUtils;
+import org.apache.lucene.search.ScoreDoc;
+import org.apache.lucene.search.TopDocs;
+
+/**
+ * Test search based on OrdFieldSource and ReverseOrdFieldSource.
+ * <p>
+ * Tests here create an index with a few documents, each having
+ * an indexed "id" field.
+ * The ord values of this field are later used for scoring.
+ * <p>
+ * The order tests use Hits to verify that docs are ordered as expected.
+ * <p>
+ * The exact score tests use TopDocs to verify the exact score.
+ */
+public class TestOrdValues extends FunctionTestSetup {
+
+  /* @override constructor */
+  public TestOrdValues(String name) {
+    super(name);
+  }
+
+  /* @override */
+  protected void tearDown() throws Exception {
+    super.tearDown();
+  }
+
+  /* @override */
+  protected void setUp() throws Exception {
+    // prepare a small index with just a few documents.  
+    super.setUp();
+  }
+
+  /** Test OrdFieldSource */
+  public void testOrdFieldRank () throws CorruptIndexException, Exception {
+    doTestRank(ID_FIELD,true);
+  }
+
+  /** Test ReverseOrdFieldSource */
+  public void testReverseOrdFieldRank () throws CorruptIndexException, Exception {
+    doTestRank(ID_FIELD,false);
+  }
+
+  // Test that queries based on reverse/ordFieldScore score correctly
+  private void doTestRank (String field, boolean inOrder) throws CorruptIndexException, Exception {
+    IndexSearcher s = new IndexSearcher(dir);
+    ValueSource vs;
+    if (inOrder) {
+      vs = new OrdFieldSource(field);
+    } else {
+      vs = new ReverseOrdFieldSource(field);
+    }
+        
+    Query q = new ValueSourceQuery(vs);
+    log("test: "+q);
+    QueryUtils.check(q,s);
+    Hits h = s.search(q);
+    assertEquals("All docs should be matched!",N_DOCS,h.length());
+    String prevID = inOrder
+      ? "IE"   // greater than all ids of docs in this test ("ID0001", etc.)
+      : "IC";  // smaller than all ids of docs in this test ("ID0001", etc.)
+          
+    for (int i=0; i<h.length(); i++) {
+      String resID = h.doc(i).get(ID_FIELD);
+      log(i+".   score="+h.score(i)+"  -  "+resID);
+      log(s.explain(q,h.id(i)));
+      if (inOrder) {
+        assertTrue("res id "+resID+" should be < prev res id "+prevID, resID.compareTo(prevID)<0);
+      } else {
+        assertTrue("res id "+resID+" should be > prev res id "+prevID, resID.compareTo(prevID)>0);
+      }
+      prevID = resID;
+    }
+  }
+
+  /** Test exact score for OrdFieldSource */
+  public void testOrdFieldExactScore () throws CorruptIndexException, Exception {
+    doTestExactScore(ID_FIELD,true);
+  }
+
+  /** Test exact score for ReverseOrdFieldSource */
+  public void testReverseOrdFieldExactScore () throws CorruptIndexException, Exception {
+    doTestExactScore(ID_FIELD,false);
+  }
+
+  
+  // Test that queries based on reverse/ordFieldScore return docs with expected score.
+  private void doTestExactScore (String field, boolean inOrder) throws CorruptIndexException, Exception {
+    IndexSearcher s = new IndexSearcher(dir);
+    ValueSource vs;
+    if (inOrder) {
+      vs = new OrdFieldSource(field);
+    } else {
+      vs = new ReverseOrdFieldSource(field);
+    }
+    Query q = new ValueSourceQuery(vs);
+    TopDocs td = s.search(q,null,1000);
+    assertEquals("All docs should be matched!",N_DOCS,td.totalHits);
+    ScoreDoc sd[] = td.scoreDocs;
+    for (int i=0; i<sd.length; i++) {
+      float score = sd[i].score;
+      String id = s.getIndexReader().document(sd[i].doc).get(ID_FIELD);
+      log("-------- "+i+". Explain doc "+id);
+      log(s.explain(q,sd[i].doc));
+      float expectedScore =  N_DOCS-i;
+      assertEquals("score of result "+i+" should be "+expectedScore+" != "+score, expectedScore, score, TEST_SCORE_TOLERANCE_DELTA);
+      String expectedId =  inOrder 
+        ? id2String(N_DOCS-i) // in-order ==> larger  values first 
+        : id2String(i+1);     // reverse  ==> smaller values first 
+      assertTrue("id of result "+i+" should be "+expectedId+" != "+id, expectedId.equals(id));
+    }
+  }
+  
+  /** Test caching for OrdFieldSource */
+  public void testCachingOrd () throws CorruptIndexException, Exception {
+    doTestCaching(ID_FIELD,true);
+  }
+  
+  /** Test caching for ReverseOrdFieldSource */
+  public void testCachingReverseOrd () throws CorruptIndexException, Exception {
+    doTestCaching(ID_FIELD,false);
+  }
+
+  // Test that values loaded for OrdFieldSource/ReverseOrdFieldSource are cached properly and consume the proper RAM resources.
+  private void doTestCaching (String field, boolean inOrder) throws CorruptIndexException, Exception {
+    IndexSearcher s = new IndexSearcher(dir);
+    Object innerArray = null;
+
+    for (int i=0; i<10; i++) {
+      ValueSource vs;
+      if (inOrder) {
+        vs = new OrdFieldSource(field);
+      } else {
+        vs = new ReverseOrdFieldSource(field);
+      }
+      ValueSourceQuery q = new ValueSourceQuery(vs);
+      Hits h = s.search(q);
+      assertEquals("All docs should be matched!",N_DOCS,h.length());
+      if (i==0) {
+        innerArray = q.valSrc.getValues(s.getIndexReader()).getInnerArray();
+      } else {
+        log(i+".  compare: "+innerArray+" to "+q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+        assertSame("field values should be cached and reused!", innerArray, q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+      }
+    }
+    
+    ValueSource vs;
+    ValueSourceQuery q;
+    Hits h;
+    
+    // verify that different values are loaded for a different field
+    String field2 = INT_FIELD;
+    assertFalse(field.equals(field2)); // otherwise this test is meaningless.
+    if (inOrder) {
+      vs = new OrdFieldSource(field2);
+    } else {
+      vs = new ReverseOrdFieldSource(field2);
+    }
+    q = new ValueSourceQuery(vs);
+    h = s.search(q);
+    assertEquals("All docs should be matched!",N_DOCS,h.length());
+    log("compare (should differ): "+innerArray+" to "+q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+    assertNotSame("different values should be loaded for a different field!", innerArray, q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+
+    // verify new values are reloaded (not reused) for a new reader
+    s = new IndexSearcher(dir);
+    if (inOrder) {
+      vs = new OrdFieldSource(field);
+    } else {
+      vs = new ReverseOrdFieldSource(field);
+    }
+    q = new ValueSourceQuery(vs);
+    h = s.search(q);
+    assertEquals("All docs should be matched!",N_DOCS,h.length());
+    log("compare (should differ): "+innerArray+" to "+q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+    assertNotSame("cached field values should not be reused if reader has changed!", innerArray, q.valSrc.getValues(s.getIndexReader()).getInnerArray());
+  }
+
+}

Propchange: lucene/java/trunk/src/test/org/apache/lucene/search/function/TestOrdValues.java
------------------------------------------------------------------------------
    svn:eol-style = native
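
The exact-score tests in TestOrdValues expect the top hit to score N_DOCS and each following hit one less, for both the in-order and the reverse source. The sketch below shows one way ordinal values with that property could be computed over plain strings. It is a hypothetical illustration, not the OrdFieldSource implementation: the class name OrdSketch is invented, the 1-based numbering is an assumption chosen to match the expected scores above, and values are assumed to be distinct.

```java
import java.util.Arrays;

/** Hypothetical ord / reverse-ord computation over distinct string values. */
public class OrdSketch {

  /** 1-based position of each value in ascending sorted order (assumes distinct values). */
  public static int[] ord(String[] values) {
    String[] sorted = values.clone();
    Arrays.sort(sorted);
    int[] ords = new int[values.length];
    for (int doc = 0; doc < values.length; doc++) {
      ords[doc] = Arrays.binarySearch(sorted, values[doc]) + 1;
    }
    return ords;
  }

  /** Mirror image of ord: the smallest value gets the largest number. */
  public static int[] reverseOrd(String[] values) {
    int[] ords = ord(values);
    int[] rev = new int[ords.length];
    for (int doc = 0; doc < ords.length; doc++) {
      rev[doc] = values.length + 1 - ords[doc];
    }
    return rev;
  }

  public static void main(String[] args) {
    String[] ids = {"ID0002", "ID0003", "ID0001"};
    System.out.println(Arrays.toString(ord(ids)));        // prints [2, 3, 1]
    System.out.println(Arrays.toString(reverseOrd(ids))); // prints [2, 1, 3]
  }
}
```

Sorting documents by these numbers in descending order puts the largest id first for ord and the smallest id first for reverseOrd, which is the ordering doTestRank checks.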

