lucenenet-commits mailing list archives

From aro...@apache.org
Subject svn commit: r671402 [3/5] - in /incubator/lucene.net/trunk/C#/src/Lucene.Net/Search: ./ Function/ Payload/ Spans/
Date Wed, 25 Jun 2008 02:51:26 GMT
Added: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/Package.html
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Function/Package.html?rev=671402&view=auto
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/Package.html (added)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/Package.html Tue Jun 24 19:51:24 2008
@@ -0,0 +1,197 @@
+<HTML>
+ <!--
+/**
+ * Copyright 2005 The Apache Software Foundation
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+ -->
+<HEAD>
+  <TITLE>org.apache.lucene.search.function</TITLE>
+</HEAD>
+<BODY>
+<DIV>
+  Programmatic control over document scores.
+</DIV>
+<DIV>
+  The <code>function</code> package provides tight control over document scores.
+</DIV>
+<DIV>
+<font color="#FF0000">
+WARNING: The status of the <b>search.function</b> package is experimental. The APIs
+introduced here might change in the future and will not be supported anymore
+in such a case.
+</font>
+</DIV>
+<DIV>
+  Two types of queries are available in this package:
+</DIV>
+<DIV>
+  <ol>
+     <li>
+        <b>Custom score queries</b> - let you set the score
+        of a matching document as a mathematical expression over the scores
+        that contained (sub) queries assign to that document.
+     </li>
+     <li>
+        <b>Field score queries</b> - let you base the score of a
+        document on <b>numeric values</b> of <b>indexed fields</b>.
+     </li>
+  </ol>
+</DIV>
+<DIV>&nbsp;</DIV>
+<DIV>
+  <b>Some possible uses of these queries:</b>
+</DIV>
+<DIV>
+  <ol>
+     <li>
+        Normalizing the document scores by values indexed in a special field -
+        for instance, experimenting with a different doc length normalization.
+     </li>
+     <li>
+        Introducing a static scoring element into the score of a document -
+        for instance, using some topological attribute of the links to/from a document.
+     </li>
+     <li>
+        Computing the score of a matching document as an arbitrary odd function of
+        its score by a certain query.
+     </li>
+  </ol>
+</DIV>
+<DIV>
+  <b>Performance and Quality Considerations:</b>
+</DIV>
+<DIV>
+  <ol>
+     <li>
+       When scoring by values of indexed fields,
+       these values are loaded into memory.
+       Unlike the regular scoring, where the required information is read from
+       disk as necessary, here field values are loaded once and cached by Lucene in memory
+       for further use, anticipating reuse by further queries. While all this is carefully
+       cached with performance in mind, it is recommended to
+       use these features only when the default Lucene scoring does
+       not match your "special" application needs.
+     </li>
+     <li>
+        Use only with carefully selected fields, because in most cases,
+        search quality with regular Lucene scoring
+        would outperform that of scoring by field values.
+     </li>
+     <li>
+        Values of a field used for scoring should match the type that field is scored as.
+        Do not apply on a field containing arbitrary (long) text.
+        Do not mix values in the same field if that field is used for scoring.
+     </li>
+     <li>
+        Smaller (shorter) field tokens mean less RAM (something always desired).
+        When using <a href="FieldScoreQuery.html">FieldScoreQuery</a>,
+        select the shortest <a href="FieldScoreQuery.html#Type">FieldScoreQuery.Type</a>
+        that is sufficient for the field values used.
+     </li>
+     <li>
+        Reusing IndexReaders/IndexSearchers is essential, because the caching of field tokens
+        is based on an IndexReader. Whenever a new IndexReader is used, values currently in the cache
+        cannot be used and new values must be loaded from disk. So replace/refresh readers/searchers in
+        a controlled manner.
+     </li>
+  </ol>
+</DIV>
+<DIV>
+  <b>History and Credits:</b>
+  <ul>
+    <li>
+       A large part of the code of this package originated from Yonik's FunctionQuery code that was
+       imported from <a href = "http://lucene.apache.org//solr">Solr</a>
+       (see <a href = "http://issues.apache.org//jira/browse/LUCENE-446">LUCENE-446</a>).
+    </li>
+    <li>
+       The idea behind CustomScoreQuery is borrowed from
+       the "Easily create queries that transform sub-query scores arbitrarily" contribution by Mike Klaas
+       (see <a href = "http://issues.apache.org//jira/browse/LUCENE-850">LUCENE-850</a>)
+       though the implementation and API here are different.
+    </li>
+  </ul>
+</DIV>
+<DIV>
+ <b>Code sample:</b>
+ <P>
+ Note: the code snippets here should work, but they were never actually compiled, so
+ the test sources under TestCustomScoreQuery, TestFieldScoreQuery and TestOrdValues
+ may also be useful.
+ <ol>
+  <li>
+    Using field (byte) values as scores:
+    <p>
+    Indexing:
+    <pre>
+      f = new Field("score", "7", Field.Store.NO, Field.Index.UN_TOKENIZED);
+      f.setOmitNorms(true);
+      d1.add(f);
+    </pre>
+    <p>
+    Search:
+    <pre>
+      Query q = new FieldScoreQuery("score", FieldScoreQuery.Type.BYTE);
+    </pre>
+    Document d1 above would get a score of 7.
+  </li>
+  <p>
+  <li>
+    Manipulating scores
+    <p>
+    Dividing the original score of each document by the square root of its docid
+    (just to demonstrate what it takes to manipulate scores this way)
+    <pre>
+      Query q = queryParser.parse("my query text");
+      CustomScoreQuery customQ = new CustomScoreQuery(q) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) (subQueryScore / Math.sqrt(doc));
+        }
+      };
+    </pre>
+        <p>
+        For more informative debug info on the custom query, also override the name() method:
+        <pre>
+      CustomScoreQuery customQ = new CustomScoreQuery(q) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) (subQueryScore / Math.sqrt(doc));
+        }
+        public String name() {
+          return "1/sqrt(docid)";
+        }
+      };
+    </pre>
+        <p>
+        Taking the square root of the original score and multiplying it by a "short field driven score", i.e., the
+        short value that was indexed for the scored doc in a certain field:
+        <pre>
+      Query q = queryParser.parse("my query text");
+      FieldScoreQuery qf = new FieldScoreQuery("shortScore", FieldScoreQuery.Type.SHORT);
+      CustomScoreQuery customQ = new CustomScoreQuery(q,qf) {
+        public float customScore(int doc, float subQueryScore, float valSrcScore) {
+          return (float) Math.sqrt(subQueryScore) * valSrcScore;
+        }
+        public String name() {
+          return "shortVal*sqrt(score)";
+        }
+      };
+    </pre>
+
+  </li>
+ </ol>
+</DIV>
+</BODY>
+</HTML>
\ No newline at end of file

Added: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ReverseOrdFieldSource.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Function/ReverseOrdFieldSource.cs?rev=671402&view=auto
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ReverseOrdFieldSource.cs (added)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ReverseOrdFieldSource.cs Tue Jun 24 19:51:24 2008
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+using System;
+
+using IndexReader = Lucene.Net.Index.IndexReader;
+using FieldCache = Lucene.Net.Search.FieldCache;
+
+namespace Lucene.Net.Search.Function
+{
+	
+	/// <summary> Expert: obtains the ordinal of the field value from the default Lucene 
+	/// {@link Lucene.Net.Search.FieldCache FieldCache} using getStringIndex()
+	/// and reverses the order.
+	/// <p>
+	/// The native lucene index order is used to assign an ordinal value for each field value.
+	/// <p>
+	/// Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1.
+	/// <br>
+	/// Example of reverse ordinal (rord):
+	/// <br>If there were only three field values: "apple","banana","pear"
+	/// <br>then rord("apple")=3, rord("banana")=2, rord("pear")=1
+	/// <p>
+	/// WARNING: 
+	/// rord() depends on the position in an index and can thus change 
+	/// when other documents are inserted or deleted,
+	/// or if a MultiSearcher is used. 
+	/// 
+	/// <p><font color="#FF0000">
+	/// WARNING: The status of the <b>search.function</b> package is experimental. 
+	/// The APIs introduced here might change in the future and will not be 
+	/// supported anymore in such a case.</font>
+	/// 
+	/// </summary>
+	/// <author>  yonik
+	/// </author>
+	
+	[Serializable]
+	public class ReverseOrdFieldSource : ValueSource
+	{
+		private class AnonymousClassDocValues : DocValues
+		{
+			public AnonymousClassDocValues(int end, int[] arr, ReverseOrdFieldSource enclosingInstance)
+			{
+				InitBlock(end, arr, enclosingInstance);
+			}
+			private void  InitBlock(int end, int[] arr, ReverseOrdFieldSource enclosingInstance)
+			{
+				this.end = end;
+				this.arr = arr;
+				this.enclosingInstance = enclosingInstance;
+			}
+			private int end;
+			private int[] arr;
+			private ReverseOrdFieldSource enclosingInstance;
+			public ReverseOrdFieldSource Enclosing_Instance
+			{
+				get
+				{
+					return enclosingInstance;
+				}
+				
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#floatVal(int) */
+			public override float FloatVal(int doc)
+			{
+				return (float) (end - arr[doc]);
+			}
+			/* (non-Javadoc) @see Lucene.Net.Search.Function.DocValues#intVal(int) */
+			public override int IntVal(int doc)
+			{
+				return end - arr[doc];
+			}
+			/* (non-Javadoc) @see Lucene.Net.Search.Function.DocValues#strVal(int) */
+			public override System.String StrVal(int doc)
+			{
+				// the string value of the ordinal, not the string itself
+				return System.Convert.ToString(IntVal(doc));
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#toString(int) */
+			public override System.String ToString(int doc)
+			{
+				return Enclosing_Instance.Description() + '=' + StrVal(doc);
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#getInnerArray() */
+			internal override System.Object GetInnerArray()
+			{
+				return arr;
+			}
+		}
+		public System.String field;
+		
+		/// <summary> Constructor for a certain field.</summary>
+		/// <param name="field">field whose values' reverse order is used.
+		/// </param>
+		public ReverseOrdFieldSource(System.String field)
+		{
+			this.field = field;
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.ValueSource#description() */
+		public override System.String Description()
+		{
+			return "rord(" + field + ')';
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.ValueSource#getValues(Lucene.Net.Index.IndexReader) */
+		public override DocValues GetValues(IndexReader reader)
+		{
+			Lucene.Net.Search.StringIndex sindex = Lucene.Net.Search.FieldCache_Fields.DEFAULT.GetStringIndex(reader, field);
+			
+			int[] arr = sindex.order;
+			int end = sindex.lookup.Length;
+			
+			return new AnonymousClassDocValues(end, arr, this);
+		}
+		
+		/*(non-Javadoc) @see java.lang.Object#equals(java.lang.Object) */
+		public  override bool Equals(System.Object o)
+		{
+			if (o.GetType() != typeof(ReverseOrdFieldSource))
+				return false;
+			ReverseOrdFieldSource other = (ReverseOrdFieldSource) o;
+			return this.field.Equals(other.field);
+		}
+		
+		private static readonly int hcode;
+		
+		/*(non-Javadoc) @see java.lang.Object#hashCode() */
+		public override int GetHashCode()
+		{
+			return hcode + field.GetHashCode();
+		}
+		static ReverseOrdFieldSource()
+		{
+			hcode = typeof(ReverseOrdFieldSource).GetHashCode();
+		}
+	}
+}
\ No newline at end of file
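
The ord/rord numbering described in the ReverseOrdFieldSource summary above can be illustrated with a minimal, self-contained Java sketch. It is not the Lucene.Net implementation: the class `RordSketch` and its methods are hypothetical, and it assumes (as the `end - arr[doc]` arithmetic and the "numbered starting at 1" note suggest) that slot 0 of the lookup table is reserved for documents without a value.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Lucene or Lucene.Net.
final class RordSketch {

    // Builds a lookup table of sorted terms. Slot 0 is left null
    // (assumption: reserved for "no value"), so real ordinals start at 1.
    static String[] buildLookup(String[] distinctTerms) {
        String[] sorted = distinctTerms.clone();
        Arrays.sort(sorted); // String order: lexicographic by UTF-16 code unit
        String[] lookup = new String[sorted.length + 1];
        System.arraycopy(sorted, 0, lookup, 1, sorted.length);
        return lookup;
    }

    // Ordinal of a term: its 1-based position in sorted order, or 0 if absent.
    static int ord(String[] lookup, String term) {
        for (int i = 1; i < lookup.length; i++) {
            if (lookup[i].equals(term)) {
                return i;
            }
        }
        return 0;
    }

    // Reverse ordinal: table length minus ordinal, mirroring "end - arr[doc]".
    static int rord(String[] lookup, String term) {
        return lookup.length - ord(lookup, term);
    }

    public static void main(String[] args) {
        String[] lookup = buildLookup(new String[] {"pear", "apple", "banana"});
        for (String t : new String[] {"apple", "banana", "pear"}) {
            System.out.println(t + " ord=" + ord(lookup, t) + " rord=" + rord(lookup, t));
        }
    }
}
```

For the three-term example, this sketch gives rord("apple")=3, rord("banana")=2 and rord("pear")=1. A term missing from the table gets ord 0 here, so its rord equals the table length.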

Added: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ShortFieldSource.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Function/ShortFieldSource.cs?rev=671402&view=auto
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ShortFieldSource.cs (added)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ShortFieldSource.cs Tue Jun 24 19:51:24 2008
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+using System;
+
+using IndexReader = Lucene.Net.Index.IndexReader;
+using FieldCache = Lucene.Net.Search.FieldCache;
+
+namespace Lucene.Net.Search.Function
+{
+	
+	/// <summary> Expert: obtains short field values from the 
+	/// {@link Lucene.Net.Search.FieldCache FieldCache}
+	/// using <code>getShorts()</code> and makes those values 
+	/// available as other numeric types, casting as needed.
+	/// 
+	/// <p><font color="#FF0000">
+	/// WARNING: The status of the <b>search.function</b> package is experimental. 
+	/// The APIs introduced here might change in the future and will not be 
+	/// supported anymore in such a case.</font>
+	/// 
+	/// </summary>
+	/// <seealso cref="Lucene.Net.Search.Function.FieldCacheSource">for requirements
+	/// on the field.
+	/// </seealso>
+	[Serializable]
+	public class ShortFieldSource : FieldCacheSource
+	{
+		private class AnonymousClassDocValues : DocValues
+		{
+			public AnonymousClassDocValues(short[] arr, ShortFieldSource enclosingInstance)
+			{
+				InitBlock(arr, enclosingInstance);
+			}
+			private void  InitBlock(short[] arr, ShortFieldSource enclosingInstance)
+			{
+				this.arr = arr;
+				this.enclosingInstance = enclosingInstance;
+			}
+			private short[] arr;
+			private ShortFieldSource enclosingInstance;
+			public ShortFieldSource Enclosing_Instance
+			{
+				get
+				{
+					return enclosingInstance;
+				}
+				
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#floatVal(int) */
+			public override float FloatVal(int doc)
+			{
+				return (float) arr[doc];
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#intVal(int) */
+			public override int IntVal(int doc)
+			{
+				return arr[doc];
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#toString(int) */
+			public override System.String ToString(int doc)
+			{
+				return Enclosing_Instance.Description() + '=' + IntVal(doc);
+			}
+			/*(non-Javadoc) @see Lucene.Net.Search.Function.DocValues#getInnerArray() */
+			internal override System.Object GetInnerArray()
+			{
+				return arr;
+			}
+		}
+		private Lucene.Net.Search.ShortParser parser;
+		
+		/// <summary> Create a cached short field source with default string-to-short parser. </summary>
+		public ShortFieldSource(System.String field) : this(field, null)
+		{
+		}
+		
+		/// <summary> Create a cached short field source with a specific string-to-short parser. </summary>
+		public ShortFieldSource(System.String field, Lucene.Net.Search.ShortParser parser) : base(field)
+		{
+			this.parser = parser;
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.ValueSource#description() */
+		public override System.String Description()
+		{
+			return "short(" + base.Description() + ')';
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.FieldCacheSource#getCachedValues(Lucene.Net.Search.FieldCache, java.lang.String, Lucene.Net.Index.IndexReader) */
+		public override DocValues GetCachedFieldValues(FieldCache cache, System.String field, IndexReader reader)
+		{
+			short[] arr = (parser == null) ? cache.GetShorts(reader, field) : cache.GetShorts(reader, field, parser);
+			return new AnonymousClassDocValues(arr, this);
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.FieldCacheSource#cachedFieldSourceEquals(Lucene.Net.Search.Function.FieldCacheSource) */
+		public override bool CachedFieldSourceEquals(FieldCacheSource o)
+		{
+			if (o.GetType() != typeof(ShortFieldSource))
+			{
+				return false;
+			}
+			ShortFieldSource other = (ShortFieldSource) o;
+			return this.parser == null ? other.parser == null : this.parser.GetType() == other.parser.GetType();
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Function.FieldCacheSource#cachedFieldSourceHashCode() */
+		public override int CachedFieldSourceHashCode()
+		{
+			return parser == null ? typeof(System.Int16).GetHashCode() : parser.GetType().GetHashCode();
+		}
+	}
+}
\ No newline at end of file

Added: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSource.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Function/ValueSource.cs?rev=671402&view=auto
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSource.cs (added)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSource.cs Tue Jun 24 19:51:24 2008
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+using System;
+
+using IndexReader = Lucene.Net.Index.IndexReader;
+
+namespace Lucene.Net.Search.Function
+{
+	
+	/// <summary> Expert: source of values for basic function queries.
+	/// <P>At its default/simplest form, values - one per doc - are used as the score of that doc.
+	/// <P>Values are instantiated as 
+	/// {@link Lucene.Net.Search.Function.DocValues DocValues} for a particular reader.
+	/// <P>ValueSource implementations differ in RAM requirements: it would always be a factor
+	/// of the number of documents, but for each document the number of bytes can be 1, 2, 4, or 8. 
+	/// 
+	/// <p><font color="#FF0000">
+	/// WARNING: The status of the <b>search.function</b> package is experimental. 
+	/// The APIs introduced here might change in the future and will not be 
+	/// supported anymore in such a case.</font>
+	/// 
+	/// 
+	/// </summary>
+	[Serializable]
+	public abstract class ValueSource
+	{
+		
+		/// <summary> Return the DocValues used by the function query.</summary>
+		/// <param name="reader">the IndexReader used to read these values.
+		/// If any caching is involved, that caching would also be IndexReader based.  
+		/// </param>
+		/// <throws>  IOException for any error. </throws>
+		public abstract DocValues GetValues(IndexReader reader);
+		
+		/// <summary> description of field, used in explain() </summary>
+		public abstract System.String Description();
+		
+		/* (non-Javadoc) @see java.lang.Object#toString() */
+		public override System.String ToString()
+		{
+			return Description();
+		}
+		
+		/// <summary> Needed for possible caching of query results - used by {@link ValueSourceQuery#equals(Object)}.</summary>
+		/// <seealso cref="Object.equals(Object)">
+		/// </seealso>
+		abstract public  override bool Equals(System.Object o);
+		
+		/// <summary> Needed for possible caching of query results - used by {@link ValueSourceQuery#hashCode()}.</summary>
+		/// <seealso cref="Object.hashCode()">
+		/// </seealso>
+		abstract public override int GetHashCode();
+	}
+}
\ No newline at end of file

Added: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSourceQuery.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Function/ValueSourceQuery.cs?rev=671402&view=auto
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSourceQuery.cs (added)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Function/ValueSourceQuery.cs Tue Jun 24 19:51:24 2008
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+using System;
+
+using IndexReader = Lucene.Net.Index.IndexReader;
+using ToStringUtils = Lucene.Net.Util.ToStringUtils;
+using Lucene.Net.Search;
+using Searchable = Lucene.Net.Search.Searchable;
+
+namespace Lucene.Net.Search.Function
+{
+	
+	/// <summary> Expert: A Query that sets the scores of documents to the
+	/// values obtained from a {@link Lucene.Net.Search.Function.ValueSource ValueSource}.
+	/// <p>
+	/// The value source can be based on a (cached) value of an indexed field, but it
+	/// can also be based on an external source, e.g. values read from an external database. 
+	/// <p>
+	/// Score is set as: Score(doc,query) = query.getBoost()<sup>2</sup> * valueSource(doc).  
+	/// 
+	/// <p><font color="#FF0000">
+	/// WARNING: The status of the <b>search.function</b> package is experimental. 
+	/// The APIs introduced here might change in the future and will not be 
+	/// supported anymore in such a case.</font>
+	/// 
+	/// </summary>
+	/// <author>  yonik
+	/// </author>
+	[Serializable]
+	public class ValueSourceQuery : Query
+	{
+		internal ValueSource valSrc;
+		
+		/// <summary> Create a value source query</summary>
+		/// <param name="valSrc">provides the values and defines the function to be used for scoring
+		/// </param>
+		public ValueSourceQuery(ValueSource valSrc)
+		{
+			this.valSrc = valSrc;
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Query#rewrite(Lucene.Net.Index.IndexReader) */
+		public override Query Rewrite(IndexReader reader)
+		{
+			return this;
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Query#extractTerms(java.util.Set) */
+		public override void  ExtractTerms(System.Collections.Hashtable terms)
+		{
+			// no terms involved here
+		}
+		
+		[Serializable]
+		private class ValueSourceWeight : Weight
+		{
+			private void  InitBlock(ValueSourceQuery enclosingInstance)
+			{
+				this.enclosingInstance = enclosingInstance;
+			}
+			private ValueSourceQuery enclosingInstance;
+			public ValueSourceQuery Enclosing_Instance
+			{
+				get
+				{
+					return enclosingInstance;
+				}
+				
+			}
+			internal Similarity similarity;
+			internal float queryNorm;
+			internal float queryWeight;
+			
+			public ValueSourceWeight(ValueSourceQuery enclosingInstance, Searcher searcher)
+			{
+				InitBlock(enclosingInstance);
+				this.similarity = Enclosing_Instance.GetSimilarity(searcher);
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#getQuery() */
+			public virtual Query GetQuery()
+			{
+				return Enclosing_Instance;
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#getValue() */
+			public virtual float GetValue()
+			{
+				return queryWeight;
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#sumOfSquaredWeights() */
+			public virtual float SumOfSquaredWeights()
+			{
+				queryWeight = Enclosing_Instance.GetBoost();
+				return queryWeight * queryWeight;
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#normalize(float) */
+			public virtual void  Normalize(float norm)
+			{
+				this.queryNorm = norm;
+				queryWeight *= this.queryNorm;
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#scorer(Lucene.Net.Index.IndexReader) */
+			public virtual Scorer Scorer(IndexReader reader)
+			{
+				return new ValueSourceScorer(enclosingInstance, similarity, reader, this);
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Weight#explain(Lucene.Net.Index.IndexReader, int) */
+			public virtual Explanation Explain(IndexReader reader, int doc)
+			{
+				return Scorer(reader).Explain(doc);
+			}
+		}
+		
+		/// <summary> A scorer that (simply) matches all documents, and scores each document with
+		/// the value of the value source in effect. As an example, if the value source
+		/// is a (cached) field source, then the value of that field in that document will
+		/// be used. (This assumes the field is indexed for this doc, with a single token.)
+		/// </summary>
+		private class ValueSourceScorer : Scorer
+		{
+			private void  InitBlock(ValueSourceQuery enclosingInstance)
+			{
+				this.enclosingInstance = enclosingInstance;
+			}
+			private ValueSourceQuery enclosingInstance;
+			public ValueSourceQuery Enclosing_Instance
+			{
+				get
+				{
+					return enclosingInstance;
+				}
+				
+			}
+			private IndexReader reader;
+			private ValueSourceWeight weight;
+			private int maxDoc;
+			private float qWeight;
+			private int doc = - 1;
+			private DocValues vals;
+			
+			// constructor
+			internal ValueSourceScorer(ValueSourceQuery enclosingInstance, Similarity similarity, IndexReader reader, ValueSourceWeight w) : base(similarity)
+			{
+				InitBlock(enclosingInstance);
+				this.weight = w;
+				this.qWeight = w.GetValue();
+				this.reader = reader;
+				this.maxDoc = reader.MaxDoc();
+				// this is when/where the values are first created.
+				vals = Enclosing_Instance.valSrc.GetValues(reader);
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Scorer#next() */
+			public override bool Next()
+			{
+				for (; ; )
+				{
+					++doc;
+					if (doc >= maxDoc)
+					{
+						return false;
+					}
+					if (reader.IsDeleted(doc))
+					{
+						continue;
+					}
+					return true;
+				}
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Scorer#doc()
+			*/
+			public override int Doc()
+			{
+				return doc;
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Scorer#score() */
+			public override float Score()
+			{
+				return qWeight * vals.FloatVal(doc);
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Scorer#skipTo(int) */
+			public override bool SkipTo(int target)
+			{
+				doc = target - 1;
+				return Next();
+			}
+			
+			/*(non-Javadoc) @see Lucene.Net.Search.Scorer#explain(int) */
+			public override Explanation Explain(int doc)
+			{
+				float sc = qWeight * vals.FloatVal(doc);
+				
+				Explanation result = new ComplexExplanation(true, sc, Enclosing_Instance.ToString() + ", product of:");
+				
+				result.AddDetail(vals.Explain(doc));
+				result.AddDetail(new Explanation(Enclosing_Instance.GetBoost(), "boost"));
+				result.AddDetail(new Explanation(weight.queryNorm, "queryNorm"));
+				return result;
+			}
+		}
+		
+		/*(non-Javadoc) @see Lucene.Net.Search.Query#createWeight(Lucene.Net.Search.Searcher) */
+		protected internal override Weight CreateWeight(Searcher searcher)
+		{
+			return new ValueSourceQuery.ValueSourceWeight(this, searcher);
+		}
+		
+		/* (non-Javadoc) @see Lucene.Net.Search.Query#toString(java.lang.String) */
+		public override System.String ToString(System.String field)
+		{
+			return valSrc.ToString() + ToStringUtils.Boost(GetBoost());
+		}
+		
+		/// <summary>Returns true if <code>o</code> is equal to this. </summary>
+		public  override bool Equals(System.Object o)
+		{
+			if (GetType() != o.GetType())
+			{
+				return false;
+			}
+			ValueSourceQuery other = (ValueSourceQuery) o;
+			return this.GetBoost() == other.GetBoost() && this.valSrc.Equals(other.valSrc);
+		}
+		
+		/// <summary>Returns a hash code value for this object. </summary>
+		public override int GetHashCode()
+		{
+			return (GetType().GetHashCode() + valSrc.GetHashCode()) ^ BitConverter.ToInt32(BitConverter.GetBytes(GetBoost()), 0);
+		}
+		override public System.Object Clone()
+		{
+			return null;    /// {{Aroush-2.3.1}} Do we need this Clone() method?
+		}
+	}
+}
\ No newline at end of file
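
How the boost reaches the final score in ValueSourceWeight and ValueSourceScorer above can be traced with a small Java sketch. The class name `ValueSourceWeightSketch` is hypothetical, and the way the caller derives the norm (1/sqrt of the sum of squared weights) is an assumption about the searcher, not something shown in this commit.

```java
// Hypothetical illustration only; it mirrors the arithmetic of the
// ValueSourceWeight/ValueSourceScorer pair, not their full API.
final class ValueSourceWeightSketch {
    private final float boost;
    private float queryNorm;
    private float queryWeight;

    ValueSourceWeightSketch(float boost) {
        this.boost = boost;
    }

    // Step 1: the weight starts as the query boost and reports its square.
    float sumOfSquaredWeights() {
        queryWeight = boost;
        return queryWeight * queryWeight;
    }

    // Step 2: the caller hands back a normalization factor.
    void normalize(float norm) {
        queryNorm = norm;
        queryWeight *= queryNorm;
    }

    // Step 3: each document's score is the normalized weight times its value.
    float score(float docValue) {
        return queryWeight * docValue;
    }

    public static void main(String[] args) {
        ValueSourceWeightSketch w = new ValueSourceWeightSketch(2.0f);
        float sum = w.sumOfSquaredWeights();
        // Assumption: the norm is 1/sqrt(sum), which may differ per Similarity.
        float norm = (float) (1.0 / Math.sqrt(sum));
        w.normalize(norm);
        System.out.println("score for a document whose value is 7: " + w.score(7.0f));
    }
}
```

Under that assumed norm, a lone top-level query with boost 2 ends up with a query weight of 1, so the document's value passes through unchanged. With a different norm, or when the query is one clause among several, the weight would differ.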

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyQuery.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/FuzzyQuery.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyQuery.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyQuery.cs Tue Jun 24 19:51:24 2008
@@ -108,24 +108,35 @@
 			FilteredTermEnum enumerator = GetEnum(reader);
 			int maxClauseCount = BooleanQuery.GetMaxClauseCount();
 			ScoreTermQueue stQueue = new ScoreTermQueue(maxClauseCount);
+			ScoreTerm reusableST = null;
 			
 			try
 			{
 				do 
 				{
-					float minScore = 0.0f;
 					float score = 0.0f;
 					Term t = enumerator.Term();
 					if (t != null)
 					{
 						score = enumerator.Difference();
-						// terms come in alphabetical order, therefore if queue is full and score
-						// not bigger than minScore, we can skip
-						if (stQueue.Size() < maxClauseCount || score > minScore)
+						if (reusableST == null)
 						{
-							stQueue.Insert(new ScoreTerm(t, score));
-							minScore = ((ScoreTerm) stQueue.Top()).score; // maintain minScore
+							reusableST = new ScoreTerm(t, score);
 						}
+						else if (score >= reusableST.score)
+						{
+							// reusableST holds the last "rejected" entry, so, if
+							// this new score is not better than that, there's no
+							// need to try inserting it
+							reusableST.score = score;
+							reusableST.term = t;
+						}
+						else
+						{
+							continue;
+						}
+						
+						reusableST = (ScoreTerm) stQueue.InsertWithOverflow(reusableST);
 					}
 				}
 				while (enumerator.Next());

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyTermEnum.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/FuzzyTermEnum.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyTermEnum.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/FuzzyTermEnum.cs Tue Jun 24 19:51:24 2008
@@ -61,12 +61,12 @@
 		/// valid term if such a term exists. 
 		/// 
 		/// </summary>
-		/// <param name="">reader
+		/// <param name="reader">
 		/// </param>
-		/// <param name="">term
+		/// <param name="term">
 		/// </param>
 		/// <throws>  IOException </throws>
-		/// <seealso cref="Term, float, int)">
+		/// <seealso cref="FuzzyTermEnum(IndexReader, Term, float, int)">
 		/// </seealso>
 		public FuzzyTermEnum(IndexReader reader, Term term) : this(reader, term, FuzzyQuery.defaultMinSimilarity, FuzzyQuery.defaultPrefixLength)
 		{
@@ -78,14 +78,14 @@
 		/// valid term if such a term exists. 
 		/// 
 		/// </summary>
-		/// <param name="">reader
+		/// <param name="reader">
 		/// </param>
-		/// <param name="">term
+		/// <param name="term">
 		/// </param>
-		/// <param name="">minSimilarity
+		/// <param name="minSimilarity">
 		/// </param>
 		/// <throws>  IOException </throws>
-		/// <seealso cref="Term, float, int)">
+		/// <seealso cref="FuzzyTermEnum(IndexReader, Term, float, int)">
 		/// </seealso>
 		public FuzzyTermEnum(IndexReader reader, Term term, float minSimilarity) : this(reader, term, minSimilarity, FuzzyQuery.defaultPrefixLength)
 		{
@@ -126,7 +126,7 @@
 			//The prefix could be longer than the word.
 			//It's kind of silly though.  It means we must match the entire word.
 			int fullSearchTermLength = searchTerm.Text().Length;
-			int realPrefixLength = prefixLength > fullSearchTermLength?fullSearchTermLength:prefixLength;
+			int realPrefixLength = prefixLength > fullSearchTermLength ? fullSearchTermLength : prefixLength;
 			
 			this.text = searchTerm.Text().Substring(realPrefixLength);
 			this.prefix = searchTerm.Text().Substring(0, (realPrefixLength) - (0));

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hit.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Hit.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hit.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hit.cs Tue Jun 24 19:51:24 2008
@@ -18,6 +18,7 @@
 using System;
 
 using Document = Lucene.Net.Documents.Document;
+using CorruptIndexException = Lucene.Net.Index.CorruptIndexException;
 
 namespace Lucene.Net.Search
 {
@@ -53,8 +54,10 @@
 		/// <summary> Returns document for this hit.
 		/// 
 		/// </summary>
-		/// <seealso cref="Hits#Doc(int)">
+		/// <seealso cref="Hits.Doc(int)">
 		/// </seealso>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public virtual Document GetDocument()
 		{
 			if (!resolved)
@@ -65,7 +68,7 @@
 		/// <summary> Returns score for this hit.
 		/// 
 		/// </summary>
-		/// <seealso cref="Hits#Score(int)">
+		/// <seealso cref="Hits.Score(int)">
 		/// </seealso>
 		public virtual float GetScore()
 		{
@@ -93,8 +96,10 @@
 		/// <summary> Returns the boost factor for this hit on any field of the underlying document.
 		/// 
 		/// </summary>
-		/// <seealso cref="Document#GetBoost()">
+		/// <seealso cref="Document.GetBoost()">
 		/// </seealso>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public virtual float GetBoost()
 		{
 			return GetDocument().GetBoost();
@@ -106,8 +111,10 @@
 		/// exist, returns null.
 		/// 
 		/// </summary>
-		/// <seealso cref="Document#Get(String)">
+		/// <seealso cref="Document.Get(String)">
 		/// </seealso>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public virtual System.String Get(System.String name)
 		{
 			return GetDocument().Get(name);

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/HitCollector.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/HitCollector.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/HitCollector.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/HitCollector.cs Tue Jun 24 19:51:24 2008
@@ -24,14 +24,14 @@
 	/// <br>HitCollectors are primarily meant to be used to implement queries,
 	/// sorting and filtering.
 	/// </summary>
-	/// <seealso cref="Searcher#Search(Query,HitCollector)">
+	/// <seealso cref="Searcher.Search(Query,HitCollector)">
 	/// </seealso>
-	/// <version>  $Id: HitCollector.java 472959 2006-11-09 16:21:50Z yonik $
+	/// <version>  $Id: HitCollector.java 596462 2007-11-19 22:03:22Z hossman $
 	/// </version>
 	public abstract class HitCollector
 	{
-		/// <summary>Called once for every non-zero scoring document, with the document number
-		/// and its score.
+		/// <summary>Called once for every document matching a query, with the document
+		/// number and its raw score.
 		/// 
 		/// <P>If, for example, an application wished to collect all of the hits for a
 		/// query in a BitSet, then it might:<pre>

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hits.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Hits.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hits.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Hits.cs Tue Jun 24 19:51:24 2008
@@ -18,11 +18,25 @@
 using System;
 
 using Document = Lucene.Net.Documents.Document;
+using CorruptIndexException = Lucene.Net.Index.CorruptIndexException;
 
 namespace Lucene.Net.Search
 {
 	
-	/// <summary>A ranked list of documents, used to hold search results. </summary>
+	/// <summary>A ranked list of documents, used to hold search results.
+	/// <p>
+	/// <b>Caution:</b> Iterate only over the hits needed.  Iterating over all
+	/// hits is generally not desirable and may be the source of
+	/// performance issues. If you need to iterate over many or all hits, consider
+	/// using the search method that takes a {@link HitCollector}.
+	/// </p>
+	/// <p><b>Note:</b> Deleting matching documents concurrently with traversing 
+	/// the hits, might, when deleting hits that were not yet retrieved, decrease
+	/// {@link #Length()}. In such case, 
+	/// {@link java.util.ConcurrentModificationException ConcurrentModificationException}
+	/// is thrown when accessing hit <code>n</code> &ge; current_{@link #Length()} 
+	/// (but <code>n</code> &lt; {@link #Length()}_at_start). 
+	/// </summary>
 	public sealed class Hits
 	{
 		private Weight weight;
@@ -38,12 +52,20 @@
 		private int numDocs = 0; // number cached
 		private int maxDocs = 200; // max to cache
 		
+		private int nDeletions; // # deleted docs in the index.    
+		private int lengthAtStart; // this is the number apps usually count on (although deletions can bring it down). 
+		private int nDeletedHits = 0; // # of already collected hits that were meanwhile deleted.
+		
+		internal bool debugCheckedForDeletions = false; // for test purposes.
+		
 		internal Hits(Searcher s, Query q, Filter f)
 		{
 			weight = q.Weight(s);
 			searcher = s;
 			filter = f;
+			nDeletions = CountDeletions(s);
 			GetMoreDocs(50); // retrieve 100 initially
+			lengthAtStart = length;
 		}
 		
 		internal Hits(Searcher s, Query q, Filter f, Sort o)
@@ -52,7 +74,20 @@
 			searcher = s;
 			filter = f;
 			sort = o;
+			nDeletions = CountDeletions(s);
 			GetMoreDocs(50); // retrieve 100 initially
+			lengthAtStart = length;
+		}
+		
+		// count # deletions, return -1 if unknown.
+		private int CountDeletions(Searcher s)
+		{
+			int cnt = - 1;
+			if (s is IndexSearcher)
+			{
+				cnt = s.MaxDoc() - ((IndexSearcher) s).GetIndexReader().NumDocs();
+			}
+			return cnt;
 		}
 		
 		/// <summary> Tries to add new documents to hitDocs.
@@ -67,6 +102,7 @@
 			
 			int n = min * 2; // double # retrieved
 			TopDocs topDocs = (sort == null) ? searcher.Search(weight, filter, n) : searcher.Search(weight, filter, n, sort);
+
 			length = topDocs.totalHits;
 			ScoreDoc[] scoreDocs = topDocs.scoreDocs;
 			
@@ -77,11 +113,41 @@
 				scoreNorm = 1.0f / topDocs.GetMaxScore();
 			}
 			
-			int end = scoreDocs.Length < length?scoreDocs.Length:length;
+			int start = hitDocs.Count - nDeletedHits;
+			
+			// any new deletions?
+			int nDels2 = CountDeletions(searcher);
+			debugCheckedForDeletions = false;
+			if (nDeletions < 0 || nDels2 > nDeletions)
+			{
+				// either we cannot count deletions, or some "previously valid hits" might have been deleted, so find exact start point
+				nDeletedHits = 0;
+				debugCheckedForDeletions = true;
+				int i2 = 0;
+				for (int i1 = 0; i1 < hitDocs.Count && i2 < scoreDocs.Length; i1++)
+				{
+					int id1 = ((HitDoc) hitDocs[i1]).id;
+					int id2 = scoreDocs[i2].doc;
+					if (id1 == id2)
+					{
+						i2++;
+					}
+					else
+					{
+						nDeletedHits++;
+					}
+				}
+				start = i2;
+			}
+			
+			int end = scoreDocs.Length < length ? scoreDocs.Length : length;
+			length += nDeletedHits;
 			for (int i = hitDocs.Count; i < end; i++)
 			{
 				hitDocs.Add(new HitDoc(scoreDocs[i].score * scoreNorm, scoreDocs[i].doc));
 			}
+			
+			nDeletions = nDels2;
 		}
 		
 		/// <summary>Returns the total number of hits available in this set. </summary>
@@ -92,8 +158,10 @@
 		
 		/// <summary>Returns the stored fields of the n<sup>th</sup> document in this set.
 		/// <p>Documents are cached, so that repeated requests for the same element may
-		/// return the same Document object. 
+		/// return the same Document object.
 		/// </summary>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public Document Doc(int n)
 		{
 			HitDoc hitDoc = HitDoc(n);
@@ -117,24 +185,28 @@
 			return hitDoc.doc;
 		}
 		
-		/// <summary>Returns the score for the nth document in this set. </summary>
+		/// <summary>Returns the score for the n<sup>th</sup> document in this set. </summary>
 		public float Score(int n)
 		{
 			return HitDoc(n).score;
 		}
 		
-		/// <summary>Returns the id for the nth document in this set. </summary>
+		/// <summary>Returns the id for the n<sup>th</sup> document in this set.
+		/// Note that ids may change when the index changes, so you cannot
+		/// rely on the id to be stable.
+		/// </summary>
 		public int Id(int n)
 		{
 			return HitDoc(n).id;
 		}
 		
 		/// <summary> Returns a {@link HitIterator} to navigate the Hits.  Each item returned
-		/// from {@link Iterator#Next()} is a {@link Hit}.
+		/// from {@link Iterator#next()} is a {@link Hit}.
 		/// <p>
 		/// <b>Caution:</b> Iterate only over the hits needed.  Iterating over all
 		/// hits is generally not desirable and may be the source of
-		/// performance issues.
+		/// performance issues. If you need to iterate over many or all hits, consider
+		/// using a search method that takes a {@link HitCollector}.
 		/// </p>
 		/// </summary>
 		public System.Collections.IEnumerator Iterator()
@@ -154,6 +226,11 @@
 				GetMoreDocs(n);
 			}
 			
+			if (n >= length)
+			{
+				throw new System.Exception("Not a valid hit number: " + n);
+			}
+			
 			return (HitDoc) hitDocs[n];
 		}
 		

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/IndexSearcher.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/IndexSearcher.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/IndexSearcher.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/IndexSearcher.cs Tue Jun 24 19:51:24 2008
@@ -17,10 +17,12 @@
 
 using System;
 
-using Directory = Lucene.Net.Store.Directory;
 using Document = Lucene.Net.Documents.Document;
+using FieldSelector = Lucene.Net.Documents.FieldSelector;
+using CorruptIndexException = Lucene.Net.Index.CorruptIndexException;
 using IndexReader = Lucene.Net.Index.IndexReader;
 using Term = Lucene.Net.Index.Term;
+using Directory = Lucene.Net.Store.Directory;
 
 namespace Lucene.Net.Search
 {
@@ -76,12 +78,16 @@
             get {   return reader;  }
         }
 
-        /// <summary>Creates a searcher searching the index in the named directory. </summary>
+		/// <summary>Creates a searcher searching the index in the named directory.</summary>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public IndexSearcher(System.String path) : this(IndexReader.Open(path), true)
 		{
 		}
 		
-		/// <summary>Creates a searcher searching the index in the provided directory. </summary>
+		/// <summary>Creates a searcher searching the index in the provided directory.</summary>
+		/// <throws>  CorruptIndexException if the index is corrupt </throws>
+		/// <throws>  IOException if there is a low-level IO error </throws>
 		public IndexSearcher(Directory directory) : this(IndexReader.Open(directory), true)
 		{
 		}
@@ -127,6 +133,12 @@
 		}
 		
 		// inherit javadoc
+		public override Document Doc(int i, FieldSelector fieldSelector)
+		{
+			return reader.Document(i, fieldSelector);
+		}
+		
+		// inherit javadoc
 		public override int MaxDoc()
 		{
 			return reader.MaxDoc();

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/MultiSearcher.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/MultiSearcher.cs?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/MultiSearcher.cs (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/MultiSearcher.cs Tue Jun 24 19:51:24 2008
@@ -18,6 +18,8 @@
 using System;
 
 using Document = Lucene.Net.Documents.Document;
+using FieldSelector = Lucene.Net.Documents.FieldSelector;
+using CorruptIndexException = Lucene.Net.Index.CorruptIndexException;
 using Term = Lucene.Net.Index.Term;
 
 namespace Lucene.Net.Search
@@ -67,10 +69,11 @@
 			private System.Collections.IDictionary dfMap; // Map from Terms to corresponding doc freqs
 			private int maxDoc; // document count
 			
-			public CachedDfSource(System.Collections.IDictionary dfMap, int maxDoc)
+			public CachedDfSource(System.Collections.IDictionary dfMap, int maxDoc, Similarity similarity)
 			{
 				this.dfMap = dfMap;
 				this.maxDoc = maxDoc;
+				SetSimilarity(similarity);
 			}
 			
 			public override int DocFreq(Term term)
@@ -121,6 +124,11 @@
 				throw new System.NotSupportedException();
 			}
 			
+			public override Document Doc(int i, FieldSelector fieldSelector)
+			{
+				throw new System.NotSupportedException();
+			}
+			
 			public override Explanation Explain(Weight weight, int doc)
 			{
 				throw new System.NotSupportedException();
@@ -143,7 +151,6 @@
 		}
 		
 		
-		
 		private Lucene.Net.Search.Searchable[] searchables;
 		private int[] starts;
 		private int maxDoc = 0;
@@ -195,6 +202,12 @@
 			return searchables[i].Doc(n - starts[i]); // dispatch to searcher
 		}
 		
+		// inherit javadoc
+		public override Document Doc(int n, FieldSelector fieldSelector)
+		{
+			int i = SubSearcher(n); // find searcher index
+			return searchables[i].Doc(n - starts[i], fieldSelector); // dispatch to searcher
+		}
 		
 		/// <summary>Returns index of the searcher for document <code>n</code> in the array
 		/// used to construct this searcher. 
@@ -384,7 +397,7 @@
 			
 			// step4
 			int numDocs = MaxDoc();
-			CachedDfSource cacheSim = new CachedDfSource(dfMap, numDocs);
+			CachedDfSource cacheSim = new CachedDfSource(dfMap, numDocs, GetSimilarity());
 			
 			return rewrittenQuery.Weight(cacheSim);
 		}

Modified: incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Package.html
URL: http://svn.apache.org/viewvc/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Search/Package.html?rev=671402&r1=671401&r2=671402&view=diff
==============================================================================
--- incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Package.html (original)
+++ incubator/lucene.net/trunk/C#/src/Lucene.Net/Search/Package.html Tue Jun 24 19:51:24 2008
@@ -1,358 +1,358 @@
-<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
-<html>
-<head>
-   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
-   <meta name="Author" content="Doug Cutting">
-   <meta content="Grant Ingersoll"  name="Author">
-</head>
-<body>
-<h2>Table Of Contents</h2>
-<p>
-    <ol>
-        <li><a href="#search">Search Basics</a></li>
-        <li><a href="#query">The Query Classes</a></li>
-        <li><a href="#scoring">Changing the Scoring</a></li>
-    </ol>
-</p>
-<a name="search"></a>
-<h2>Search</h2>
-<p>
-Search over indices.
-
-Applications usually call {@link
-Lucene.Net.Search.Searcher#search(Query)} or {@link
-Lucene.Net.Search.Searcher#search(Query,Filter)}.
-
-    <!-- FILL IN MORE HERE -->   
-</p>
-<a name="query"></a>
-<h2>Query Classes</h2>
-<h4>
-    <a href="TermQuery.html">TermQuery</a>
-</h4>
-
-<p>Of the various implementations of
-    <a href="Query.html">Query</a>, the
-    <a href="TermQuery.html">TermQuery</a>
-    is the easiest to understand and the most often used in applications. A <a
-        href="TermQuery.html">TermQuery</a> matches all the documents that contain the
-    specified
-    <a href="../index/Term.html">Term</a>,
-    which is a word that occurs in a certain
-    <a href="../document/Field.html">Field</a>.
-    Thus, a <a href="TermQuery.html">TermQuery</a> identifies and scores all
-    <a href="../document/Document.html">Document</a>s that have a <a
-        href="../document/Field.html">Field</a> with the specified string in it.
-    Constructing a <a
-        href="TermQuery.html">TermQuery</a>
-    is as simple as:
-    <pre>
-        TermQuery tq = new TermQuery(new Term("fieldName", "term"));
-    </pre>In this example, the <a href="Query.html">Query</a> identifies all <a
-        href="../document/Document.html">Document</a>s that have the <a
-        href="../document/Field.html">Field</a> named <tt>"fieldName"</tt> and
-    contain the word <tt>"term"</tt>.
-</p>
-<h4>
-    <a href="BooleanQuery.html">BooleanQuery</a>
-</h4>
-
-<p>Things start to get interesting when one combines multiple
-    <a href="TermQuery.html">TermQuery</a> instances into a <a
-        href="BooleanQuery.html">BooleanQuery</a>.
-    A <a href="BooleanQuery.html">BooleanQuery</a> contains multiple
-    <a href="BooleanClause.html">BooleanClause</a>s,
-    where each clause contains a sub-query (<a href="Query.html">Query</a>
-    instance) and an operator (from <a
-        href="BooleanClause.Occur.html">BooleanClause.Occur</a>)
-    describing how that sub-query is combined with the other clauses:
-    <ol>
-
-        <li><p>SHOULD -- Use this operator when a clause can occur in the result set, but is not required.
-            If a query is made up of all SHOULD clauses, then every document in the result
-            set matches at least one of these clauses.</p></li>
-
-        <li><p>MUST -- Use this operator when a clause is required to occur in the result set. Every
-            document in the result set will match
-            all such clauses.</p></li>
-
-        <li><p>MUST NOT -- Use this operator when a
-            clause must not occur in the result set. No
-            document in the result set will match
-            any such clauses.</p></li>
-    </ol>
-    Boolean queries are constructed by adding two or more
-    <a href="BooleanClause.html">BooleanClause</a>
-    instances. If too many clauses are added, a <a href="BooleanQuery.TooManyClauses.html">TooManyClauses</a>
-    exception will be thrown during searching. This most often occurs
-    when a <a href="Query.html">Query</a>
-    is rewritten into a <a href="BooleanQuery.html">BooleanQuery</a> with many
-    <a href="TermQuery.html">TermQuery</a> clauses,
-    for example by <a href="WildcardQuery.html">WildcardQuery</a>.
-    The default setting for the maximum number
-    of clauses is 1024, but this can be changed via the
-    static method <a href="BooleanQuery.html#setMaxClauseCount(int)">setMaxClauseCount</a>
-    in <a href="BooleanQuery.html">BooleanQuery</a>.
-</p>
-
-<h4>Phrases</h4>
-
-<p>Another common search is to find documents containing certain phrases. This
-    is handled in two different ways.
-    <ol>
-        <li>
-            <p><a href="PhraseQuery.html">PhraseQuery</a>
-                -- Matches a sequence of
-                <a href="../index/Term.html">Terms</a>.
-                <a href="PhraseQuery.html">PhraseQuery</a> uses a slop factor to determine
-                how many positions may occur between any two terms in the phrase and still be considered a match.</p>
-        </li>
-        <li>
-            <p><a href="spans/SpanNearQuery.html">SpanNearQuery</a>
-                -- Matches a sequence of other
-                <a href="spans/SpanQuery.html">SpanQuery</a>
-                instances. <a href="spans/SpanNearQuery.html">SpanNearQuery</a> allows for
-                much more
-                complicated phrase queries since it is constructed from other <a
-                    href="spans/SpanQuery.html">SpanQuery</a>
-                instances, instead of only <a href="TermQuery.html">TermQuery</a>
-                instances.</p>
-        </li>
-    </ol>
-</p>
-<h4>
-    <a href="RangeQuery.html">RangeQuery</a>
-</h4>
-
-<p>The
-    <a href="RangeQuery.html">RangeQuery</a>
-    matches all documents that occur in the
-    exclusive range of a lower
-    <a href="../index/Term.html">Term</a>
-    and an upper
-    <a href="../index/Term.html">Term</a>.
-    For example, one could find all documents
-    that have terms beginning with the letters <tt>a</tt> through <tt>c</tt>. This type of <a
-        href="Query.html">Query</a> is frequently used to
-    find
-    documents that occur in a specific date range.
-</p>
-<h4>
-    <a href="PrefixQuery.html">PrefixQuery</a>,
-    <a href="WildcardQuery.html">WildcardQuery</a>
-</h4>
-
-<p>While the
-    <a href="PrefixQuery.html">PrefixQuery</a>
-    has a different implementation, it is essentially a special case of the
-    <a href="WildcardQuery.html">WildcardQuery</a>.
-    The <a href="PrefixQuery.html">PrefixQuery</a> allows an application
-    to identify all documents with terms that begin with a certain string. The <a
-        href="WildcardQuery.html">WildcardQuery</a> generalizes this by allowing
-    for the use of <tt>*</tt> (matches 0 or more characters) and <tt>?</tt> (matches exactly one character) wildcards.
-    Note that the <a href="WildcardQuery.html">WildcardQuery</a> can be quite slow. Also
-    note that
-    <a href="WildcardQuery.html">WildcardQuery</a> should
-    not start with <tt>*</tt> or <tt>?</tt>, as such queries are extremely slow. For tricks on how to search using a wildcard
-    at
-    the beginning of a term, see
-    <a href="http://www.gossamer-threads.com/lists/lucene/java-user/13373#13373">
-        Starts With x and Ends With x Queries</a>
-    from the Lucene users' mailing list.
-</p>
-<h4>
-    <a href="FuzzyQuery.html">FuzzyQuery</a>
-</h4>
-
-<p>A
-    <a href="FuzzyQuery.html">FuzzyQuery</a>
-    matches documents that contain terms similar to the specified term. Similarity is
-    determined using
-    <a href="http://en.wikipedia.org/wiki/Levenshtein">Levenshtein (edit) distance</a>.
-    This type of query can be useful when accounting for spelling variations in the collection.
-</p>
-<a name="changingSimilarity"></a>
-<h2>Changing Similarity</h2>
-
-<p>Chances are <a href="DefaultSimilarity.html">DefaultSimilarity</a> is sufficient for all
-    your searching needs.
-    However, in some applications it may be necessary to customize your <a
-        href="Similarity.html">Similarity</a> implementation. For instance, some
-    applications do not need to
-    distinguish between shorter and longer documents (see <a
-        href="http://www.gossamer-threads.com/lists/lucene/java-user/38967#38967">a "fair" similarity</a>).</p>
-
-<p>To change <a href="Similarity.html">Similarity</a>, one must do so for both indexing and
-    searching, and the changes must happen before
-    either of these actions takes place. Although in theory there is nothing stopping you from changing mid-stream, it
-    just isn't well-defined what is going to happen.
-</p>
-
-<p>To make this change, implement your own <a href="Similarity.html">Similarity</a> (likely
-    you'll want to simply subclass
-    <a href="DefaultSimilarity.html">DefaultSimilarity</a>) and then use the new
-    class by calling
-    <a href="../index/IndexWriter.html#setSimilarity(Lucene.Net.Search.Similarity)">IndexWriter.setSimilarity</a>
-    before indexing and
-    <a href="Searcher.html#setSimilarity(Lucene.Net.Search.Similarity)">Searcher.setSimilarity</a>
-    before searching.
-</p>
-
-<p>
-    If you are interested in use cases for changing your similarity, see the Lucene users' mailing list at <a
-        href="http://www.nabble.com/Overriding-Similarity-tf2128934.html">Overriding Similarity</a>.
-    In summary, here are a few use cases:
-    <ol>
-        <li><p><a href="api/org/apache/lucene/misc/SweetSpotSimilarity.html">SweetSpotSimilarity</a> -- <a
-                href="api/org/apache/lucene/misc/SweetSpotSimilarity.html">SweetSpotSimilarity</a> gives small increases
-            as the frequency increases a small amount
-            and then greater increases when you hit the "sweet spot", i.e. where you think the frequency of terms is
-            more significant.</p></li>
-        <li><p>Overriding tf -- In some applications, it doesn't matter what the score of a document is as long as a
-            matching term occurs. In these
-            cases people have overridden Similarity to return 1 from the tf() method.</p></li>
-        <li><p>Changing Length Normalization -- By overriding <a
-                href="Similarity.html#lengthNorm(java.lang.String,%20int)">lengthNorm</a>,
-            it is possible to discount how the length of a field contributes
-            to a score. In <a href="DefaultSimilarity.html">DefaultSimilarity</a>,
-            lengthNorm = 1 / (numTerms in field)^0.5, but if one changes this to be
-            1 / (numTerms in field), all fields will be treated
-            <a href="http://www.gossamer-threads.com/lists/lucene/java-user/38967#38967">"fairly"</a>.</p></li>
-    </ol>
-    In general, Chris Hostetter sums it up best in saying (from <a
-        href="http://www.gossamer-threads.com/lists/lucene/java-user/39125#39125">the Lucene users' mailing list</a>):
-    <blockquote>[One would override the Similarity in] ... any situation where you know more about your data then just
-        that
-        it's "text" is a situation where it *might* make sense to to override your
-        Similarity method.</blockquote>
-</p>
-<a name="scoring"></a>
-<h2>Changing Scoring -- Expert Level</h2>
-
-<p>Changing scoring is an expert level task, so tread carefully and be prepared to share your code if
-    you want help.
-</p>
-
-<p>With the warning out of the way, it is possible to change a lot more than just the Similarity
-    when it comes to scoring in Lucene. Lucene's scoring is a complex mechanism that is grounded by
-    <span >three main classes</span>:
-    <ol>
-        <li>
-            <a href="Query.html">Query</a> -- The abstract object representation of the
-            user's information need.</li>
-        <li>
-            <a href="Weight.html">Weight</a> -- The internal interface representation of
-            the user's Query, so that Query objects may be reused.</li>
-        <li>
-            <a href="Scorer.html">Scorer</a> -- An abstract class containing common
-            functionality for scoring. Provides both scoring and explanation capabilities.</li>
-    </ol>
-    Details on each of these classes, and their children can be found in the subsections below.
-</p>
-<h4>The Query Class</h4>
-    <p>In some sense, the
-        <a href="Query.html">Query</a>
-        class is where it all begins. Without a Query, there would be
-        nothing to score. Furthermore, the Query class is the catalyst for the other scoring classes as it
-        is often responsible
-        for creating them or coordinating the functionality between them. The
-        <a href="Query.html">Query</a> class has several methods that are important for
-        derived classes:
-        <ol>
-            <li>createWeight(Searcher searcher) -- A
-                <a href="Weight.html">Weight</a> is the internal representation of the
-                Query, so each Query implementation must
-                provide an implementation of Weight. See the subsection on <a
-                    href="#The Weight Interface">The Weight Interface</a> below for details on implementing the Weight
-                interface.</li>
-            <li>rewrite(IndexReader reader) -- Rewrites queries into primitive queries. Primitive queries are:
-                <a href="TermQuery.html">TermQuery</a>,
-                <a href="BooleanQuery.html">BooleanQuery</a>, <span
-                    >OTHERS????</span></li>
-        </ol>
-    </p>
-<h4>The Weight Interface</h4>
-    <p>The
-        <a href="Weight.html">Weight</a>
-        interface provides an internal representation of the Query so that it can be reused. Any
-        <a href="Searcher.html">Searcher</a>
-        dependent state should be stored in the Weight implementation,
-        not in the Query class. The interface defines 6 methods that must be implemented:
-        <ol>
-            <li>
-                <a href="Weight.html#getQuery()">Weight#getQuery()</a> -- Pointer to the
-                Query that this Weight represents.</li>
-            <li>
-                <a href="Weight.html#getValue()">Weight#getValue()</a> -- The weight for
-                this Query. For example, the TermQuery.TermWeight value is
-                equal to the idf^2 * boost * queryNorm <!-- DOUBLE CHECK THIS --></li>
-            <li>
-                <a href="Weight.html#sumOfSquaredWeights()">
-                    Weight#sumOfSquaredWeights()</a> -- The sum of squared weights. For TermQuery, this is (idf *
-                boost)^2</li>
-            <li>
-                <a href="Weight.html#normalize(float)">
-                    Weight#normalize(float)</a> -- Determine the query normalization factor. The query normalization may
-                allow for comparing scores between queries.</li>
-            <li>
-                <a href="Weight.html#scorer(IndexReader)">
-                    Weight#scorer(IndexReader)</a> -- Construct a new
-                <a href="Scorer.html">Scorer</a>
-                for this Weight. See
-                <a href="#The Scorer Class">The Scorer Class</a>
-                below for help defining a Scorer. As the name implies, the
-                Scorer is responsible for doing the actual scoring of documents given the Query.
-            </li>
-            <li>
-                <a href="Weight.html#explain(IndexReader, int)">
-                    Weight#explain(IndexReader, int)</a> -- Provide a means for explaining why a given document was
-                scored
-                the way it was.</li>
-        </ol>
-    </p>
-<h4>The Scorer Class</h4>
-    <p>The
-        <a href="Scorer.html">Scorer</a>
-        abstract class provides common scoring functionality for all Scorer implementations and
-        is the heart of the Lucene scoring process. The Scorer defines the following abstract methods which
-        must be implemented:
-        <ol>
-            <li>
-                <a href="Scorer.html#next()">Scorer#next()</a> -- Advances to the next
-                document that matches this Query, returning true if and only
-                if there is another document that matches.</li>
-            <li>
-                <a href="Scorer.html#doc()">Scorer#doc()</a> -- Returns the id of the
-                <a href="../document/Document.html">Document</a>
-                that contains the match. Is not valid until next() has been called at least once.
-            </li>
-            <li>
-                <a href="Scorer.html#score()">Scorer#score()</a> -- Return the score of the
-                current document. This value can be determined in any
-                appropriate way for an application. For instance, the
-                <a href="http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/TermScorer.java?view=log">TermScorer</a>
-                returns the tf * Weight.getValue() * fieldNorm.
-            </li>
-            <li>
-                <a href="Scorer.html#skipTo(int)">Scorer#skipTo(int)</a> -- Skip ahead in
-                the document matches to the document whose id is greater than
-                or equal to the passed in value. In many instances, skipTo can be
-                implemented more efficiently than simply looping through all the matching documents until
-                the target document is identified.</li>
-            <li>
-                <a href="Scorer.html#explain(int)">Scorer#explain(int)</a> -- Provides
-                details on why the score came about.</li>
-        </ol>
-    </p>
-<h4>Why would I want to add my own Query?</h4>
-
-    <p>In a nutshell, you want to add your own custom Query implementation when you think that Lucene's
-        aren't appropriate for the
-        task that you want to do. You might be doing some cutting edge research or you need more information
-        back
-        out of Lucene (similar to Doug adding SpanQuery functionality).</p>
-<h4>Examples</h4>
-    <p >FILL IN HERE</p>
-
-</body>
-</html>
+<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
+<html>
+<head>
+   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
+   <meta name="Author" content="Doug Cutting">
+   <meta content="Grant Ingersoll"  name="Author">
+</head>
+<body>
+Code to search indices.
+
+<h2>Table Of Contents</h2>
+<p>
+    <ol>
+        <li><a href = "#search">Search Basics</a></li>
+        <li><a href = "#query">The Query Classes</a></li>
+        <li><a href = "#scoring">Changing the Scoring</a></li>
+    </ol>
+</p>
+<a name = "search"></a>
+<h2>Search</h2>
+<p>
+Search over indices.
+
+Applications usually call {@link
+org.apache.lucene.search.Searcher#search(Query)} or {@link
+org.apache.lucene.search.Searcher#search(Query,Filter)}.
+
+    <!-- FILL IN MORE HERE -->   
+</p>
+<a name = "query"></a>
+<h2>Query Classes</h2>
+<h4>
+    <a href = "TermQuery.html">TermQuery</a>
+</h4>
+
+<p>Of the various implementations of
+    <a href = "Query.html">Query</a>, the
+    <a href = "TermQuery.html">TermQuery</a>
+    is the easiest to understand and the most often used in applications. A <a
+        href="TermQuery.html">TermQuery</a> matches all the documents that contain the
+    specified
+    <a href = "index/Term.html">Term</a>,
+    which is a word that occurs in a certain
+    <a href = "document/Field.html">Field</a>.
+    Thus, a <a href = "TermQuery.html">TermQuery</a> identifies and scores all
+    <a href = "document/Document.html">Document</a>s that have a <a
+        href="../document/Field.html">Field</a> with the specified string in it.
+    Constructing a <a
+        href="TermQuery.html">TermQuery</a>
+    is as simple as:
+    <pre>
+        TermQuery tq = new TermQuery(new Term("fieldName", "term"));
+    </pre>In this example, the <a href = "Query.html">Query</a> identifies all <a
+        href="../document/Document.html">Document</a>s that have the <a
+        href="../document/Field.html">Field</a> named <tt>"fieldName"</tt>
+    containing the word <tt>"term"</tt>.
+</p>
+<h4>
+    <a href = "BooleanQuery.html">BooleanQuery</a>
+</h4>
+
+<p>Things start to get interesting when one combines multiple
+    <a href = "TermQuery.html">TermQuery</a> instances into a <a
+        href="BooleanQuery.html">BooleanQuery</a>.
+    A <a href = "BooleanQuery.html">BooleanQuery</a> contains multiple
+    <a href = "BooleanClause.html">BooleanClause</a>s,
+    where each clause contains a sub-query (<a href = "Query.html">Query</a>
+    instance) and an operator (from <a
+        href="BooleanClause.Occur.html">BooleanClause.Occur</a>)
+    describing how that sub-query is combined with the other clauses:
+    <ol>
+
+        <li><p>SHOULD &mdash; Use this operator when a clause can occur in the result set, but is not required.
+            If a query is made up of all SHOULD clauses, then every document in the result
+            set matches at least one of these clauses.</p></li>
+
+        <li><p>MUST &mdash; Use this operator when a clause is required to occur in the result set. Every
+            document in the result set will match
+            all such clauses.</p></li>
+
+        <li><p>MUST NOT &mdash; Use this operator when a
+            clause must not occur in the result set. No
+            document in the result set will match
+            any such clauses.</p></li>
+    </ol>
+    Boolean queries are constructed by adding two or more
+    <a href = "BooleanClause.html">BooleanClause</a>
+    instances. If too many clauses are added, a <a href = "BooleanQuery.TooManyClauses.html">TooManyClauses</a>
+    exception will be thrown during searching. This most often occurs
+    when a <a href = "Query.html">Query</a>
+    is rewritten into a <a href = "BooleanQuery.html">BooleanQuery</a> with many
+    <a href = "TermQuery.html">TermQuery</a> clauses,
+    for example by <a href = "WildcardQuery.html">WildcardQuery</a>.
+    The default setting for the maximum number
+    of clauses is 1024, but this can be changed via the
+    static method <a href = "BooleanQuery.html#setMaxClauseCount(int)">setMaxClauseCount</a>
+    in <a href = "BooleanQuery.html">BooleanQuery</a>.
+</p>
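+<p>The three operators above can be sketched without Lucene at all. The following is a minimal model, not Lucene's implementation: <tt>BooleanClauseSketch</tt>, <tt>Clause</tt> and <tt>matches</tt> are hypothetical names, each sub-query is reduced to a single term, and no minimum-should-match setting is assumed.</p>

```java
final class BooleanClauseSketch {
    // Mirrors the three operators described above (hypothetical stand-in for BooleanClause.Occur).
    enum Occur { SHOULD, MUST, MUST_NOT }

    // Hypothetical stand-in for a BooleanClause whose sub-query is a single term.
    static final class Clause {
        final String term;
        final Occur occur;
        Clause(String term, Occur occur) { this.term = term; this.occur = occur; }
    }

    private static boolean contains(String[] docTerms, String term) {
        for (String t : docTerms) {
            if (t.equals(term)) return true;
        }
        return false;
    }

    // Returns true if a document holding docTerms satisfies every clause.
    static boolean matches(Clause[] clauses, String[] docTerms) {
        boolean hasMust = false;
        boolean anyShouldHit = false;
        for (Clause c : clauses) {
            boolean hit = contains(docTerms, c.term);
            switch (c.occur) {
                case MUST:
                    hasMust = true;
                    if (!hit) return false;   // a required clause is missing
                    break;
                case MUST_NOT:
                    if (hit) return false;    // a prohibited clause is present
                    break;
                case SHOULD:
                    anyShouldHit |= hit;      // optional, but remembered
                    break;
            }
        }
        // With no MUST clauses, at least one SHOULD clause has to match.
        return hasMust || anyShouldHit;
    }
}
```

+<p>In this model a query made up only of MUST NOT clauses matches nothing, because there is no positive clause to select documents in the first place.</p>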
+
+<h4>Phrases</h4>
+
+<p>Another common search is to find documents containing certain phrases. This
+    is handled two different ways:
+    <ol>
+        <li>
+            <p><a href = "PhraseQuery.html">PhraseQuery</a>
+                &mdash; Matches a sequence of
+                <a href = "index/Term.html">Terms</a>.
+                <a href = "PhraseQuery.html">PhraseQuery</a> uses a slop factor to determine
+                how many positions may occur between any two terms in the phrase and still be considered a match.</p>
+        </li>
+        <li>
+            <p><a href = "spans/SpanNearQuery.html">SpanNearQuery</a>
+                &mdash; Matches a sequence of other
+                <a href = "spans/SpanQuery.html">SpanQuery</a>
+                instances. <a href = "spans/SpanNearQuery.html">SpanNearQuery</a> allows for
+                much more
+                complicated phrase queries since it is constructed from other <a
+                    href="spans/SpanQuery.html">SpanQuery</a>
+                instances, instead of only <a href = "TermQuery.html">TermQuery</a>
+                instances.</p>
+        </li>
+    </ol>
+</p>
+<h4>
+    <a href = "RangeQuery.html">RangeQuery</a>
+</h4>
+
+<p>The
+    <a href = "RangeQuery.html">RangeQuery</a>
+    matches all documents that occur in the
+    range (exclusive or inclusive, as chosen when the query is constructed) between a lower
+    <a href = "index/Term.html">Term</a>
+    and an upper
+    <a href = "index/Term.html">Term</a>.
+    For example, one could find all documents
+    that have terms beginning with the letters <tt>a</tt> through <tt>c</tt>. This type of <a
+        href="Query.html">Query</a> is frequently used to
+    find
+    documents that occur in a specific date range.
+</p>
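+<p>Because terms are compared as strings, a range only behaves as expected when the string order agrees with the intended order. The sketch below illustrates this with plain <tt>String.compareTo</tt>; <tt>TermRangeSketch</tt> and <tt>inRange</tt> are hypothetical names, and the <tt>inclusive</tt> parameter is an assumption of the sketch rather than a description of Lucene's internals.</p>

```java
final class TermRangeSketch {
    // True if term lies between lower and upper under String.compareTo ordering.
    // inclusive decides whether the two endpoints themselves count as matches.
    static boolean inRange(String term, String lower, String upper, boolean inclusive) {
        int cmpLower = term.compareTo(lower);
        int cmpUpper = term.compareTo(upper);
        return inclusive
                ? (cmpLower >= 0 && cmpUpper <= 0)
                : (cmpLower > 0 && cmpUpper < 0);
    }
}
```

+<p>Dates written as <tt>yyyyMMdd</tt> sort correctly as strings, so <tt>inRange("20080615", "20080101", "20081231", false)</tt> is true. Unpadded numbers do not: <tt>"9"</tt> sorts after <tt>"20"</tt>, so it falls outside the range <tt>"10"</tt> to <tt>"20"</tt>.</p>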
+<h4>
+    <a href = "PrefixQuery.html">PrefixQuery</a>,
+    <a href = "WildcardQuery.html">WildcardQuery</a>
+</h4>
+
+<p>While the
+    <a href = "PrefixQuery.html">PrefixQuery</a>
+    has a different implementation, it is essentially a special case of the
+    <a href = "WildcardQuery.html">WildcardQuery</a>.
+    The <a href = "PrefixQuery.html">PrefixQuery</a> allows an application
+    to identify all documents with terms that begin with a certain string. The <a
+        href="WildcardQuery.html">WildcardQuery</a> generalizes this by allowing
+    for the use of <tt>*</tt> (matches 0 or more characters) and <tt>?</tt> (matches exactly one character) wildcards.
+    Note that the <a href = "WildcardQuery.html">WildcardQuery</a> can be quite slow. Also
+    note that
+    <a href = "WildcardQuery.html">WildcardQuery</a> should
+    not start with <tt>*</tt> or <tt>?</tt>, as such patterns are extremely slow. 
+	To remove this protection and allow a wildcard at the beginning of a term, see method
+	<a href = "queryParser/QueryParser.html#setAllowLeadingWildcard(boolean)">setAllowLeadingWildcard</a> in 
+	<a href = "queryParser/QueryParser.html">QueryParser</a>.
+</p>
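+<p>The wildcard semantics described above can be sketched as a small matcher. This is an illustration only, not the code Lucene uses to enumerate terms; <tt>WildcardSketch</tt> and <tt>matches</tt> are hypothetical names.</p>

```java
final class WildcardSketch {
    // Returns true if text matches pattern, where '*' matches zero or more
    // characters and '?' matches exactly one character.
    static boolean matches(String pattern, String text) {
        int p = 0;          // position in pattern
        int t = 0;          // position in text
        int star = -1;      // index of the most recent '*' in pattern, or -1
        int mark = 0;       // text position where that '*' started matching
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;             // remember the star, try matching zero chars
                mark = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;                    // single-character match
                t++;
            } else if (star >= 0) {
                p = star + 1;           // backtrack: let the star absorb one more char
                t = ++mark;
            } else {
                return false;
            }
        }
        // Any trailing stars can match the empty string.
        while (p < pattern.length() && pattern.charAt(p) == '*') p++;
        return p == pattern.length();
    }
}
```

+<p>Under this model a prefix search for <tt>lu</tt> is the same as the wildcard pattern <tt>lu*</tt>, which is the sense in which PrefixQuery is a special case of WildcardQuery.</p>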
+<h4>
+    <a href = "FuzzyQuery.html">FuzzyQuery</a>
+</h4>
+
+<p>A
+    <a href = "FuzzyQuery.html">FuzzyQuery</a>
+    matches documents that contain terms similar to the specified term. Similarity is
+    determined using
+    <a href = "http://en.wikipedia.org//wiki/Levenshtein">Levenshtein (edit) distance</a>.
+    This type of query can be useful when accounting for spelling variations in the collection.
+</p>
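+<p>For readers unfamiliar with edit distance, here is a minimal sketch of the textbook Levenshtein computation. <tt>EditDistanceSketch</tt> is a hypothetical name, and how FuzzyQuery turns a distance into a similarity threshold is deliberately not shown here.</p>

```java
final class EditDistanceSketch {
    // Number of single-character insertions, deletions and substitutions
    // needed to turn a into b (classic dynamic-programming formulation).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full table
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

+<p>For example, <tt>color</tt> and <tt>colour</tt> are one edit apart, which is the kind of spelling variation a fuzzy search is meant to absorb.</p>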
+<a name = "changingSimilarity"></a>
+<h2>Changing Similarity</h2>
+
+<p>Chances are <a href = "DefaultSimilarity.html">DefaultSimilarity</a> is sufficient for all
+    your searching needs.
+    However, in some applications it may be necessary to customize your <a
+        href="Similarity.html">Similarity</a> implementation. For instance, some
+    applications do not need to
+    distinguish between shorter and longer documents (see <a
+        href="http://www.gossamer-threads.com/lists/lucene/java-user/38967#38967">a "fair" similarity</a>).</p>
+
+<p>To change <a href = "Similarity.html">Similarity</a>, one must do so for both indexing and
+    searching, and the changes must happen before
+    either of these actions takes place. Although in theory there is nothing stopping you from changing mid-stream, it
+    just isn't well-defined what is going to happen.
+</p>
+
+<p>To make this change, implement your own <a href = "Similarity.html">Similarity</a> (likely
+    you'll want to simply subclass
+    <a href = "DefaultSimilarity.html">DefaultSimilarity</a>) and then use the new
+    class by calling
+    <a href = "index/IndexWriter.html#setSimilarity(org.apache.lucene.search.Similarity)">IndexWriter.setSimilarity</a>
+    before indexing and
+    <a href = "Searcher.html#setSimilarity(org.apache.lucene.search.Similarity)">Searcher.setSimilarity</a>
+    before searching.
+</p>
+
+<p>
+    If you are interested in use cases for changing your similarity, see the Lucene users' mailing list at <a
+        href="http://www.nabble.com/Overriding-Similarity-tf2128934.html">Overriding Similarity</a>.
+    In summary, here are a few use cases:
+    <ol>
+        <li><p><a href = "api/org/apache/lucene/misc/SweetSpotSimilarity.html">SweetSpotSimilarity</a> &mdash; <a
+                href="api/org/apache/lucene/misc/SweetSpotSimilarity.html">SweetSpotSimilarity</a> gives small increases
+            as the frequency increases a small amount
+            and then greater increases when you hit the "sweet spot", i.e. where you think the frequency of terms is
+            more significant.</p></li>
+        <li><p>Overriding tf &mdash; In some applications, it doesn't matter what the score of a document is as long as a
+            matching term occurs. In these
+            cases people have overridden Similarity to return 1 from the tf() method.</p></li>
+        <li><p>Changing Length Normalization &mdash; By overriding <a
+                href="Similarity.html#lengthNorm(java.lang.String,%20int)">lengthNorm</a>,
+            it is possible to discount how the length of a field contributes
+            to a score. In <a href = "DefaultSimilarity.html">DefaultSimilarity</a>,
+            lengthNorm = 1 / (numTerms in field)^0.5, but if one changes this to be
+            1 / (numTerms in field), all fields will be treated
+            <a href = "http://www.gossamer-threads.com//lists/lucene/java-user/38967#38967">"fairly"</a>.</p></li>
+    </ol>
+    In general, Chris Hostetter sums it up best in saying (from <a
+        href="http://www.gossamer-threads.com/lists/lucene/java-user/39125#39125">the Lucene users' mailing list</a>):
+    <blockquote>[One would override the Similarity in] ... any situation where you know more about your data then just
+        that
+        it's "text" is a situation where it *might* make sense to to override your
+        Similarity method.</blockquote>
+</p>
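+<p>The arithmetic behind two of the use cases above can be sketched in a few lines. The class below does not extend Lucene's Similarity; <tt>LengthNormSketch</tt> and its method names are hypothetical and only restate the formulas given in the list.</p>

```java
final class LengthNormSketch {
    // The default described above: 1 / (numTerms in field)^0.5
    static double defaultLengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // The alternative described above: 1 / (numTerms in field)
    static double fairLengthNorm(int numTerms) {
        return 1.0 / numTerms;
    }

    // "Overriding tf": any matching term contributes 1, regardless of frequency.
    static double constantTf(double freq) {
        return freq > 0 ? 1.0 : 0.0;
    }
}
```

+<p>For a field of 100 terms the first formula gives 0.1 and the second gives 0.01, so the second penalizes long fields much more strongly as they grow.</p>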
+<a name = "scoring"></a>
+<h2>Changing Scoring &mdash; Expert Level</h2>
+
+<p>Changing scoring is an expert level task, so tread carefully and be prepared to share your code if
+    you want help.
+</p>
+
+<p>With the warning out of the way, it is possible to change a lot more than just the Similarity
+    when it comes to scoring in Lucene. Lucene's scoring is a complex mechanism that is grounded by
+    <span >three main classes</span>:
+    <ol>
+        <li>
+            <a href = "Query.html">Query</a> &mdash; The abstract object representation of the
+            user's information need.</li>
+        <li>
+            <a href = "Weight.html">Weight</a> &mdash; The internal interface representation of
+            the user's Query, so that Query objects may be reused.</li>
+        <li>
+            <a href = "Scorer.html">Scorer</a> &mdash; An abstract class containing common
+            functionality for scoring. Provides both scoring and explanation capabilities.</li>
+    </ol>
+    Details on each of these classes, and their children, can be found in the subsections below.
+</p>
+<h4>The Query Class</h4>
+    <p>In some sense, the
+        <a href = "Query.html">Query</a>
+        class is where it all begins. Without a Query, there would be
+        nothing to score. Furthermore, the Query class is the catalyst for the other scoring classes as it
+        is often responsible
+        for creating them or coordinating the functionality between them. The
+        <a href = "Query.html">Query</a> class has several methods that are important for
+        derived classes:
+        <ol>
+            <li>createWeight(Searcher searcher) &mdash; A
+                <a href = "Weight.html">Weight</a> is the internal representation of the
+                Query, so each Query implementation must
+                provide an implementation of Weight. See the subsection on <a
+                    href="#The Weight Interface">The Weight Interface</a> below for details on implementing the Weight
+                interface.</li>
+            <li>rewrite(IndexReader reader) &mdash; Rewrites queries into primitive queries. Primitive queries are:
+                <a href = "TermQuery.html">TermQuery</a>,
+                <a href = "BooleanQuery.html">BooleanQuery</a>, <span
+                    >OTHERS????</span></li>
+        </ol>
+    </p>
+<h4>The Weight Interface</h4>
+    <p>The
+        <a href = "Weight.html">Weight</a>
+        interface provides an internal representation of the Query so that it can be reused. Any
+        <a href = "Searcher.html">Searcher</a>
+        dependent state should be stored in the Weight implementation,
+        not in the Query class. The interface defines six methods that must be implemented:
+        <ol>
+            <li>
+                <a href = "Weight.html#getQuery()">Weight#getQuery()</a> &mdash; Pointer to the
+                Query that this Weight represents.</li>
+            <li>
+                <a href = "Weight.html#getValue()">Weight#getValue()</a> &mdash; The weight for
+                this Query. For example, the TermQuery.TermWeight value is
+                equal to the idf^2 * boost * queryNorm <!-- DOUBLE CHECK THIS --></li>
+            <li>
+                <a href = "Weight.html#sumOfSquaredWeights()">
+                    Weight#sumOfSquaredWeights()</a> &mdash; The sum of squared weights. For TermQuery, this is (idf *
+                boost)^2</li>
+            <li>
+                <a href = "Weight.html#normalize(float)">
+                    Weight#normalize(float)</a> &mdash; Determine the query normalization factor. The query normalization may
+                allow for comparing scores between queries.</li>
+            <li>
+                <a href = "Weight.html#scorer(IndexReader)">
+                    Weight#scorer(IndexReader)</a> &mdash; Construct a new
+                <a href = "Scorer.html">Scorer</a>
+                for this Weight. See
+                <a href = "#The Scorer Class">The Scorer Class</a>
+                below for help defining a Scorer. As the name implies, the
+                Scorer is responsible for doing the actual scoring of documents given the Query.
+            </li>
+            <li>
+                <a href = "Weight.html#explain(IndexReader, int)">
+                    Weight#explain(IndexReader, int)</a> &mdash; Provide a means for explaining why a given document was
+                scored
+                the way it was.</li>
+        </ol>
+    </p>
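+<p>The order in which these methods cooperate can be sketched with the TermQuery formulas quoted above. This is a simplified model rather than Lucene's TermWeight: <tt>TermWeightSketch</tt> is a hypothetical class, and the query normalization factor 1 / sqrt(sumOfSquaredWeights) is an assumption about the default Similarity.</p>

```java
final class TermWeightSketch {
    private final double idf;
    private final double boost;
    private double queryNorm = 1.0; // until normalize() is called

    TermWeightSketch(double idf, double boost) {
        this.idf = idf;
        this.boost = boost;
    }

    // (idf * boost)^2, as stated for TermQuery above.
    double sumOfSquaredWeights() {
        double w = idf * boost;
        return w * w;
    }

    // Receives the normalization factor computed from the whole query.
    void normalize(double norm) {
        this.queryNorm = norm;
    }

    // idf^2 * boost * queryNorm, as stated for TermQuery.TermWeight above.
    double getValue() {
        return idf * idf * boost * queryNorm;
    }

    // Assumed default: 1 / sqrt(sum of squared weights).
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }
}
```

+<p>A caller would first ask for <tt>sumOfSquaredWeights()</tt>, derive a normalization factor from it, pass that factor to <tt>normalize</tt>, and only then read <tt>getValue()</tt>. With idf = 2 and boost = 3 the sum is 36, the factor is 1/6, and the value is approximately 2.</p>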
+<h4>The Scorer Class</h4>
+    <p>The
+        <a href = "Scorer.html">Scorer</a>
+        abstract class provides common scoring functionality for all Scorer implementations and
+        is the heart of the Lucene scoring process. The Scorer defines the following abstract methods which
+        must be implemented:
+        <ol>
+            <li>
+                <a href = "Scorer.html#next()">Scorer#next()</a> &mdash; Advances to the next
+                document that matches this Query, returning true if and only
+                if there is another document that matches.</li>
+            <li>
+                <a href = "Scorer.html#doc()">Scorer#doc()</a> &mdash; Returns the id of the
+                <a href = "document/Document.html">Document</a>
+                that contains the match. It is not valid until next() has been called at least once.
+            </li>
+            <li>
+                <a href = "Scorer.html#score()">Scorer#score()</a> &mdash; Return the score of the
+                current document. This value can be determined in any
+                appropriate way for an application. For instance, the
+                <a href = "http://svn.apache.org//viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/TermScorer.java?view=log">TermScorer</a>
+                returns the tf * Weight.getValue() * fieldNorm.
+            </li>
+            <li>
+                <a href = "Scorer.html#skipTo(int)">Scorer#skipTo(int)</a> &mdash; Skip ahead in
+                the document matches to the document whose id is greater than
+                or equal to the passed in value. In many instances, skipTo can be
+                implemented more efficiently than simply looping through all the matching documents until
+                the target document is identified.</li>
+            <li>
+                <a href = "Scorer.html#explain(int)">Scorer#explain(int)</a> &mdash; Provides
+                details on why the score came about.</li>
+        </ol>
+    </p>
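+<p>To make the iteration contract concrete, here is a minimal sketch of <tt>next()</tt>, <tt>doc()</tt> and <tt>skipTo(int)</tt> over a sorted array of document ids. <tt>PostingsScorerSketch</tt> is a hypothetical class that does not extend Scorer, and it assumes that <tt>skipTo</tt> always moves past the current document before looking for the target.</p>

```java
final class PostingsScorerSketch {
    private final int[] docs; // matching document ids, sorted ascending
    private int pos = -1;     // -1 means next() has not been called yet

    PostingsScorerSketch(int[] sortedDocIds) {
        this.docs = sortedDocIds;
    }

    // Advances to the next matching document; false once the matches are exhausted.
    boolean next() {
        if (pos < docs.length) pos++;
        return pos < docs.length;
    }

    // Id of the current document; only valid after next() or skipTo() returned true.
    int doc() {
        return docs[pos];
    }

    // Moves to the first document after the current one whose id is >= target.
    // A binary search replaces the naive loop of repeated next() calls.
    boolean skipTo(int target) {
        int lo = Math.min(pos + 1, docs.length);
        int hi = docs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        pos = lo;
        return pos < docs.length;
    }
}
```

+<p>The binary search is one example of how <tt>skipTo</tt> can beat a plain loop: it inspects a logarithmic number of entries instead of every match between the current document and the target.</p>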
+<h4>Why would I want to add my own Query?</h4>
+
+    <p>In a nutshell, you want to add your own custom Query implementation when you think that Lucene's
+        existing Query implementations aren't appropriate for the
+        task that you want to do. You might be doing some cutting-edge research, or you might need more information
+        back
+        out of Lucene (similar to Doug adding SpanQuery functionality).</p>
+<h4>Examples</h4>
+    <p >FILL IN HERE</p>
+
+</body>
+</html>


