Subject: svn commit: r916340 [2/3] - in /lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net: ./ FastVectorHighlighter.Net/ FastVectorHighlighter.Net/Properties/ Test/ Test/Properties/
Date: Thu, 25 Feb 2010 16:32:11 -0000
To: lucene-net-commits@lucene.apache.org
From: digy@apache.org
Reply-To: lucene-net-dev@lucene.apache.org
Message-Id: <20100225163212.4611223889D7@eris.apache.org>

Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.Net/package.html
URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.Net/package.html?rev=916340&view=auto
==============================================================================
--- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.Net/package.html (added)
+++ lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.Net/package.html Thu Feb 25 16:32:11 2010
@@ -0,0 +1,144 @@
+
+This is another highlighter implementation.
+

Features

+
  • fast for large docs
  • supports N-gram fields
  • supports phrase-unit highlighting with slop
  • requires Java 1.5
  • fields to be highlighted must be indexed with TermVector.WITH_POSITIONS_OFFSETS
  • takes query boost into account when scoring fragments
  • supports colored highlight tags
  • pluggable FragListBuilder
  • pluggable FragmentsBuilder
+

Algorithm

+

To explain the algorithm, let's use the following sample text (to be highlighted) and user query:

+
Sample Text: Lucene is a search engine library.
User Query:  Lucene^2 OR "search library"~1
+

The user query is a BooleanQuery that consists of TermQuery("Lucene") with a boost of 2 and PhraseQuery("search library") with a slop of 1.

+

For your convenience, here are the offsets and positions of the sample text.

+ +
++--------+-----------------------------------+
+|        |          1111111111222222222233333|
+|  offset|01234567890123456789012345678901234|
++--------+-----------------------------------+
+|document|Lucene is a search engine library. |
++--------*-----------------------------------+
+|position|0      1  2 3      4      5        |
++--------*-----------------------------------+
+
+ +

Step 1.

+

In Step 1, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldQuery.QueryPhraseMap} from the user query. QueryPhraseMap consists of the following members:

+
+public class QueryPhraseMap {
+  boolean terminal;
+  int slop;   // valid if terminal == true and phraseHighlight == true
+  float boost;  // valid if terminal == true
+  Map<String, QueryPhraseMap> subMap;
+} 
+
+

QueryPhraseMap has a subMap. The key of the subMap is a term text in the user query, and the value is the subsequent QueryPhraseMap. If the query is a single term (not a phrase), the subsequent QueryPhraseMap is marked as terminal. If the query is a phrase, the subsequent QueryPhraseMap is not terminal, and its subMap holds the next term text in the phrase.

+ +

From the sample user query, the following QueryPhraseMap will be generated:

+
+   QueryPhraseMap
++--------+-+  +-------+-+
+|"Lucene"|o+->|boost=2|*|  * : terminal
++--------+-+  +-------+-+
+
++--------+-+  +---------+-+  +-------+------+-+
+|"search"|o+->|"library"|o+->|boost=1|slop=1|*|
++--------+-+  +---------+-+  +-------+------+-+
+
+ +
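The structure drawn above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `PhraseMapSketch`, `add`, `find` and `sampleQuery` are hypothetical names, not part of the Fast Vector Highlighter API, and the real QueryPhraseMap carries more state than the four members listed earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for FieldQuery.QueryPhraseMap (not the real class).
class PhraseMapSketch {
    boolean terminal;
    int slop;     // meaningful only on terminal nodes
    float boost;  // meaningful only on terminal nodes
    final Map<String, PhraseMapSketch> subMap = new HashMap<String, PhraseMapSketch>();

    // Walks (or creates) one child per term, then marks the last node as terminal.
    void add(float boost, int slop, String... terms) {
        PhraseMapSketch node = this;
        for (String term : terms) {
            PhraseMapSketch child = node.subMap.get(term);
            if (child == null) {
                child = new PhraseMapSketch();
                node.subMap.put(term, child);
            }
            node = child;
        }
        node.terminal = true;
        node.boost = boost;
        node.slop = slop;
    }

    // Follows the given terms from this node; returns null if the path does not exist.
    PhraseMapSketch find(String... terms) {
        PhraseMapSketch node = this;
        for (String term : terms) {
            node = node.subMap.get(term);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    // Builds the map for the sample query: Lucene^2 OR "search library"~1
    static PhraseMapSketch sampleQuery() {
        PhraseMapSketch root = new PhraseMapSketch();
        root.add(2f, 0, "Lucene");
        root.add(1f, 1, "search", "library");
        return root;
    }
}
```

With this sketch, `sampleQuery().find("search")` is a non-terminal node, while `sampleQuery().find("search", "library")` is terminal with slop 1 and boost 1, mirroring the two chains in the diagram.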

Step 2.

+

In Step 2, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldTermStack}. To generate it, Fast Vector Highlighter uses {@link org.apache.lucene.index.TermFreqVector} data (the field must be indexed with {@link org.apache.lucene.document.Field.TermVector#WITH_POSITIONS_OFFSETS}). FieldTermStack keeps only the terms that appear in the user query. Therefore, in this sample case, Fast Vector Highlighter generates the following FieldTermStack:

+
+   FieldTermStack
++------------------+
+|"Lucene"(0,6,0)   |
++------------------+
+|"search"(12,18,3) |
++------------------+
+|"library"(26,33,5)|
++------------------+
+where : "termText"(startOffset,endOffset,position)
+
+
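The entries above can be reproduced with a small Java sketch. It is an approximation, not the highlighter's method: the real FieldTermStack reads offsets and positions from the stored term vector, whereas this sketch assumes a simple tokenizer that treats every run of letters as one term. `TermStackSketch` and `build` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: derives "termText"(startOffset,endOffset,position) entries
// by tokenizing the text directly instead of reading a term vector.
class TermStackSketch {
    static List<String> build(String text, Set<String> queryTerms) {
        List<String> stack = new ArrayList<String>();
        Matcher m = Pattern.compile("\\p{L}+").matcher(text); // assumption: letter runs are terms
        int position = 0;
        while (m.find()) {
            String term = m.group();
            // Keep only terms that occur in the user query.
            if (queryTerms.contains(term)) {
                stack.add("\"" + term + "\"(" + m.start() + "," + m.end() + "," + position + ")");
            }
            position++; // every token advances the position, kept or not
        }
        return stack;
    }
}
```

Calling `build` with the sample text and the query terms Lucene, search and library yields the three entries shown in the diagram; "is", "a" and "engine" still consume positions 1, 2 and 4 even though they are not kept.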

Step 3.

+

In Step 3, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldPhraseList} by referring to QueryPhraseMap and FieldTermStack.

+
+   FieldPhraseList
++----------------+-----------------+---+
+|"Lucene"        |[(0,6)]          |w=2|
++----------------+-----------------+---+
+|"search library"|[(12,18),(26,33)]|w=1|
++----------------+-----------------+---+
+
+

Each entry is a WeightedPhraseInfo, which consists of an array of term offsets and a weight. Fast Vector Highlighter calculates the weight from the query boost and takes it into account when it creates {@link org.apache.lucene.search.vectorhighlight.FieldFragList} in the next step.

+
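Why "search library" with slop 1 matches terms at positions 3 and 5 can be illustrated with a short Java sketch. The rule used here is an assumption made for illustration: each adjacent pair of positions may deviate from "directly next to each other" by at most the slop. `SlopCheckSketch` and `withinSlop` are hypothetical names.

```java
// Hypothetical sketch of a slop check over the positions of a phrase candidate.
class SlopCheckSketch {
    // Returns true if every adjacent pair of positions is within the allowed slop.
    static boolean withinSlop(int[] positions, int slop) {
        for (int i = 1; i < positions.length; i++) {
            // A gap of exactly 1 means the two terms are adjacent.
            int deviation = Math.abs(positions[i] - positions[i - 1] - 1);
            if (deviation > slop) {
                return false;
            }
        }
        return true;
    }
}
```

For the sample, "search" is at position 3 and "library" at position 5, so the deviation is 1: the candidate is accepted with slop 1 and would be rejected with slop 0.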

Step 4.

+

In Step 4, Fast Vector Highlighter creates FieldFragList by referring to FieldPhraseList. In this sample case, the following FieldFragList will be generated:

+
+   FieldFragList
++---------------------------------+
+|"Lucene"[(0,6)]                  |
+|"search library"[(12,18),(26,33)]|
+|totalBoost=3                     |
++---------------------------------+
+
+
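The totalBoost value in the diagram is consistent with summing the weights of the phrases that fall into the fragment (2 + 1 = 3). A minimal Java sketch of that arithmetic, assuming a plain sum; `FragListSketch`, `WeightedPhrase` and `samplePhrases` are hypothetical names, not the library's classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a fragment's total boost as the sum of its phrase weights.
class FragListSketch {
    static final class WeightedPhrase {
        final String text;
        final float weight;

        WeightedPhrase(String text, float weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    static float totalBoost(List<WeightedPhrase> phrases) {
        float total = 0f;
        for (WeightedPhrase phrase : phrases) {
            total += phrase.weight;
        }
        return total;
    }

    // The two phrases from the sample FieldPhraseList.
    static List<WeightedPhrase> samplePhrases() {
        List<WeightedPhrase> phrases = new ArrayList<WeightedPhrase>();
        phrases.add(new WeightedPhrase("Lucene", 2f));
        phrases.add(new WeightedPhrase("search library", 1f));
        return phrases;
    }
}
```

For the sample phrases, `totalBoost(samplePhrases())` evaluates to 3, the value shown in the FieldFragList above.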

Step 5.

+

In Step 5, Fast Vector Highlighter creates highlighted snippets by using FieldFragList and the stored field data.
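The final step can be pictured as wrapping each recorded offset range of the stored text in a pair of tags. The Java sketch below assumes the ranges are sorted and do not overlap, and uses `<b>` tags purely as an example; `SnippetSketch` and `highlight` are hypothetical names, and the real FragmentsBuilder also handles fragment boundaries and multiple tag colors.

```java
// Hypothetical sketch: wraps each [start, end) offset range of the stored text in tags.
class SnippetSketch {
    // spans must be sorted by start offset and must not overlap (assumption).
    static String highlight(String text, int[][] spans, String preTag, String postTag) {
        StringBuilder sb = new StringBuilder();
        int last = 0;
        for (int[] span : spans) {
            sb.append(text, last, span[0]);                                 // text before the match
            sb.append(preTag).append(text, span[0], span[1]).append(postTag); // the match itself
            last = span[1];
        }
        sb.append(text.substring(last));                                    // trailing text
        return sb.toString();
    }
}
```

With the sample offsets (0,6), (12,18) and (26,33), the sample text becomes `<b>Lucene</b> is a <b>search</b> engine <b>library</b>.`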

+
+
+

Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.sln
URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.sln?rev=916340&view=auto
==============================================================================
--- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.sln (added)
+++ lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/FastVectorHighlighter.sln Thu Feb 25 16:32:11 2010
@@ -0,0 +1,26 @@
+
+Microsoft Visual Studio Solution File, Format Version 10.00
+# Visual C# Express 2008
+Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "FastVectorHighlighter.Net", "FastVectorHighlighter.Net\FastVectorHighlighter.Net.csproj", "{9D2E3153-076F-49C5-B83D-FB2573536B5F}"
+EndProject
+Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Test", "Test\Test.csproj", "{33ED01FD-A87C-4208-BA49-2586EFE32974}"
+EndProject
+Global
+    GlobalSection(SolutionConfigurationPlatforms) = preSolution
+        Debug|Any CPU = Debug|Any CPU
+        Release|Any CPU = Release|Any CPU
+    EndGlobalSection
+    GlobalSection(ProjectConfigurationPlatforms) = postSolution
+        {9D2E3153-076F-49C5-B83D-FB2573536B5F}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+        {9D2E3153-076F-49C5-B83D-FB2573536B5F}.Debug|Any CPU.Build.0 = Debug|Any CPU
+        {9D2E3153-076F-49C5-B83D-FB2573536B5F}.Release|Any CPU.ActiveCfg = Release|Any CPU
+        {9D2E3153-076F-49C5-B83D-FB2573536B5F}.Release|Any CPU.Build.0 = Release|Any CPU
+        {33ED01FD-A87C-4208-BA49-2586EFE32974}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+        {33ED01FD-A87C-4208-BA49-2586EFE32974}.Debug|Any CPU.Build.0 = Debug|Any CPU
+        {33ED01FD-A87C-4208-BA49-2586EFE32974}.Release|Any CPU.ActiveCfg = Release|Any CPU
+        {33ED01FD-A87C-4208-BA49-2586EFE32974}.Release|Any CPU.Build.0 = Release|Any CPU
+    EndGlobalSection
+    GlobalSection(SolutionProperties) = preSolution
+        HideSolutionNode = FALSE
+    EndGlobalSection
+EndGlobal

Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/AbstractTestCase.cs
URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/Test/AbstractTestCase.cs?rev=916340&view=auto
==============================================================================
--- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/AbstractTestCase.cs (added)
+++ lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/AbstractTestCase.cs Thu Feb 25 16:32:11 2010
@@ -0,0 +1,400 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+using System;
+using System.Collections.Generic;
+using System.Text;
+
+
+using Lucene.Net.Analysis;
+using Lucene.Net.Analysis.Tokenattributes;
+using Lucene.Net.Search;
+using Lucene.Net.Documents;
+using Lucene.Net.QueryParsers;
+using Lucene.Net.Store;
+using Lucene.Net.Index;
+using Lucene.Net.Util;
+
+using NUnit.Framework;
+
+namespace Lucene.Net.Search.Vectorhighlight
+{
+    public abstract class AbstractTestCase
+    {
+
+        protected String F = "f";
+        protected String F1 = "f1";
+        protected String F2 = "f2";
+        protected Directory dir;
+        protected Analyzer analyzerW;
+        protected Analyzer analyzerB;
+        protected IndexReader reader;
+        protected QueryParser paW;
+        protected QueryParser paB;
+
+        protected static String[] shortMVValues = {
+            "a b c",
+            "", // empty data in multi valued field
+            "d e"
+        };
+
+        protected static String[] longMVValues = {
+            "Followings are the examples of customizable parameters and actual examples of customization:",
+            "The most search engines use only one of these methods. Even the search engines that says they can use the both methods basically"
+        };
+
+        // test data for LUCENE-1448 bug
+        protected static String[] biMVValues = {
+            "\nLucene/Solr does not require such additional hardware.",
+            "\nWhen you talk about processing speed, the"
+        };
+
+        [SetUp]
+        public void SetUp()
+        {
+            analyzerW = new WhitespaceAnalyzer();
+            analyzerB = new BigramAnalyzer();
+            paW = new QueryParser(F, analyzerW);
+            paB = new QueryParser(F, analyzerB);
+            dir = new RAMDirectory();
+        }
+
+        [TearDown]
+        public void TearDown()
+        {
+            if (reader != null)
+            {
+                reader.Close();
+                reader = null;
+            }
+        }
+
+        protected Query Tq(String text)
+        {
+            return Tq(1F, text);
+        }
+
+        protected Query Tq(float boost, String text)
+        {
+            return Tq(boost, F, text);
+        }
+
+        protected Query Tq(String field, String text)
+        {
+            return Tq(1F, field, text);
+        }
+
+        protected Query Tq(float boost, String field, String text)
+        {
+            Query query = new TermQuery(new Term(field, text));
+            query.SetBoost(boost);
+            return query;
+        }
+
+        protected Query PqF(params String[] texts)
+        {
+            return PqF(1F, texts);
+        }
+
+        //protected Query pqF(String[] texts)
+        //{
+        //    return pqF(1F, texts);
+        //}
+
+        protected Query PqF(float boost, params String[] texts)
+        {
+            return pqF(boost, 0, texts);
+        }
+
+        protected Query pqF(float boost, int slop, params String[] texts)
+        {
+            return Pq(boost, slop, F, texts);
+        }
+
+        protected Query Pq(String field, params String[] texts)
+        {
+            return Pq(1F, 0, field, texts);
+        }
+
+        protected Query Pq(float boost, String field, params String[] texts)
+        {
+            return Pq(boost, 0, field, texts);
+        }
+
+        protected Query Pq(float boost, int slop, String field, params String[] texts)
+        {
+            PhraseQuery query = new PhraseQuery();
+            foreach (String text in texts)
+            {
+                query.Add(new Term(field, text));
+            }
+            query.SetBoost(boost);
+            query.SetSlop(slop);
+            return query;
+        }
+
+        protected void AssertCollectionQueries(Dictionary actual, params Query[] expected)
+        {
+
+            Assert.AreEqual(expected.Length, actual.Count);
+            foreach (Query query in expected)
+            {
+                Assert.IsTrue(actual.ContainsKey(query));
+            }
+        }
+
+        class BigramAnalyzer : Analyzer
+        {
+            public override TokenStream TokenStream(String fieldName, System.IO.TextReader reader)
+            {
+                return new BasicNGramTokenizer(reader);
+            }
+        }
+
+        class BasicNGramTokenizer : Tokenizer
+        {
+
+            public static int DEFAULT_N_SIZE = 2;
+            public static String DEFAULT_DELIMITERS = " \t\n.,";
+            private int n;
+            private String delimiters;
+            private int startTerm;
+            private int lenTerm;
+            private int startOffset;
+            private int nextStartOffset;
+            private int ch;
+            private String snippet;
+            private StringBuilder snippetBuffer;
+            private static int BUFFER_SIZE = 4096;
+            private char[] charBuffer;
+            private int charBufferIndex;
+            private int charBufferLen;
+
+            public BasicNGramTokenizer(System.IO.TextReader inReader): this(inReader, DEFAULT_N_SIZE)
+            {
+            }
+
+            public BasicNGramTokenizer(System.IO.TextReader inReader, int n): this(inReader, n, DEFAULT_DELIMITERS)
+            {
+            }
+
+            public BasicNGramTokenizer(System.IO.TextReader inReader, String delimiters) : this(inReader, DEFAULT_N_SIZE, delimiters)
+            {
+            }
+
+            public BasicNGramTokenizer(System.IO.TextReader inReader, int n, String delimiters) : base(inReader)
+            {
+                this.n = n;
+                this.delimiters = delimiters;
+                startTerm = 0;
+                nextStartOffset = 0;
+                snippet = null;
+                snippetBuffer = new StringBuilder();
+                charBuffer = new char[BUFFER_SIZE];
+                charBufferIndex = BUFFER_SIZE;
+                charBufferLen = 0;
+                ch = 0;
+
+                Init();
+            }
+
+            void Init()
+            {
+                termAtt = (TermAttribute)AddAttribute(typeof(TermAttribute));
+                offsetAtt = (OffsetAttribute)AddAttribute(typeof(OffsetAttribute));
+            }
+
+            TermAttribute termAtt = null;
+            OffsetAttribute offsetAtt = null;
+
+            public override bool IncrementToken()
+            {
+                if (!GetNextPartialSnippet())
+                    return false;
+                ClearAttributes();
+                termAtt.SetTermBuffer(snippet, startTerm, lenTerm);
+                offsetAtt.SetOffset(CorrectOffset(startOffset), CorrectOffset(startOffset + lenTerm));
+                return true;
+            }
+
+            private int GetFinalOffset()
+            {
+                return nextStartOffset;
+            }
+
+            public override void End()
+            {
+                offsetAtt.SetOffset(GetFinalOffset(), GetFinalOffset());
+            }
+
+            protected bool GetNextPartialSnippet()
+            {
+                if (snippet != null && snippet.Length >= startTerm + 1 + n)
+                {
+                    startTerm++;
+                    startOffset++;
+                    lenTerm = n;
+                    return true;
+                }
+                return GetNextSnippet();
+            }
+
+            protected bool GetNextSnippet()
+            {
+                startTerm = 0;
+                startOffset = nextStartOffset;
+                snippetBuffer.Remove(0, snippetBuffer.Length);
+                while (true)
+                {
+                    if (ch != -1)
+                        ch = ReadCharFromBuffer();
+                    if (ch == -1) break;
+                    else if (!IsDelimiter(ch))
+                        snippetBuffer.Append((char)ch);
+                    else if (snippetBuffer.Length > 0)
+                        break;
+                    else
+                        startOffset++;
+                }
+                if (snippetBuffer.Length == 0)
+                    return false;
+                snippet = snippetBuffer.ToString();
+                lenTerm = snippet.Length >= n ? n : snippet.Length;
+                return true;
+            }
+
+            protected int ReadCharFromBuffer()
+            {
+                if (charBufferIndex >= charBufferLen)
+                {
+                    charBufferLen = input.Read(charBuffer,0,charBuffer.Length);
+                    if (charBufferLen <= 0)
+                    {
+                        return -1;
+                    }
+                    charBufferIndex = 0;
+                }
+                int c = (int)charBuffer[charBufferIndex++];
+                nextStartOffset++;
+                return c;
+            }
+
+            protected bool IsDelimiter(int c)
+            {
+                return delimiters.IndexOf(Convert.ToChar(c) ) >= 0;
+            }
+        }
+
+        protected void Make1d1fIndex(String value)
+        {
+            Make1dmfIndex( value );
+        }
+
+        protected void Make1d1fIndexB(String value)
+        {
+            Make1dmfIndexB( value );
+        }
+
+        protected void Make1dmfIndex(params String[] values)
+        {
+            Make1dmfIndex(analyzerW, values);
+        }
+
+        protected void Make1dmfIndexB(params String[] values)
+        {
+            Make1dmfIndex(analyzerB, values);
+        }
+
+        protected void Make1dmfIndex(Analyzer analyzer, params String[] values)
+        {
+            IndexWriter writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
+            Document doc = new Document();
+            foreach (String value in values)
+                doc.Add(new Field(F, value, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
+            writer.AddDocument(doc);
+            writer.Close();
+
+            reader = IndexReader.Open(dir);
+        }
+
+        protected void MakeIndexShortMV()
+        {
+
+            //  012345
+            // "a b c"
+            //  0 1 2
+
+            // ""
+
+            //  6789
+            // "d e"
+            //  3 4
+            Make1dmfIndex(shortMVValues);
+        }
+
+        protected void MakeIndexLongMV()
+        {
+            //           11111111112222222222333333333344444444445555555555666666666677777777778888888888999
+            // 012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
+            // Followings are the examples of customizable parameters and actual examples of customization:
+            // 0          1   2   3        4  5            6          7   8      9        10 11
+
+            //        1                                                                                                   2
+            // 999999900000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999999990000000000111111111122
+            // 345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901
+            // The most search engines use only one of these methods. Even the search engines that says they can use the both methods basically
+            // 12  13   (14)   (15)    16  17   18  19 20    21       22   23  (24)   (25)    26   27   28   29  30  31  32   33      34
+
+            Make1dmfIndex(longMVValues);
+        }
+
+        protected void MakeIndexLongMVB()
+        {
+            // "*" [] LF
+
+            //           1111111111222222222233333333334444444444555555
+            // 01234567890123456789012345678901234567890123456789012345
+            // *Lucene/Solr does not require such additional hardware.
+            //  Lu 0   do 10   re 15   su 21   na 31
+            //  uc 1   oe 11   eq 16   uc 22   al 32
+            //  ce 2   es 12   qu 17   ch 23   ha 33
+            //  en 3   no 13   ui 18   ad 24   ar 34
+            //  ne 4   ot 14   ir 19   dd 25   rd 35
+            //  e/ 5           re 20   di 26   dw 36
+            //  /S 6                   it 27   wa 37
+            //  So 7                   ti 28   ar 38
+            //  ol 8                   io 29   re 39
+            //  lr 9                   on 30
+
+            // 5555666666666677777777778888888888999999999
+            // 6789012345678901234567890123456789012345678
+            // *When you talk about processing speed, the
+            //  Wh 40   ab 48   es 56   th 65
+            //  he 41   bo 49   ss 57   he 66
+            //  en 42   ou 50   si 58
+            //  yo 43   ut 51   in 59
+            //  ou 44   pr 52   ng 60
+            //  ta 45   ro 53   sp 61
+            //  al 46   oc 54   pe 62
+            //  lk 47   ce 55   ee 63
+            //                  ed 64
+
+            Make1dmfIndexB(biMVValues);
+        }
+    }
+}

Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldPhraseListTest.cs
URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/Test/FieldPhraseListTest.cs?rev=916340&view=auto
==============================================================================
--- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldPhraseListTest.cs (added)
+++ lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldPhraseListTest.cs Thu Feb 25 16:32:11 2010
@@ -0,0 +1,226 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using System.Collections.Generic;
+using System.Text;
+
+using Lucene.Net.Documents;
+using Lucene.Net.Search;
+
+using NUnit.Framework;
+
+namespace Lucene.Net.Search.Vectorhighlight
+{
+    [TestFixture]
+    public class FieldPhraseListTest : AbstractTestCase
+    {
+        [Test]
+        public void Test1TermIndex()
+        {
+            Make1d1fIndex("a");
+
+            FieldQuery fq = new FieldQuery(Tq("a"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("a(1.0)((0,1))", fpl.phraseList.First.Value.ToString());
+
+            fq = new FieldQuery(Tq("b"), true, true);
+            stack = new FieldTermStack(reader, 0, F, fq);
+            fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(0, fpl.phraseList.Count);
+        }
+
+        [Test]
+        public void Test2TermsIndex()
+        {
+            Make1d1fIndex("a a");
+
+            FieldQuery fq = new FieldQuery(Tq("a"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(2, fpl.phraseList.Count);
+            Assert.AreEqual("a(1.0)((0,1))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual("a(1.0)((2,3))", fpl.phraseList.First.Next.Value.ToString());
+        }
+
+        [Test]
+        public void Test1PhraseIndex()
+        {
+            Make1d1fIndex("a b");
+
+            FieldQuery fq = new FieldQuery(PqF("a", "b"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("ab(1.0)((0,3))", fpl.phraseList.First.Value.ToString());
+
+            fq = new FieldQuery(Tq("b"), true, true);
+            stack = new FieldTermStack(reader, 0, F, fq);
+            fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("b(1.0)((2,3))", fpl.phraseList.First.Value.ToString());
+        }
+
+        [Test]
+        public void Test1PhraseIndexB()
+        {
+            // 01 12 23 34 45 56 67 78 (offsets)
+            // bb|bb|ba|ac|cb|ba|ab|bc
+            //  0  1  2  3  4  5  6  7 (positions)
+            Make1d1fIndexB("bbbacbabc");
+
+            FieldQuery fq = new FieldQuery(PqF("ba", "ac"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("baac(1.0)((2,5))", fpl.phraseList.First.Value.ToString());
+        }
+
+        [Test]
+        public void Test2ConcatTermsIndexB()
+        {
+            // 01 12 23 (offsets)
+            // ab|ba|ab
+            //  0  1  2 (positions)
+            Make1d1fIndexB("abab");
+
+            FieldQuery fq = new FieldQuery(Tq("ab"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(2, fpl.phraseList.Count);
+            Assert.AreEqual("ab(1.0)((0,2))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual("ab(1.0)((2,4))", fpl.phraseList.First.Next.Value.ToString());
+        }
+
+        [Test]
+        public void Test2Terms1PhraseIndex()
+        {
+            Make1d1fIndex("c a a b");
+
+            // phraseHighlight = true
+            FieldQuery fq = new FieldQuery(PqF("a", "b"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("ab(1.0)((4,7))", fpl.phraseList.First.Value.ToString());
+
+            // phraseHighlight = false
+            fq = new FieldQuery(PqF("a", "b"), false, true);
+            stack = new FieldTermStack(reader, 0, F, fq);
+            fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(2, fpl.phraseList.Count);
+            Assert.AreEqual("a(1.0)((2,3))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual("ab(1.0)((4,7))", fpl.phraseList.First.Next.Value.ToString());
+        }
+
+        [Test]
+        public void TestPhraseSlop()
+        {
+            Make1d1fIndex("c a a b c");
+
+            FieldQuery fq = new FieldQuery(pqF(2F, 1, "a", "c"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("ac(2.0)((4,5)(8,9))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual(4, fpl.phraseList.First.Value.GetStartOffset());
+            Assert.AreEqual(9, fpl.phraseList.First.Value.GetEndOffset());
+        }
+
+        [Test]
+        public void Test2PhrasesOverlap()
+        {
+            Make1d1fIndex("d a b c d");
+
+            BooleanQuery query = new BooleanQuery();
+            query.Add(PqF("a", "b"), Lucene.Net.Search.BooleanClause.Occur.SHOULD);
+            query.Add(PqF("b", "c"), Lucene.Net.Search.BooleanClause.Occur.SHOULD);
+            FieldQuery fq = new FieldQuery(query, true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("abc(1.0)((2,7))", fpl.phraseList.First.Value.ToString());
+        }
+
+        [Test]
+        public void Test3TermsPhrase()
+        {
+            Make1d1fIndex("d a b a b c d");
+
+            FieldQuery fq = new FieldQuery(PqF("a", "b", "c"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("abc(1.0)((6,11))", fpl.phraseList.First.Value.ToString());
+        }
+
+        [Test]
+        public void TestSearchLongestPhrase()
+        {
+            Make1d1fIndex("d a b d c a b c");
+
+            BooleanQuery query = new BooleanQuery();
+            query.Add(PqF("a", "b"), Lucene.Net.Search.BooleanClause.Occur.SHOULD);
+            query.Add(PqF("a", "b", "c"), Lucene.Net.Search.BooleanClause.Occur.SHOULD);
+            FieldQuery fq = new FieldQuery(query, true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(2, fpl.phraseList.Count);
+            Assert.AreEqual("ab(1.0)((2,5))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual("abc(1.0)((10,15))", fpl.phraseList.First.Next.Value.ToString());
+        }
+
+        [Test]
+        public void Test1PhraseShortMV()
+        {
+            MakeIndexShortMV();
+
+            FieldQuery fq = new FieldQuery(Tq("d"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("d(1.0)((6,7))", fpl.phraseList.First.Value.ToString());
+        }
+
+        [Test]
+        public void Test1PhraseLongMV()
+        {
+            MakeIndexLongMV();
+
+            FieldQuery fq = new FieldQuery(PqF("search", "engines"), true, true);
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(2, fpl.phraseList.Count);
+            Assert.AreEqual("searchengines(1.0)((102,116))", fpl.phraseList.First.Value.ToString());
+            Assert.AreEqual("searchengines(1.0)((157,171))", fpl.phraseList.First.Next.Value.ToString());
+        }
+
+        [Test]
+        public void Test1PhraseLongMVB()
+        {
+            MakeIndexLongMVB();
+
+            FieldQuery fq = new FieldQuery(PqF("sp", "pe", "ee", "ed"), true, true); // "speed" -(2gram)-> "sp","pe","ee","ed"
+            FieldTermStack stack = new FieldTermStack(reader, 0, F, fq);
+            FieldPhraseList fpl = new FieldPhraseList(stack, fq);
+            Assert.AreEqual(1, fpl.phraseList.Count);
+            Assert.AreEqual("sppeeeed(1.0)((88,93))", fpl.phraseList.First.Value.ToString());
+        }
+    }
+}

Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldQueryTest.cs
URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/Test/FieldQueryTest.cs?rev=916340&view=auto
==============================================================================
--- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldQueryTest.cs (added)
+++ lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldQueryTest.cs Thu Feb 25 16:32:11 2010
@@ -0,0 +1,879 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using System.Collections.Generic;
+using System.Text;
+
+using Lucene.Net.Search;
+using Occur = Lucene.Net.Search.BooleanClause.Occur;
+
+using QueryPhraseMap = Lucene.Net.Search.Vectorhighlight.FieldQuery.QueryPhraseMap;
+using TermInfo = Lucene.Net.Search.Vectorhighlight.FieldTermStack.TermInfo;
+
+using NUnit.Framework;
+
+namespace Lucene.Net.Search.Vectorhighlight
+{
+    [TestFixture]
+    public class FieldQueryTest : AbstractTestCase
+    {
+        [Test]
+        public void TestFlattenBoolean()
+        {
+            Query query = paW.Parse("A AND B OR C NOT (D AND E)");
+            FieldQuery fq = new FieldQuery(query, true, true);
+            HashSet flatQueries = new HashSet();
+            fq.flatten(query, flatQueries);
+            AssertCollectionQueries(flatQueries, Tq("A"), Tq("B"), Tq("C"));
+        }
+
+        [Test]
+        public void TestFlattenTermAndPhrase()
+        {
+            Query query = paW.Parse("A AND \"B C\"");
+            FieldQuery fq = new FieldQuery(query, true, true);
+            HashSet flatQueries = new HashSet();
+            fq.flatten(query, flatQueries);
+            AssertCollectionQueries(flatQueries, Tq("A"), PqF("B", "C"));
+        }
+
+        [Test]
+        public void TestFlattenTermAndPhrase2gram()
+        {
+            Query query = paB.Parse("AA AND BCD OR EFGH");
+            FieldQuery fq = new FieldQuery(query, true, true);
+            HashSet flatQueries = new HashSet();
+            fq.flatten(query, flatQueries);
+            AssertCollectionQueries(flatQueries, Tq("AA"), PqF("BC", "CD" ), PqF("EF", "FG", "GH"));
+        }
+
+        [Test]
+        public void TestFlatten1TermPhrase()
+        {
+            Query query = PqF("A");
+            FieldQuery fq = new FieldQuery(query, true, true);
+            HashSet flatQueries = new HashSet();
+            fq.flatten(query, flatQueries);
+            AssertCollectionQueries(flatQueries, Tq("A"));
+        }
+
+        [Test]
+        public void TestExpand()
+        {
+            Query dummy = PqF("DUMMY");
+            FieldQuery fq = new FieldQuery(dummy, true, true);
+
+            // "a b","b c" => "a b","b c","a b c"
+            HashSet flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b"));
+            flatQueries.Add(PqF( "b", "c" ));
+            AssertCollectionQueries(fq.expand(flatQueries),
+                PqF("a", "b"), PqF("b", "c"), PqF("a", "b", "c"));
+
+            // "a b","b c d" => "a b","b c d","a b c d"
+            flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b"));
+            flatQueries.Add(PqF("b", "c", "d"));
+            AssertCollectionQueries(fq.expand(flatQueries),
+                PqF("a", "b"), PqF("b", "c", "d"), PqF("a", "b", "c", "d"));
+
+            // "a b c","b c d" => "a b c","b c d","a b c d"
+            flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b", "c"));
+            flatQueries.Add(PqF("b", "c", "d"));
+            AssertCollectionQueries(fq.expand(flatQueries),
+                PqF("a", "b", "c"), PqF("b", "c", "d"), PqF("a", "b", "c", "d"));
+
+            // "a b c","c d e" => "a b c","c d e","a b c d e"
+            flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b", "c"));
+            flatQueries.Add(PqF("c", "d", "e"));
+            AssertCollectionQueries(fq.expand(flatQueries),
+                PqF("a", "b", "c"), PqF("c", "d", "e"), PqF("a", "b", "c", "d", "e"));
+
+            // "a b c d","b c" => "a b c d","b c"
+            flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b", "c", "d"));
+            flatQueries.Add(PqF("b", "c"));
+            AssertCollectionQueries(fq.expand(flatQueries),
+                PqF("a", "b", "c", "d"), PqF("b", "c"));
+
+            // "a b b","b c" => "a b b","b c","a b b c"
+            flatQueries = new HashSet();
+            flatQueries.Add(PqF("a", "b", "b"));
+            flatQueries.Add(PqF("b", "c"));
AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b", "b"), PqF("b", "c"), PqF("a", "b", "b", "c")); + + // "a b","b a" => "a b","b a","a b a", "b a b" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b")); + flatQueries.Add(PqF("b", "a")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b"), PqF("b", "a"), PqF("a", "b", "a"), PqF("b", "a", "b")); + + // "a b","a b c" => "a b","a b c" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b")); + flatQueries.Add(PqF("a", "b", "c")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b"), PqF("a", "b", "c")); + } + + [Test] + public void TestNoExpand() + { + Query dummy = PqF("DUMMY"); + FieldQuery fq = new FieldQuery(dummy, true, true); + + // "a b","c d" => "a b","c d" + HashSet<Query> flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b")); + flatQueries.Add(PqF("c", "d")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b"), PqF("c", "d")); + + // "a","a b" => "a", "a b" + flatQueries = new HashSet<Query>(); + flatQueries.Add(Tq("a")); + flatQueries.Add(PqF("a", "b")); + AssertCollectionQueries(fq.expand(flatQueries), + Tq("a"), PqF("a", "b")); + + // "a b","b" => "a b", "b" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b")); + flatQueries.Add(Tq("b")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b"), Tq("b")); + + // "a b c","b c" => "a b c","b c" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b", "c")); + flatQueries.Add(PqF("b", "c")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b", "c"), PqF("b", "c")); + + // "a b","a b c" => "a b","a b c" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b")); + flatQueries.Add(PqF("a", "b", "c")); + AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b"), PqF("a", "b", "c")); + + // "a b c","b d e" => "a b c","b d e" + flatQueries = new HashSet<Query>(); + flatQueries.Add(PqF("a", "b", "c")); + flatQueries.Add(PqF("b", "d", "e")); + 
AssertCollectionQueries(fq.expand(flatQueries), + PqF("a", "b", "c"), PqF("b", "d", "e")); + } + + [Test] + public void TestExpandNotFieldMatch() + { + Query dummy = PqF("DUMMY"); + FieldQuery fq = new FieldQuery(dummy, true, false); + + // f1:"a b",f2:"b c" => f1:"a b",f2:"b c",f1:"a b c" + HashSet<Query> flatQueries = new HashSet<Query>(); + flatQueries.Add(Pq(F1, "a", "b")); + flatQueries.Add(Pq(F2, "b", "c")); + AssertCollectionQueries(fq.expand(flatQueries), + Pq(F1, "a", "b"), Pq(F2, "b", "c"), Pq(F1, "a", "b", "c")); + } + + [Test] + public void TestGetFieldTermMap() + { + Query query = Tq("a"); + FieldQuery fq = new FieldQuery(query, true, true); + + QueryPhraseMap pqm = fq.GetFieldTermMap(F, "a"); + Assert.NotNull(pqm); + Assert.IsTrue(pqm.IsTerminal()); + + pqm = fq.GetFieldTermMap(F, "b"); + Assert.Null(pqm); + + pqm = fq.GetFieldTermMap(F1, "a"); + Assert.Null(pqm); + } + + [Test] + public void TestGetRootMap() + { + Query dummy = PqF("DUMMY"); + FieldQuery fq = new FieldQuery(dummy, true, true); + + QueryPhraseMap rootMap1 = fq.getRootMap(Tq("a")); + QueryPhraseMap rootMap2 = fq.getRootMap(Tq("a")); + Assert.IsTrue(rootMap1 == rootMap2); + QueryPhraseMap rootMap3 = fq.getRootMap(Tq("b")); + Assert.IsTrue(rootMap1 == rootMap3); + QueryPhraseMap rootMap4 = fq.getRootMap(Tq(F1, "b")); + Assert.IsFalse(rootMap4 == rootMap3); + } + + [Test] + public void TestGetRootMapNotFieldMatch() + { + Query dummy = PqF("DUMMY"); + FieldQuery fq = new FieldQuery(dummy, true, false); + + QueryPhraseMap rootMap1 = fq.getRootMap(Tq("a")); + QueryPhraseMap rootMap2 = fq.getRootMap(Tq("a")); + Assert.IsTrue(rootMap1 == rootMap2); + QueryPhraseMap rootMap3 = fq.getRootMap(Tq("b")); + Assert.IsTrue(rootMap1 == rootMap3); + QueryPhraseMap rootMap4 = fq.getRootMap(Tq(F1, "b")); + Assert.IsTrue(rootMap4 == rootMap3); + } + + [Test] + public void TestGetTermSet() + { + Query query = paW.Parse("A AND B OR x:C NOT (D AND E)"); + FieldQuery fq = new FieldQuery(query, true, true); + 
Assert.AreEqual(2, fq.termSetMap.Count); + List termSet = fq.getTermSet(F); + Assert.AreEqual(2, termSet.Count); + Assert.IsTrue(termSet.Contains("A")); + Assert.IsTrue(termSet.Contains("B")); + termSet = fq.getTermSet("x"); + Assert.AreEqual(1, termSet.Count); + Assert.IsTrue(termSet.Contains("C")); + termSet = fq.getTermSet("y"); + Assert.Null(termSet); + } + + [Test] + public void TestQueryPhraseMap1Term() + { + Query query = Tq("a"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + HashMap map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.IsTrue(qpm.subMap.Get("a") != null); + Assert.IsTrue(qpm.subMap.Get("a").terminal); + Assert.AreEqual(1F, qpm.subMap.Get("a").boost); + + // phraseHighlight = true, fieldMatch = false + fq = new FieldQuery(query, true, false); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(F)); + Assert.NotNull(map.Get(null)); + qpm = map.Get(null); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.IsTrue(qpm.subMap.Get("a") != null); + Assert.IsTrue(qpm.subMap.Get("a").terminal); + Assert.AreEqual(1F, qpm.subMap.Get("a").boost); + + // phraseHighlight = false, fieldMatch = true + fq = new FieldQuery(query, false, true); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.IsTrue(qpm.subMap.Get("a") != null); + Assert.IsTrue(qpm.subMap.Get("a").terminal); + Assert.AreEqual(1F, qpm.subMap.Get("a").boost); + + // phraseHighlight = false, fieldMatch = false + fq = new FieldQuery(query, false, false); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(F)); + Assert.NotNull(map.Get(null)); + qpm = map.Get(null); + Assert.AreEqual(1, qpm.subMap.Count); + 
Assert.IsTrue(qpm.subMap.Get("a") != null); + Assert.IsTrue(qpm.subMap.Get("a").terminal); + Assert.AreEqual(1F, qpm.subMap.Get("a").boost); + + // boost != 1 + query = Tq(2, "a"); + fq = new FieldQuery(query, true, true); + map = fq.rootMaps; + qpm = map.Get(F); + Assert.AreEqual(2F, qpm.subMap.Get("a").boost); + } + + [Test] + public void TestQueryPhraseMap1Phrase() + { + Query query = PqF("a", "b"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + HashMap map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // phraseHighlight = true, fieldMatch = false + fq = new FieldQuery(query, true, false); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(F)); + Assert.NotNull(map.Get(null)); + qpm = map.Get(null); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("a")); + qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // phraseHighlight = false, fieldMatch = true + fq = new FieldQuery(query, false, true); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + qpm = map.Get(F); + Assert.AreEqual(2, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("a")); + qpm2 = qpm.subMap.Get("a"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + Assert.AreEqual(1, 
qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + Assert.NotNull(qpm.subMap.Get("b")); + qpm2 = qpm.subMap.Get("b"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + + // phraseHighlight = false, fieldMatch = false + fq = new FieldQuery(query, false, false); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(F)); + Assert.NotNull(map.Get(null)); + qpm = map.Get(null); + Assert.AreEqual(2, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("a")); + qpm2 = qpm.subMap.Get("a"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + Assert.NotNull(qpm.subMap.Get("b")); + qpm2 = qpm.subMap.Get("b"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + + // boost != 1 + query = PqF(2, "a", "b"); + // phraseHighlight = false, fieldMatch = false + fq = new FieldQuery(query, false, false); + map = fq.rootMaps; + qpm = map.Get(null); + qpm2 = qpm.subMap.Get("a"); + Assert.AreEqual(2F, qpm2.boost); + qpm3 = qpm2.subMap.Get("b"); + Assert.AreEqual(2F, qpm3.boost); + qpm2 = qpm.subMap.Get("b"); + Assert.AreEqual(2F, qpm2.boost); + } + + [Test] + public void TestQueryPhraseMap1PhraseAnother() + { + Query query = PqF("search", "engines"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("search")); + QueryPhraseMap qpm2 = qpm.subMap.Get("search"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + 
Assert.NotNull(qpm2.subMap.Get("engines")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("engines"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + } + + [Test] + public void TestQueryPhraseMap2Phrases() + { + BooleanQuery query = new BooleanQuery(); + query.Add(PqF("a", "b"), Occur.SHOULD); + query.Add(PqF(2, "c", "d"), Occur.SHOULD); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(2, qpm.subMap.Count); + + // "a b" + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "c d"^2 + Assert.NotNull(qpm.subMap.Get("c")); + qpm2 = qpm.subMap.Get("c"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("d")); + qpm3 = qpm2.subMap.Get("d"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(2F, qpm3.boost); + } + + [Test] + public void TestQueryPhraseMap2PhrasesFields() + { + BooleanQuery query = new BooleanQuery(); + query.Add(Pq(F1, "a", "b"), Occur.SHOULD); + query.Add(Pq(2F, F2, "c", "d"), Occur.SHOULD); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + HashMap map = fq.rootMaps; + Assert.AreEqual(2, map.Count); + Assert.Null(map.Get(null)); + + // "a b" + Assert.NotNull(map.Get(F1)); + QueryPhraseMap qpm = map.Get(F1); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + 
QueryPhraseMap qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "c d"^2 + Assert.NotNull(map.Get(F2)); + qpm = map.Get(F2); + Assert.AreEqual(1, qpm.subMap.Count); + Assert.NotNull(qpm.subMap.Get("c")); + qpm2 = qpm.subMap.Get("c"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("d")); + qpm3 = qpm2.subMap.Get("d"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(2F, qpm3.boost); + + // phraseHighlight = true, fieldMatch = false + fq = new FieldQuery(query, true, false); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(F1)); + Assert.Null(map.Get(F2)); + Assert.NotNull(map.Get(null)); + qpm = map.Get(null); + Assert.AreEqual(2, qpm.subMap.Count); + + // "a b" + Assert.NotNull(qpm.subMap.Get("a")); + qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "c d"^2 + Assert.NotNull(qpm.subMap.Get("c")); + qpm2 = qpm.subMap.Get("c"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("d")); + qpm3 = qpm2.subMap.Get("d"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(2F, qpm3.boost); + } + + /* + * ...terminal + * + * a-b-c- + * +-d- + * b-c-d- + * +-d- + */ + [Test] + public void TestQueryPhraseMapOverlapPhrases() + { + BooleanQuery query = new BooleanQuery(); + query.Add(PqF("a", "b", "c"), Occur.SHOULD); + query.Add(PqF(2, "b", "c", "d"), Occur.SHOULD); + query.Add(PqF(3, "b", "d"), Occur.SHOULD); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(2, 
qpm.subMap.Count); + + // "a b c" + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("b")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("b"); + Assert.IsFalse(qpm3.terminal); + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("c")); + QueryPhraseMap qpm4 = qpm3.subMap.Get("c"); + Assert.IsTrue(qpm4.terminal); + Assert.AreEqual(1F, qpm4.boost); + Assert.NotNull(qpm4.subMap.Get("d")); + QueryPhraseMap qpm5 = qpm4.subMap.Get("d"); + Assert.IsTrue(qpm5.terminal); + Assert.AreEqual(1F, qpm5.boost); + + // "b c d"^2, "b d"^3 + Assert.NotNull(qpm.subMap.Get("b")); + qpm2 = qpm.subMap.Get("b"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(2, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("c")); + qpm3 = qpm2.subMap.Get("c"); + Assert.IsFalse(qpm3.terminal); + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("d")); + qpm4 = qpm3.subMap.Get("d"); + Assert.IsTrue(qpm4.terminal); + Assert.AreEqual(2F, qpm4.boost); + Assert.NotNull(qpm2.subMap.Get("d")); + qpm3 = qpm2.subMap.Get("d"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(3F, qpm3.boost); + } + + /* + * ...terminal + * + * a-b- + * +-c- + */ + [Test] + public void TestQueryPhraseMapOverlapPhrases2() + { + BooleanQuery query = new BooleanQuery(); + query.Add(PqF("a", "b"), Occur.SHOULD); + query.Add(PqF(2, "a", "b", "c"), Occur.SHOULD); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + + // "a b" + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + 
Assert.NotNull(qpm2.subMap.Get("b")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("b"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "a b c"^2 + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("c")); + QueryPhraseMap qpm4 = qpm3.subMap.Get("c"); + Assert.IsTrue(qpm4.terminal); + Assert.AreEqual(2F, qpm4.boost); + } + + /* + * ...terminal + * + * a-a-a- + * +-a- + * +-a- + * +-a- + */ + [Test] + public void TestQueryPhraseMapOverlapPhrases3() + { + BooleanQuery query = new BooleanQuery(); + query.Add(PqF("a", "a", "a", "a"), Occur.SHOULD); + query.Add(PqF(2, "a", "a", "a"), Occur.SHOULD); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(1, qpm.subMap.Count); + + // "a a a" + Assert.NotNull(qpm.subMap.Get("a")); + QueryPhraseMap qpm2 = qpm.subMap.Get("a"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("a")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("a"); + Assert.IsFalse(qpm3.terminal); + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("a")); + QueryPhraseMap qpm4 = qpm3.subMap.Get("a"); + Assert.IsTrue(qpm4.terminal); + + // "a a a a" + Assert.AreEqual(1, qpm4.subMap.Count); + Assert.NotNull(qpm4.subMap.Get("a")); + QueryPhraseMap qpm5 = qpm4.subMap.Get("a"); + Assert.IsTrue(qpm5.terminal); + + // "a a a a a" + Assert.AreEqual(1, qpm5.subMap.Count); + Assert.NotNull(qpm5.subMap.Get("a")); + QueryPhraseMap qpm6 = qpm5.subMap.Get("a"); + Assert.IsTrue(qpm6.terminal); + + // "a a a a a a" + Assert.AreEqual(1, qpm6.subMap.Count); + Assert.NotNull(qpm6.subMap.Get("a")); + QueryPhraseMap qpm7 = qpm6.subMap.Get("a"); + Assert.IsTrue(qpm7.terminal); + } + + [Test] + public void TestQueryPhraseMapOverlap2gram() + { + 
Query query = paB.Parse("abc AND bcd"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + Dictionary map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + QueryPhraseMap qpm = map.Get(F); + Assert.AreEqual(2, qpm.subMap.Count); + + // "ab bc" + Assert.NotNull(qpm.subMap.Get("ab")); + QueryPhraseMap qpm2 = qpm.subMap.Get("ab"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("bc")); + QueryPhraseMap qpm3 = qpm2.subMap.Get("bc"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "ab bc cd" + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("cd")); + QueryPhraseMap qpm4 = qpm3.subMap.Get("cd"); + Assert.IsTrue(qpm4.terminal); + Assert.AreEqual(1F, qpm4.boost); + + // "bc cd" + Assert.NotNull(qpm.subMap.Get("bc")); + qpm2 = qpm.subMap.Get("bc"); + Assert.IsFalse(qpm2.terminal); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("cd")); + qpm3 = qpm2.subMap.Get("cd"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // phraseHighlight = false, fieldMatch = true + fq = new FieldQuery(query, false, true); + map = fq.rootMaps; + Assert.AreEqual(1, map.Count); + Assert.Null(map.Get(null)); + Assert.NotNull(map.Get(F)); + qpm = map.Get(F); + Assert.AreEqual(3, qpm.subMap.Count); + + // "ab bc" + Assert.NotNull(qpm.subMap.Get("ab")); + qpm2 = qpm.subMap.Get("ab"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("bc")); + qpm3 = qpm2.subMap.Get("bc"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "ab bc cd" + Assert.AreEqual(1, qpm3.subMap.Count); + Assert.NotNull(qpm3.subMap.Get("cd")); + qpm4 = qpm3.subMap.Get("cd"); + Assert.IsTrue(qpm4.terminal); + Assert.AreEqual(1F, qpm4.boost); + + // "bc 
cd" + Assert.NotNull(qpm.subMap.Get("bc")); + qpm2 = qpm.subMap.Get("bc"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + Assert.AreEqual(1, qpm2.subMap.Count); + Assert.NotNull(qpm2.subMap.Get("cd")); + qpm3 = qpm2.subMap.Get("cd"); + Assert.IsTrue(qpm3.terminal); + Assert.AreEqual(1F, qpm3.boost); + + // "cd" + Assert.NotNull(qpm.subMap.Get("cd")); + qpm2 = qpm.subMap.Get("cd"); + Assert.IsTrue(qpm2.terminal); + Assert.AreEqual(1F, qpm2.boost); + Assert.AreEqual(0, qpm2.subMap.Count); + } + + [Test] + public void TestSearchPhrase() + { + Query query = PqF("a", "b", "c"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + + // "a" + List<TermInfo> phraseCandidate = new List<TermInfo>(); + phraseCandidate.Add(new TermInfo("a", 0, 1, 0)); + Assert.Null(fq.SearchPhrase(F, phraseCandidate)); + // "a b" + phraseCandidate.Add(new TermInfo("b", 2, 3, 1)); + Assert.Null(fq.SearchPhrase(F, phraseCandidate)); + // "a b c" + phraseCandidate.Add(new TermInfo("c", 4, 5, 2)); + Assert.NotNull(fq.SearchPhrase(F, phraseCandidate)); + Assert.Null(fq.SearchPhrase("x", phraseCandidate)); + + // phraseHighlight = true, fieldMatch = false + fq = new FieldQuery(query, true, false); + + // "a b c" + Assert.NotNull(fq.SearchPhrase(F, phraseCandidate)); //{{DIGY - Failing test.}} + Assert.NotNull(fq.SearchPhrase("x", phraseCandidate)); //{{DIGY - Failing test.}} + //{{DIGY - this may be related to the difference in List implementation between Java and .NET. + //The Java version accepts "null" as a value. 
It is not a show stopper.}} + + // phraseHighlight = false, fieldMatch = true + fq = new FieldQuery(query, false, true); + + // "a" + phraseCandidate.Clear(); + phraseCandidate.Add(new TermInfo("a", 0, 1, 0)); + Assert.NotNull(fq.SearchPhrase(F, phraseCandidate)); + // "a b" + phraseCandidate.Add(new TermInfo("b", 2, 3, 1)); + Assert.Null(fq.SearchPhrase(F, phraseCandidate)); + // "a b c" + phraseCandidate.Add(new TermInfo("c", 4, 5, 2)); + Assert.NotNull(fq.SearchPhrase(F, phraseCandidate)); + Assert.Null(fq.SearchPhrase("x", phraseCandidate)); + } + + [Test] + public void TestSearchPhraseSlop() + { + // "a b c"~0 + Query query = PqF("a", "b", "c"); + + // phraseHighlight = true, fieldMatch = true + FieldQuery fq = new FieldQuery(query, true, true); + + // "a b c" w/ position-gap = 2 + List<TermInfo> phraseCandidate = new List<TermInfo>(); + phraseCandidate.Add(new TermInfo("a", 0, 1, 0)); + phraseCandidate.Add(new TermInfo("b", 2, 3, 2)); + phraseCandidate.Add(new TermInfo("c", 4, 5, 4)); + Assert.Null(fq.SearchPhrase(F, phraseCandidate)); + + // "a b c"~1 + query = pqF(1F, 1, "a", "b", "c"); + + // phraseHighlight = true, fieldMatch = true + fq = new FieldQuery(query, true, true); + + // "a b c" w/ position-gap = 2 + Assert.NotNull(fq.SearchPhrase(F, phraseCandidate)); + + // "a b c" w/ position-gap = 3 + phraseCandidate.Clear(); + phraseCandidate.Add(new TermInfo("a", 0, 1, 0)); + phraseCandidate.Add(new TermInfo("b", 2, 3, 3)); + phraseCandidate.Add(new TermInfo("c", 4, 5, 6)); + Assert.Null(fq.SearchPhrase(F, phraseCandidate)); + } + } + +} Added: lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldTermStackTest.cs URL: http://svn.apache.org/viewvc/lucene/lucene.net/trunk/C%23/contrib/FastVectorHighlighter.Net/Test/FieldTermStackTest.cs?rev=916340&view=auto ============================================================================== --- lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldTermStackTest.cs (added) +++ 
lucene/lucene.net/trunk/C#/contrib/FastVectorHighlighter.Net/Test/FieldTermStackTest.cs Thu Feb 25 16:32:11 2010 @@ -0,0 +1,191 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +using System; +using System.Collections.Generic; +using System.Text; + +using Lucene.Net.Documents; +using Lucene.Net.Search; + +using NUnit.Framework; + + +namespace Lucene.Net.Search.Vectorhighlight +{ + [TestFixture] + public class FieldTermStackTest : AbstractTestCase + { + [Test] + public void Test1Term() + { + MakeIndex(); + + FieldQuery fq = new FieldQuery(Tq("a"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(6, stack.termList.Count); + Assert.AreEqual("a(0,1,0)", stack.Pop().ToString()); + Assert.AreEqual("a(2,3,1)", stack.Pop().ToString()); + Assert.AreEqual("a(4,5,2)", stack.Pop().ToString()); + Assert.AreEqual("a(12,13,6)", stack.Pop().ToString()); + Assert.AreEqual("a(28,29,14)", stack.Pop().ToString()); + Assert.AreEqual("a(32,33,16)", stack.Pop().ToString()); + } + + [Test] + public void Test2Terms() + { + MakeIndex(); + + BooleanQuery query = new BooleanQuery(); + query.Add(Tq("b"), Lucene.Net.Search.BooleanClause.Occur.SHOULD); + query.Add(Tq("c"), 
Lucene.Net.Search.BooleanClause.Occur.SHOULD); + FieldQuery fq = new FieldQuery(query, true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(8, stack.termList.Count); + Assert.AreEqual("b(6,7,3)", stack.Pop().ToString()); + Assert.AreEqual("b(8,9,4)", stack.Pop().ToString()); + Assert.AreEqual("c(10,11,5)", stack.Pop().ToString()); + Assert.AreEqual("b(14,15,7)", stack.Pop().ToString()); + Assert.AreEqual("b(16,17,8)", stack.Pop().ToString()); + Assert.AreEqual("c(18,19,9)", stack.Pop().ToString()); + Assert.AreEqual("b(26,27,13)", stack.Pop().ToString()); + Assert.AreEqual("b(30,31,15)", stack.Pop().ToString()); + } + + [Test] + public void Test1Phrase() + { + MakeIndex(); + + FieldQuery fq = new FieldQuery(PqF("c", "d"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(3, stack.termList.Count); + Assert.AreEqual("c(10,11,5)", stack.Pop().ToString()); + Assert.AreEqual("c(18,19,9)", stack.Pop().ToString()); + Assert.AreEqual("d(20,21,10)", stack.Pop().ToString()); + } + + private void MakeIndex() + { + // 111111111122222 + // 0123456789012345678901234 (offsets) + // a a a b b c a b b c d e f + // 0 1 2 3 4 5 6 7 8 9101112 (position) + String value1 = "a a a b b c a b b c d e f"; + // 222233333 + // 678901234 (offsets) + // b a b a f + //1314151617 (position) + String value2 = "b a b a f"; + + Make1dmfIndex(value1, value2); + } + + [Test] + public void Test1TermB() + { + makeIndexB(); + + FieldQuery fq = new FieldQuery(Tq("ab"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(2, stack.termList.Count); + Assert.AreEqual("ab(2,4,2)", stack.Pop().ToString()); + Assert.AreEqual("ab(6,8,6)", stack.Pop().ToString()); + } + + [Test] + public void Test2TermsB() + { + makeIndexB(); + + BooleanQuery query = new BooleanQuery(); + query.Add(Tq("bc"), Lucene.Net.Search.BooleanClause.Occur.SHOULD); + query.Add(Tq("ef"), 
Lucene.Net.Search.BooleanClause.Occur.SHOULD); + FieldQuery fq = new FieldQuery(query, true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(3, stack.termList.Count); + Assert.AreEqual("bc(4,6,4)", stack.Pop().ToString()); + Assert.AreEqual("bc(8,10,8)", stack.Pop().ToString()); + Assert.AreEqual("ef(11,13,11)", stack.Pop().ToString()); + } + + [Test] + public void Test1PhraseB() + { + makeIndexB(); + + FieldQuery fq = new FieldQuery(PqF("ab", "bb"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(4, stack.termList.Count); + Assert.AreEqual("ab(2,4,2)", stack.Pop().ToString()); + Assert.AreEqual("bb(3,5,3)", stack.Pop().ToString()); + Assert.AreEqual("ab(6,8,6)", stack.Pop().ToString()); + Assert.AreEqual("bb(7,9,7)", stack.Pop().ToString()); + } + + private void makeIndexB() + { + // 1 11 11 + // 01 12 23 34 45 56 67 78 89 90 01 12 (offsets) + // aa|aa|ab|bb|bc|ca|ab|bb|bc|cd|de|ef + // 0 1 2 3 4 5 6 7 8 9 10 11 (position) + String value = "aaabbcabbcdef"; + + Make1dmfIndexB(value); + } + + [Test] + public void Test1PhraseShortMV() + { + MakeIndexShortMV(); + + FieldQuery fq = new FieldQuery(Tq("d"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(1, stack.termList.Count); + Assert.AreEqual("d(6,7,3)", stack.Pop().ToString()); + } + + [Test] + public void Test1PhraseLongMV() + { + MakeIndexLongMV(); + + FieldQuery fq = new FieldQuery(PqF("search", "engines"), true, true); + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(4, stack.termList.Count); + Assert.AreEqual("search(102,108,14)", stack.Pop().ToString()); + Assert.AreEqual("engines(109,116,15)", stack.Pop().ToString()); + Assert.AreEqual("search(157,163,24)", stack.Pop().ToString()); + Assert.AreEqual("engines(164,171,25)", stack.Pop().ToString()); + } + + [Test] + public void Test1PhraseMVB() + { + MakeIndexLongMVB(); + + FieldQuery fq = new 
FieldQuery(PqF("sp", "pe", "ee", "ed"), true, true); // "speed" -(2gram)-> "sp","pe","ee","ed" + FieldTermStack stack = new FieldTermStack(reader, 0, F, fq); + Assert.AreEqual(4, stack.termList.Count); + Assert.AreEqual("sp(88,90,61)", stack.Pop().ToString()); + Assert.AreEqual("pe(89,91,62)", stack.Pop().ToString()); + Assert.AreEqual("ee(90,92,63)", stack.Pop().ToString()); + Assert.AreEqual("ed(91,93,64)", stack.Pop().ToString()); + } + } +}
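The expected strings in Test1TermB, Test2TermsB and Test1PhraseB follow the 2-gram layout drawn in the makeIndexB() comment. A minimal sketch of that layout is below, assuming each bigram starts one character after the previous one and that its position equals its start offset. BigramSketch and Tokenize are hypothetical names used only for illustration; they are not part of Lucene.Net, and the analyzer that Make1dmfIndexB actually uses is not shown in this commit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Lucene.Net): reproduces the bigram layout
// sketched in the makeIndexB() comment so the expected values can be checked by hand.
public static class BigramSketch
{
    // Returns one entry per bigram, formatted as text(startOffset,endOffset,position),
    // the same shape as the strings compared against stack.Pop().ToString().
    public static List<string> Tokenize(string value)
    {
        List<string> result = new List<string>();
        for (int i = 0; i + 2 <= value.Length; i++)
        {
            // Assumption: each bigram advances by one character and one position.
            result.Add(value.Substring(i, 2) + "(" + i + "," + (i + 2) + "," + i + ")");
        }
        return result;
    }
}
```

Calling BigramSketch.Tokenize("aaabbcabbcdef") yields twelve entries. The entries at indexes 2 and 6 are "ab(2,4,2)" and "ab(6,8,6)", which are the two values Test1TermB expects for the term "ab".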