lucene-java-user mailing list archives

From Piotr Pęzik <piotr.pe...@gmail.com>
Subject SpanNearQuery -- bug or feature?
Date Sat, 11 Jan 2014 00:01:05 GMT
Hi,

could anyone please tell me if the following behavior is expected in 
Lucene 4.5?

Let's assume we have an index with two documents:

1. contents: "test bunga bunga test"
2. contents: "test bunga test"

We run two SpanNearQueries against this index:

1. spanNear([contents:bunga, contents:bunga], 0, true)
2. spanNear([contents:bunga, contents:bunga], 0, false)

For the first query we get 1 hit. The first document in the example 
above gets matched and the second one doesn't. This makes sense, because 
here we want the term "bunga" immediately followed by another "bunga".

For the second query both documents get matched. Why does the second 
document with a single occurrence of 'bunga' get matched?

A complete example follows.

Thanks in advance!



Piotr


-----------

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import java.io.StringReader;
import static org.junit.Assert.assertEquals;

class SpansBug {

     public static void main(String [] args) throws Exception {

         Directory dir = new RAMDirectory();
         Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
         IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_45, analyzer);

         IndexWriter writer = new IndexWriter(dir, iwc);
         String contents = "contents";
         Document doc1 = new Document();
         doc1.add(new TextField(contents, new StringReader("test bunga bunga test")));
         Document doc2 = new Document();
         doc2.add(new TextField(contents, new StringReader("test bunga test")));

         writer.addDocument(doc1);
         writer.addDocument(doc2);

         writer.commit();

         IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));

         SpanQuery stq1 = new SpanTermQuery(new Term(contents,"bunga"));
         SpanQuery stq2 = new SpanTermQuery(new Term(contents,"bunga"));
         SpanQuery [] spqa = new SpanQuery[]{stq1,stq2};

         SpanNearQuery spanQ1 = new SpanNearQuery(spqa,0, true);
         SpanNearQuery spanQ2 = new SpanNearQuery(spqa,0, false);

         System.out.println(spanQ1);

         TopDocs tdocs1 = searcher.search(spanQ1,10);
         // JUnit expects (expected, actual); only doc1 matches the ordered query
         assertEquals(1, tdocs1.totalHits);

         System.out.println(spanQ2);

         TopDocs tdocs2 = searcher.search(spanQ2,10);
         // Why does the following assertion fail? Both documents match here.
         assertEquals(1, tdocs2.totalHits);


     }
}
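
As a side note, here is a toy model in plain Java (no Lucene involved) of
one reading that would explain the result. It is only a sketch under an
assumption I have not verified against the Lucene source: that the
unordered variant compares the width of the window covering both clause
matches with the summed clause lengths, and does not require the two
clauses to match at different positions. The class and method names are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene code, and not a claim about how Lucene is implemented.
class SpanNearToyModel {

    // Token positions at which `term` occurs in a whitespace-tokenized text.
    static List<Integer> positions(String text, String term) {
        List<Integer> out = new ArrayList<>();
        String[] tokens = text.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                out.add(i);
            }
        }
        return out;
    }

    // Ordered, slop 0: the second clause must start exactly where the first one ends.
    static boolean orderedNear(String text, String a, String b) {
        for (int pa : positions(text, a)) {
            for (int pb : positions(text, b)) {
                if (pb == pa + 1) {
                    return true;
                }
            }
        }
        return false;
    }

    // Unordered, under the ASSUMPTION stated above: only the window width is
    // checked, so nothing stops both clauses from using the same position.
    static boolean unorderedNear(String text, String a, String b, int slop) {
        int totalLength = 2; // two single-token clauses
        for (int pa : positions(text, a)) {
            for (int pb : positions(text, b)) {
                int start = Math.min(pa, pb);
                int end = Math.max(pa, pb) + 1; // end is exclusive
                if (end - start - totalLength <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc1 = "test bunga bunga test";
        String doc2 = "test bunga test";
        System.out.println(orderedNear(doc1, "bunga", "bunga"));      // true
        System.out.println(orderedNear(doc2, "bunga", "bunga"));      // false
        System.out.println(unorderedNear(doc1, "bunga", "bunga", 0)); // true
        System.out.println(unorderedNear(doc2, "bunga", "bunga", 0)); // true
    }
}
```

Under that assumption both clauses can bind to the same single "bunga" in
the second document, which would account for the extra hit.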

