lucene-java-user mailing list archives

From Scott Selvia <ssel...@gmail.com>
Subject Exact Phrase Search returning incorrect results
Date Wed, 11 Jun 2014 16:48:07 GMT
I’m having an issue searching for an exact phrase with Lucene 4.7.  For my use case I loaded
the Declaration of Independence into a Lucene index, one document per line of text.  When I
search for “it becomes” I get two hits: one for “it, becomes” and another for a line that
just has “becomes” at the end of the line.

Expected:

“When, in the course of human events, it becomes necessary for one people to dissolve the”

Not Expected:

“powers from the consent of the governed. That whenever any form of government becomes”
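
To be explicit about what I mean by an exact phrase match, here is a minimal plain-Java sketch
(no Lucene involved) of the behavior I expect with a slop of 0: the query words must appear
next to each other, in order.  The class name, the helper methods, and the simple
lowercase-and-split tokenizer are my own assumptions for illustration only; they are not how
StandardAnalyzer or Lucene's phrase matching actually work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class ExpectedPhraseMatch {

    // Hypothetical tokenizer: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True only if the phrase tokens occur consecutively and in order (slop 0).
    static boolean containsExactPhrase(String line, String phrase) {
        List<String> phraseTokens = tokenize(phrase);
        if (phraseTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(tokenize(line), phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        String expected =
            "When, in the course of human events, it becomes necessary for one people to dissolve the";
        String notExpected =
            "powers from the consent of the governed. That whenever any form of government becomes";

        System.out.println(containsExactPhrase(expected, "it becomes"));    // prints "true"
        System.out.println(containsExactPhrase(notExpected, "it becomes")); // prints "false"
    }
}
```

In other words, the second line should not be a hit, because “becomes” is not preceded by “it”.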

Below is my load code and search code:

// idxLinesFile was not declared in my original snippet; it is the index directory
File idxLinesFile = new File("test lucene index");
Directory idxLinesDir = FSDirectory.open(idxLinesFile);
Analyzer analyzerLines = new StandardAnalyzer(Version.LUCENE_47);
IndexWriterConfig iwcLines = new IndexWriterConfig(Version.LUCENE_47, analyzerLines);
// Append to an existing index, otherwise create a new one
iwcLines.setOpenMode(idxLinesFile.exists()
        ? IndexWriterConfig.OpenMode.CREATE_OR_APPEND
        : IndexWriterConfig.OpenMode.CREATE);

IndexWriter writerLines = new IndexWriter(idxLinesDir, iwcLines);

// One document per line of text
for (int i = 0; i < arrayListOfLines.size(); i++)
{
     Document docLine = new Document();
     // Key of the form "pageNumber:lineNumber", each zero-padded to six digits
     docLine.add(new StringField("docIndex", String.format("%06d:%06d", pageNumber, i), Field.Store.YES));
     docLine.add(new TextField("lineText", arrayListOfLines.get(i), Field.Store.YES));

     writerLines.addDocument(docLine);
}

// Search Code

Directory idxDir = FSDirectory.open(idxFile);
IndexReader reader = DirectoryReader.open(idxDir);
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
QueryParser parser = new QueryParser(Version.LUCENE_47, "lineText", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
parser.setPhraseSlop(0);

// Build the phrase query directly from the raw text
Query query = parser.createPhraseQuery("lineText", "it becomes");
// First pass counts the hits, second pass fetches all of them
TotalHitCountCollector collector = new TotalHitCountCollector();
searcher.search(query, collector);
TopDocs results = searcher.search(query, Math.max(1, collector.getTotalHits()));
ScoreDoc[] hits = results.scoreDocs;

