lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages
Date Tue, 12 Mar 2013 20:09:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600411#comment-13600411 ]

Robert Muir commented on LUCENE-4826:
-------------------------------------

+1!

Here is a smaller test. To trick it into failing, the document must look something like:
Great Sentence. Crappy Sentence. Good Sentence.

Otherwise the passages never make it into the PQ to demonstrate the bug...

{code}
  public void testPassageRanking() throws Exception {
    Directory dir = newDirectory();
    IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random(), MockTokenizer.SIMPLE, true));
    iwc.setMergePolicy(newLogMergePolicy());
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwc);
    
    FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
    offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    Field body = new Field("body", "", offsetsType);
    Document doc = new Document();
    doc.add(body);
    
    body.setStringValue("This is a test.  Just highlighting from postings. This is also a much sillier test.  Feel free to test test test test test test test.");
    iw.addDocument(doc);
    
    IndexReader ir = iw.getReader();
    iw.close();
    
    IndexSearcher searcher = newSearcher(ir);
    PostingsHighlighter highlighter = new PostingsHighlighter();
    Query query = new TermQuery(new Term("body", "test"));
    TopDocs topDocs = searcher.search(query, null, 10, Sort.INDEXORDER);
    assertEquals(1, topDocs.totalHits);
    String[] snippets = highlighter.highlight("body", query, searcher, topDocs, 2);
    assertEquals(1, snippets.length);
    assertEquals("This is a <b>test</b>.  ... Feel free to <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b>.", snippets[0]);
    
    ir.close();
    dir.close();
  }
{code}
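
For anyone wondering why the sentence order matters, here is a rough standalone sketch of a bounded top-N queue. It is not the PostingsHighlighter code: the Scored class, the keepTopN helper, the scores, and the skip-then-evict loop are all assumptions made up for illustration, guessed from the remark that passages "never make it into the PQ". With the comparator reversed, the head of the queue is the best passage instead of the worst, so a later good passage is compared against the wrong element and dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopNPassages {
    // Hypothetical stand-in for a scored passage (not Lucene's Passage class).
    static final class Scored {
        final String text;
        final double score;
        Scored(String text, double score) { this.text = text; this.score = score; }
    }

    // Correct order for top-N: the lowest score sits at the head, ready to be evicted.
    static final Comparator<Scored> LOWEST_FIRST =
        (a, b) -> Double.compare(a.score, b.score);

    // The "backwards" order: the highest score sits at the head.
    static final Comparator<Scored> HIGHEST_FIRST = LOWEST_FIRST.reversed();

    // Assumed loop shape: skip a candidate that scores below the head of a full
    // queue, otherwise offer it and evict the head on overflow.
    static List<String> keepTopN(List<Scored> candidates, int n, Comparator<Scored> order) {
        PriorityQueue<Scored> pq = new PriorityQueue<>(n, order);
        for (Scored c : candidates) {
            if (pq.size() == n && c.score < pq.peek().score) {
                continue; // never makes it into the PQ
            }
            pq.offer(c);
            if (pq.size() > n) {
                pq.poll();
            }
        }
        // Report survivors from best to worst so results are easy to compare.
        List<Scored> kept = new ArrayList<>(pq);
        kept.sort((a, b) -> Double.compare(b.score, a.score));
        List<String> out = new ArrayList<>();
        for (Scored s : kept) {
            out.add(s.text);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up scores for the "Great. Crappy. Good." layout.
        List<Scored> doc = Arrays.asList(
            new Scored("Great", 3.0), new Scored("Crappy", 1.0), new Scored("Good", 2.0));
        System.out.println(keepTopN(doc, 2, LOWEST_FIRST));   // [Great, Good]
        System.out.println(keepTopN(doc, 2, HIGHEST_FIRST));  // [Great, Crappy]
    }
}
```

In this model, reordering the input to Great, Good, Crappy gives [Great, Good] under both comparators, which would explain why a test without a weak sentence in the middle passes by accident.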
                
> PostingsHighlighter doesn't keep the top N best scoring passages
> ----------------------------------------------------------------
>
>                 Key: LUCENE-4826
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4826
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>            Reporter: Michael McCandless
>             Fix For: 5.0, 4.3
>
>         Attachments: LUCENE-4826.patch
>
>
> The comparator we pass to the PQ is just backwards ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

