lucene-dev mailing list archives

Subject RE : New highlighter package available
Date Thu, 25 Sep 2003 20:52:55 GMT
Thanks for the feedback on the highlighter package.
Here are some responses to the issues raised:

>>what may be the performance implications seeing that
>>the method query.rewrite(reader) seems to be called twice, one for
>>querying, once for highlighting.

One solution is to do this before calling the highlighter:

  query=query.rewrite(reader); //turn into a primitive query
  Hits hits =;
  QueryHighlightExtractor h =
      new QueryHighlightExtractor(reader, query, new StandardAnalyzer(), "<B>", "</B>");

Would you want the highlighter to enforce this optimisation by insisting that
queries passed to it are not multi-term ones that require expansion? That
way we would not need to pass an IndexReader to the highlighter constructors, but we
would have to redefine them so they can throw a "QueryNotRewrittenException" when
un-expanded queries are passed.
It seems a bit heavy-handed to beat people over the head like this for not passing
a pre-optimized query. Maybe the best solution is to remove support for highlighting
multi-term queries entirely from the highlighter: the caller must call rewrite() BEFORE calling
the highlighter if they expect multi-terms to be highlighted. I think that's my favoured
approach. Thoughts?
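
To make the "caller rewrites first" idea concrete, here is a minimal sketch that does not use Lucene at all. The class RewriteFirstSketch and its rewrite() and highlight() methods are hypothetical stand-ins: rewrite() plays the role of query.rewrite(reader) for a simple trailing-* pattern, and highlight() accepts only primitive terms, so it needs no index access. The IllegalArgumentException is an assumption standing in for the "QueryNotRewrittenException" idea; dropping that check gives the quieter variant, where un-expanded queries are simply not highlighted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RewriteFirstSketch {

    // Stand-in for query.rewrite(reader): expands a trailing-* pattern
    // against the terms of one index. Illustration only, not Lucene code.
    public static List<String> rewrite(String queryText, List<String> indexTerms) {
        List<String> expanded = new ArrayList<>();
        if (queryText.endsWith("*")) {
            String prefix = queryText.substring(0, queryText.length() - 1);
            for (String term : indexTerms) {
                if (term.startsWith(prefix)) {
                    expanded.add(term);
                }
            }
        } else {
            expanded.add(queryText);
        }
        return expanded;
    }

    // Highlighter that accepts only primitive (already expanded) terms.
    // It rejects un-expanded patterns instead of expanding them itself.
    public static String highlight(String text, List<String> terms, String pre, String post) {
        for (String term : terms) {
            if (term.indexOf('*') >= 0) {
                throw new IllegalArgumentException("query not rewritten: " + term);
            }
        }
        StringBuilder out = new StringBuilder();
        Matcher m = Pattern.compile("\\w+").matcher(text);
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());   // copy the gap before the token
            String token = m.group();
            boolean hit = terms.contains(token.toLowerCase(Locale.ROOT));
            out.append(hit ? pre + token + post : token);
            last = m.end();
        }
        out.append(text, last, text.length());   // copy the tail after the last token
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> indexTerms = Arrays.asList("multione", "multitwo", "other");
        // Expand once, then reuse the same term list for searching and highlighting.
        List<String> terms = rewrite("multi*", indexTerms);
        // prints <B>multiOne</B> and other
        System.out.println(highlight("multiOne and other", terms, "<B>", "</B>"));
    }
}
```

The point of the sketch is only that the expansion cost is paid once by the caller, and the highlighter's constructor or method signature no longer needs anything that can read the index.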

>>Is it possible to split the logic (2 classes ?) which :
>>a) handles highlighting
>>b) grabs Query terms (method getTerms and its dependencies)

The TextHighlighter class already handles highlighting only (independent of
query terms).
The getTerms() function is made public in QueryHighlighter as I thought it might be of
use to some people. I guess I could move it into a static function on a utility class somewhere,
but I struggle to think of uses outside of text highlighting. Surely the query classes offer
better metadata about a query (e.g. phrases, boosts etc.), so does this "Term[] getTerms(Query)"
warrant a specialised home anywhere?
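
For anyone weighing up the utility-class option, this is roughly what a static getTerms() looks like when separated from the highlighter. It is a sketch over a made-up query tree, not the package's code: Node, TermNode, PhraseNode and BooleanNode are hypothetical stand-ins for real query classes. It also shows the limitation mentioned above, since the flattened result no longer knows which terms belonged to a phrase.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class QueryTermsSketch {

    // Hypothetical query tree, standing in for real query classes.
    public interface Node {}

    public static final class TermNode implements Node {
        final String text;
        public TermNode(String text) { this.text = text; }
    }

    public static final class PhraseNode implements Node {
        final List<String> words;
        public PhraseNode(String... words) { this.words = Arrays.asList(words); }
    }

    public static final class BooleanNode implements Node {
        final List<Node> clauses;
        public BooleanNode(Node... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    // Static utility: flattens the tree into distinct terms in first-seen order.
    // Phrase grouping is lost in the result.
    public static Set<String> getTerms(Node query) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query, terms);
        return terms;
    }

    private static void collect(Node node, Set<String> terms) {
        if (node instanceof TermNode) {
            terms.add(((TermNode) node).text);
        } else if (node instanceof PhraseNode) {
            terms.addAll(((PhraseNode) node).words);
        } else if (node instanceof BooleanNode) {
            for (Node clause : ((BooleanNode) node).clauses) {
                collect(clause, terms);
            }
        }
    }

    public static void main(String[] args) {
        Node query = new BooleanNode(
                new TermNode("multione"),
                new PhraseNode("new", "highlighter"),
                new TermNode("multione"));
        System.out.println(getTerms(query)); // prints [multione, new, highlighter]
    }
}
```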

>>Does anyone know if this package supports highlighting in MultiSearcher

This works but looks ugly:
//setup index 1
RAMDirectory ramDir1 = new RAMDirectory();
IndexWriter writer1 = new IndexWriter(ramDir1, new StandardAnalyzer(), true);
Document d = new Document();
Field f = new Field(FIELD_NAME, "multiOne", true, true, true);
IndexReader reader1 =;

//setup index 2
RAMDirectory ramDir2 = new RAMDirectory();
IndexWriter writer2 = new IndexWriter(ramDir2, new StandardAnalyzer(), true);
d = new Document();
f = new Field(FIELD_NAME, "multiTwo", true, true, true);
IndexReader reader2 =;

IndexSearcher[] searchers = new IndexSearcher[2];
searchers[0] = new IndexSearcher(ramDir1);
searchers[1] = new IndexSearcher(ramDir2);
MultiSearcher multiSearcher = new MultiSearcher(searchers);
query = QueryParser.parse("multi*", FIELD_NAME, new StandardAnalyzer());
System.out.println("Searching for: " + query.toString(FIELD_NAME));
hits =;

//Now do some query expansion
Query[] expandedQueries = new Query[2];
Query combinedExpandedQuery = query.combine(expandedQueries);

//NB The reader passed here is irrelevant as the query is expanded
QueryHighlightExtractor highlighter = new QueryHighlightExtractor(this, reader2,
    combinedExpandedQuery, new StandardAnalyzer());
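
The shape of the multi-index case, expand against each index and then combine, can be sketched without Lucene as follows. Everything here is an assumption made for illustration: expand() stands in for rewriting the query against one reader, and combine() models the combining step as a plain union of the per-index expansions, which may not match what query.combine() does internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MultiIndexExpansionSketch {

    // Stand-in for rewriting the query against one index:
    // returns the terms in that index which start with the prefix.
    public static Set<String> expand(String prefix, Set<String> indexTerms) {
        Set<String> matches = new TreeSet<>();
        for (String term : indexTerms) {
            if (term.startsWith(prefix)) {
                matches.add(term);
            }
        }
        return matches;
    }

    // Stand-in for the combining step, modelled as a union of the expansions.
    public static Set<String> combine(List<Set<String>> expansions) {
        Set<String> combined = new TreeSet<>();
        for (Set<String> expansion : expansions) {
            combined.addAll(expansion);
        }
        return combined;
    }

    public static void main(String[] args) {
        Set<String> index1 = new TreeSet<>(Arrays.asList("multione"));
        Set<String> index2 = new TreeSet<>(Arrays.asList("multitwo", "single"));

        // One expansion per index, then a single combined term set for highlighting.
        Set<String> combined = combine(Arrays.asList(
                expand("multi", index1),
                expand("multi", index2)));

        System.out.println(combined); // prints [multione, multitwo]
    }
}
```

Once the combined set exists, a highlighter that works from primitive terms has no further need of either index, which is why the reader passed in the snippet above does not matter.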

Thanks again

