lucene-dev mailing list archives

From markharw...@yahoo.co.uk
Subject RE : New highlighter package available
Date Thu, 25 Sep 2003 20:52:55 GMT
Thanks for the feedback on the highlighter package.
Here are some responses to the issues raised:

>>what may be the performance implications seeing that
>>the method query.rewrite(reader) seems to be called twice, one for
>>querying, once for highlighting.

One solution is to rewrite the query yourself, once, before searching and
before constructing the highlighter:

  query = query.rewrite(reader); // turn into a primitive query
  Hits hits = searcher.search(query);
  QueryHighlightExtractor h =
      new QueryHighlightExtractor(reader, query, new StandardAnalyzer(), "<B>", "</B>");

Would you want the highlighter to enforce this optimisation by insisting that
queries passed to it are not multi-term ones that still require expansion? In
that case we would no longer need to pass an IndexReader to the highlighter
constructors, and we would redefine them to throw a "QueryNotRewrittenException"
when an un-expanded query is passed.
It seems a bit heavy-handed to beat people over the head like this for not
passing a pre-optimized query. Maybe the best solution is to remove support for
highlighting un-rewritten multi-term queries from the highlighter entirely - the
caller must call rewrite() BEFORE calling the highlighter if they expect
multi-terms to be highlighted. I think that's my favoured approach - thoughts?
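
As a rough illustration of the stricter option discussed above, the check
could look something like the sketch below. It deliberately uses small
stand-in types (Node, TermNode, PrefixNode, BooleanNode) rather than the real
Lucene query classes, and the exception is only a hypothetical class named
after the one floated in this message, so treat it as an outline of the idea
and not as the highlighter's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are NOT Lucene classes.
final class RewriteGuardSketch {

    interface Node {}

    // A single concrete term, e.g. field "f", text "multione".
    static final class TermNode implements Node {
        final String field;
        final String text;
        TermNode(String field, String text) { this.field = field; this.text = text; }
    }

    // A multi-term pattern such as multi* that still needs expanding.
    static final class PrefixNode implements Node {
        final String field;
        final String prefix;
        PrefixNode(String field, String prefix) { this.field = field; this.prefix = prefix; }
    }

    // A container whose clauses may or may not be primitive themselves.
    static final class BooleanNode implements Node {
        final List<Node> clauses = new ArrayList<Node>();
        BooleanNode add(Node clause) { clauses.add(clause); return this; }
    }

    // Hypothetical exception, named after the one suggested in the message.
    static final class QueryNotRewrittenException extends IllegalArgumentException {
        QueryNotRewrittenException(String message) { super(message); }
    }

    // True when the tree contains only concrete terms (nothing left to expand).
    static boolean isPrimitive(Node query) {
        if (query instanceof TermNode) {
            return true;
        }
        if (query instanceof BooleanNode) {
            for (Node clause : ((BooleanNode) query).clauses) {
                if (!isPrimitive(clause)) {
                    return false;
                }
            }
            return true;
        }
        return false; // PrefixNode, or any type this sketch does not know about
    }

    // What a strict highlighter constructor might call before doing any work.
    static void requireRewritten(Node query) {
        if (!isPrimitive(query)) {
            throw new QueryNotRewrittenException("call rewrite() before highlighting");
        }
    }
}
```

With a guard like this the highlighter would need no IndexReader at all, at
the cost of rejecting every caller who forgets to rewrite first.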


>>Is it possible to split the logic (2 classes ?) which :
>>a) handles highlighting
>>b) grabs Query terms (method getTerms and its dependencies)

The TextHighlighter class is already a class that purely handles highlighting
(independent of query terms).
The getTerms() function is made public in QueryHighlighter as I thought it
might be of use to some people. I guess I could move it into a static function
on a utility class somewhere, but I struggle to think of uses outside of text
highlighting. Surely the query classes offer better metadata about a query
(e.g. phrases, boosts etc.), so does this "Term[] getTerms(Query)" function
warrant a specialised home anywhere?
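
For anyone weighing that question, here is a minimal sketch of what a
standalone static term-extraction utility might look like. It again uses
stand-in types (Node, TermNode, PhraseNode, BooleanNode) instead of the real
Lucene classes, returns plain "field:text" strings rather than Term objects,
and is only an assumption about the general shape of such a helper, not the
package's getTerms() implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-in types for illustration only; these are NOT Lucene classes.
final class QueryTermsSketch {

    interface Node {}

    static final class TermNode implements Node {
        final String field;
        final String text;
        TermNode(String field, String text) { this.field = field; this.text = text; }
    }

    static final class PhraseNode implements Node {
        final String field;
        final String[] words;
        PhraseNode(String field, String... words) { this.field = field; this.words = words; }
    }

    static final class BooleanNode implements Node {
        final List<Node> clauses = new ArrayList<Node>();
        BooleanNode add(Node clause) { clauses.add(clause); return this; }
    }

    // Flattens the query tree into "field:text" strings, in first-seen order
    // and without duplicates.
    static List<String> getTerms(Node query) {
        Set<String> terms = new LinkedHashSet<String>();
        collect(query, terms);
        return new ArrayList<String>(terms);
    }

    private static void collect(Node query, Set<String> terms) {
        if (query instanceof TermNode) {
            TermNode t = (TermNode) query;
            terms.add(t.field + ":" + t.text);
        } else if (query instanceof PhraseNode) {
            PhraseNode p = (PhraseNode) query;
            for (String word : p.words) {
                terms.add(p.field + ":" + word); // phrase order and adjacency are lost here
            }
        } else if (query instanceof BooleanNode) {
            for (Node clause : ((BooleanNode) query).clauses) {
                collect(clause, terms);
            }
        }
        // Any other node type contributes no terms in this sketch.
    }
}
```

Note that the flattened result no longer says which words formed a phrase,
which is the kind of metadata the query classes themselves keep.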


>>Does anyone know if this package supports highlighting in MultiSearcher
>>environments?
This works but looks ugly:
//setup index 1
RAMDirectory ramDir1 = new RAMDirectory();
IndexWriter writer1 = new IndexWriter(ramDir1, new StandardAnalyzer(), true);
Document d = new Document();
Field f = new Field(FIELD_NAME, "multiOne", true, true, true);
d.add(f);
writer1.addDocument(d);
writer1.optimize();
writer1.close();
IndexReader reader1 = IndexReader.open(ramDir1);

//setup index 2
RAMDirectory ramDir2 = new RAMDirectory();
IndexWriter writer2 = new IndexWriter(ramDir2, new StandardAnalyzer(), true);
d = new Document();
f = new Field(FIELD_NAME, "multiTwo", true, true, true);
d.add(f);
writer2.addDocument(d);
writer2.optimize();
writer2.close();
IndexReader reader2 = IndexReader.open(ramDir2);



IndexSearcher[] searchers = new IndexSearcher[2];
searchers[0] = new IndexSearcher(ramDir1);
searchers[1] = new IndexSearcher(ramDir2);
MultiSearcher multiSearcher = new MultiSearcher(searchers);
query = QueryParser.parse("multi*", FIELD_NAME, new StandardAnalyzer());
System.out.println("Searching for: " + query.toString(FIELD_NAME));
hits = multiSearcher.search(query);

//Now do some query expansion: rewrite against each index, then merge
Query[] expandedQueries = new Query[2];
expandedQueries[0] = query.rewrite(reader1);
expandedQueries[1] = query.rewrite(reader2);
Query combinedExpandedQuery = query.combine(expandedQueries);

//NB The reader passed here is irrelevant as the query is already expanded
QueryHighlightExtractor highlighter =
    new QueryHighlightExtractor(this, reader2, combinedExpandedQuery, new StandardAnalyzer());


Thanks again
Mark


  
  


