lucene-dev mailing list archives

From markharw...@yahoo.co.uk
Subject Re: Query Term Collector (was: Re: New highlighter package available)
Date Sat, 04 Oct 2003 09:47:05 GMT
With regard to Korfut's TermCollector proposal:
I do not like the new requirement for all query classes to implement
getTerms(). This is effectively what they are already required to do in
the query.rewrite() method: express their high-level logic in primitive
terms.

I believe the getTerms() implementation should make use of this existing
feature of all query objects (as I have done in
QueryHighlightExtractor.java) rather than create a new set of
requirements for all query classes. Let's not add complexity where it's
not needed.
So, I think the real question is: should there be a home for a getTerms()
function that operates on primitive (rewritten) queries?
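
To make the idea concrete, here is a minimal sketch of a single getTerms() helper that first rewrites a query and then walks only the primitive query types. All class names below (SketchQuery, SketchTermQuery, SketchBooleanQuery, SketchPrefixQuery, TermCollectorSketch) are hypothetical stand-ins written for this illustration; they are not Lucene's classes and this is not the code in QueryHighlightExtractor.java. The list of index terms stands in for the index reader that a real rewrite would consult.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT Lucene's classes.
interface SketchQuery {
    // Mirrors the idea of query.rewrite(): return an equivalent query
    // expressed only in primitive clauses.
    SketchQuery rewrite(List<String> indexTerms);
}

final class SketchTermQuery implements SketchQuery {
    final String term;
    SketchTermQuery(String term) { this.term = term; }
    public SketchQuery rewrite(List<String> indexTerms) { return this; } // already primitive
}

final class SketchBooleanQuery implements SketchQuery {
    final List<SketchQuery> clauses = new ArrayList<>();
    SketchBooleanQuery add(SketchQuery q) { clauses.add(q); return this; }
    public SketchQuery rewrite(List<String> indexTerms) {
        SketchBooleanQuery out = new SketchBooleanQuery();
        for (SketchQuery c : clauses) out.add(c.rewrite(indexTerms));
        return out;
    }
}

final class SketchPrefixQuery implements SketchQuery {
    final String prefix;
    SketchPrefixQuery(String prefix) { this.prefix = prefix; }
    // A high-level query expands into a boolean query over matching index terms.
    public SketchQuery rewrite(List<String> indexTerms) {
        SketchBooleanQuery out = new SketchBooleanQuery();
        for (String t : indexTerms) {
            if (t.startsWith(prefix)) out.add(new SketchTermQuery(t));
        }
        return out;
    }
}

final class TermCollectorSketch {
    // One external helper: rewrite first, then handle only the primitive types.
    // New high-level query classes need no getTerms() of their own.
    static Set<String> getTerms(SketchQuery query, List<String> indexTerms) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query.rewrite(indexTerms), terms);
        return terms;
    }

    private static void collect(SketchQuery q, Set<String> terms) {
        if (q instanceof SketchTermQuery) {
            terms.add(((SketchTermQuery) q).term);
        } else if (q instanceof SketchBooleanQuery) {
            for (SketchQuery c : ((SketchBooleanQuery) q).clauses) collect(c, terms);
        } else {
            throw new IllegalArgumentException("Not a primitive query: " + q.getClass());
        }
    }
}
```

The point of the design is that only the helper needs to know about the primitive types; a prefix query (or any other high-level query) contributes its terms simply because its rewrite already produces them.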

We can move some of the logic in QueryHighlightExtractor.java into the
core if the consensus is that this is a generally useful feature (though
I have yet to think of a use for it outside of highlighting).

Incidentally, I am busy packaging up a getTopTerms() feature that
analyses the contents of query result sets and returns the "significant"
terms and phrases found in the result set, based on their frequency
there relative to their frequency in the corpus.
It's quite effective and of use in query expansion and highlighting.
This may be of interest to those proposing query.getTerms() changes.
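
As a rough illustration of ranking terms by relative frequency, the sketch below scores each term by its share of occurrences in the result set divided by its share of occurrences in the corpus. The scoring formula, the class name TopTermsSketch, and the method signatures are assumptions made for this example, not the actual getTopTerms() implementation; it also handles single terms only, not phrases.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the packaged getTopTerms() feature.
final class TopTermsSketch {
    // Assumed scoring: (share of result-set occurrences) / (share of corpus occurrences).
    // A value above 1.0 means the term is over-represented in the result set.
    static double significance(long resultFreq, long resultTotal,
                               long corpusFreq, long corpusTotal) {
        double resultShare = (double) resultFreq / resultTotal;
        double corpusShare = (double) corpusFreq / corpusTotal;
        return resultShare / corpusShare;
    }

    // Returns up to n terms from the result set, most significant first.
    static List<String> getTopTerms(Map<String, Long> resultFreqs,
                                    Map<String, Long> corpusFreqs, int n) {
        long resultTotal = sum(resultFreqs);
        long corpusTotal = sum(corpusFreqs);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Long> e : resultFreqs.entrySet()) {
            Long corpusFreq = corpusFreqs.get(e.getKey());
            if (corpusFreq == null || corpusFreq == 0L) continue; // no corpus statistics
            scores.put(e.getKey(),
                    significance(e.getValue(), resultTotal, corpusFreq, corpusTotal));
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }

    private static long sum(Map<String, Long> freqs) {
        long total = 0;
        for (long f : freqs.values()) total += f;
        return total;
    }
}
```

With this scoring, a common word such as "the" ranks low even when it is frequent in the results, because it is just as frequent in the corpus as a whole.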

Cheers
Mark

