lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Giulio Cesare Solaroli <giulio.ces...@gmail.com>
Subject Re: How to extract matching terms for a document given a query
Date Wed, 16 Jun 2004 21:50:50 GMT
On Wed, 16 Jun 2004 22:31:07 +0100, markharw00d@yahoo.co.uk
<markharw00d@yahoo.co.uk> wrote:
> 
> Yes, highlighting multi-term queries does require a query.rewrite() call to expand those
terms before
> calling the highlighter.
> BUT, you could load the results documents into a temporary RAMDirectory and expand the
query by rewriting it
> against THAT instead of the original index - it would still produce the term expansions
you need.

That's a nice idea. The rewritten query would be much smaller when
expanded against an index containing only the selected documents.

> The trouble is that would require one pass of the analyzer over the full documents' text.
> You would have to tokenize the text again when you called the highlighter unless you
wrote code to cache the
> results of the token stream created while indexing the results into your RamDirectory.
This pre-tokenized stream
> could then be passed to the highlighter.

If I got this right, these steps should be preformed on the frontend
application, right?
It could make sense, if my frontend application was written in Java,
but this is not the case. :-[

So I need to process as much information as possible on the search
application (that does not have access to the document full text) to
allow the frontend application to highlight the relevant chunk of the
selected documents.

I think my question could be restated as follow: given a fully
expanded query, which is the best way to find out the terms that match
any given document, without having access to the whole document
content?

[...]

Ciao,

Giulio Cesare Solaroli

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message