lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: lucene explanation
Date Mon, 22 Dec 2008 22:00:51 GMT
Warning! I'm really reaching on this....

But it seems you could use TermDocs/TermEnum to
good effect here. Basically, for a given term, you
should be able to use them to determine pretty efficiently
whether doc N had a hit in one of your fields.
There's even a WildcardTermEnum that will iterate
over the terms matching a wildcard.
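
To make the idea concrete, here is a minimal sketch that uses plain java.util
collections as a stand-in for one field of the index. It is not Lucene's API:
the class PostingLookupSketch and its methods are hypothetical names, and the
sorted map only plays the roles that TermEnum (an ordered walk over terms)
and TermDocs (the doc ids for one term) play in Lucene, roughly what
IndexReader.terms() and IndexReader.termDocs() hand back. Only a trailing
wildcard such as "jav*" is modelled.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of a single field: term -> sorted array of doc ids.
// Assumption: doc ids are added in ascending order, as in a posting list.
class PostingLookupSketch {
    private final TreeMap<String, int[]> postings = new TreeMap<String, int[]>();

    void add(String term, int... sortedDocIds) {
        postings.put(term, sortedDocIds);
    }

    // Exact term: is docId in that term's posting list?
    boolean docHasTerm(String term, int docId) {
        int[] docs = postings.get(term);
        return docs != null && Arrays.binarySearch(docs, docId) >= 0;
    }

    // Trailing wildcard ("jav*"): visit only the terms sharing the prefix
    // and stop at the first posting list that contains docId.
    boolean docHasPrefix(String prefix, int docId) {
        SortedMap<String, int[]> tail = postings.tailMap(prefix);
        for (Map.Entry<String, int[]> e : tail.entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // terms are sorted, so the prefix range has ended
            }
            if (Arrays.binarySearch(e.getValue(), docId) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

With one such structure per field (resume_text, profile_text, summary_text),
each of the top 50 hits costs a handful of lookups per field, and no query
has to be re-run per document.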

Filters are surprisingly fast to construct, so you could
use the above to construct a filter on each term for
each field. Then determining whether the doc is a hit
for a particular field is just a matter of seeing if
that bit is on in the relevant filter.
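
A rough sketch of the bit-per-document idea, again with java.util only and
not Lucene's Filter class: FieldFilterSketch, addMatches and matched are
hypothetical names, and the doc id arguments stand in for whatever the term
enumeration produced for each field.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// One bit set per field, built once per query.
// Bit N set means doc N matched the query terms in that field.
class FieldFilterSketch {
    private final Map<String, BitSet> bitsByField = new LinkedHashMap<String, BitSet>();

    // Record that the given docs matched in this field.
    void addMatches(String field, int... docIds) {
        BitSet bits = bitsByField.get(field);
        if (bits == null) {
            bits = new BitSet();
            bitsByField.put(field, bits);
        }
        for (int id : docIds) {
            bits.set(id);
        }
    }

    // Per-hit check: is the doc's bit on in this field's bit set?
    boolean matched(String field, int docId) {
        BitSet bits = bitsByField.get(field);
        return bits != null && bits.get(docId);
    }
}
```

The bit sets are built once per search; after that, deciding which of the
three fields matched for each of the 50 hits is three bit tests per hit.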

Either one should be waaaay under 30 seconds,
although I don't know how big your index is
or how encompassing your wildcard searches are.


On Mon, Dec 22, 2008 at 4:48 PM, Chris Salem <> wrote:

> Hello,
> I'm wondering what the best way to accomplish this is.
> When a user enters text to search on, it customarily searches 3 fields,
> resume_text, profile_text, and summary_text, so a standard query would be
> something like:
> (resume_text:(query) OR profile_text:(query) OR summary_text:(query))
> For each hit (up to 50) I'd like to find out which part of the query
> matched the document.  Right now I use the Explanation object; here's
> the code:
> int len = hits.length();
> if(len > 50) len = 50;
> for(int i=0; i<len; i++){
> Explanation ex = searcher.explain(Query.parse("resume_text:(query)"),
> if(ex.isMatch()) ...
> ex = searcher.explain(Query.parse("profile_text:(query)"),;
> if(ex.isMatch()) ...
> ex = searcher.explain(Query.parse("summary_text:(query)"),;
> if(ex.isMatch()) ...
> }
> This works fine with regular queries, but if someone does a query with a
> wildcard, search times increase to more than 30 seconds.  Is there a better
> way to do this?
> Thanks
> Sincerely,
> Chris Salem
