lucene-java-user mailing list archives

From Michael Garski <mgar...@myspace.com>
Subject RE: Detecting why a collection of documents matched a query
Date Tue, 14 Oct 2008 01:16:04 GMT
I've seen this question come up a few times on the list in the past,
with these potential solutions:

1. Parsing the output of the Explain() method
2. Running a regex over the data post-search to determine which field
contained the match
3. Searching each field independently and removing duplicates
post-search

We've been experimenting with performing multiple searches: one search
against all desired fields to find the set of documents that match, then
a search against each field independently to determine where the match
took place.  The solution works, but it costs one additional search per
field, so performance degrades as the number of fields in the original
query grows.
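
As a rough illustration of the two-pass approach described above, here
is a minimal sketch in plain Java. It is not Lucene code: the class
PerFieldSearchSketch, its methods, and the sample data are all
hypothetical, a "document" is just a map of field name to text, and a
"search" is a case-insensitive substring test standing in for a real
query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the two-pass approach; this is NOT Lucene API.
// All names here are hypothetical and exist only for illustration.
class PerFieldSearchSketch {

    // Toy matcher: case-insensitive substring test on one field's text.
    static boolean matches(Map<String, String> doc, String field, String term) {
        String text = doc.get(field);
        return text != null && text.toLowerCase().contains(term.toLowerCase());
    }

    // Stand-in for a single-field search: ids of documents matching in that field.
    static List<Integer> searchField(List<Map<String, String>> docs, String field, String term) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (matches(docs.get(id), field, term)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Stand-in for the combined search: ids of documents matching in any field.
    static List<Integer> searchAllFields(List<Map<String, String>> docs, List<String> fields, String term) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String field : fields) {
                if (matches(docs.get(id), field, term)) {
                    hits.add(id);
                    break; // one matching field is enough to count as a hit
                }
            }
        }
        return hits;
    }

    // Pass 1 finds the hits; pass 2 runs one extra search per field and
    // records, for each hit, which fields it matched on.
    static Map<Integer, List<String>> matchedFields(
            List<Map<String, String>> docs, List<String> fields, String term) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (int id : searchAllFields(docs, fields, term)) {
            result.put(id, new ArrayList<>());
        }
        for (String field : fields) { // N fields means N additional searches
            for (int id : searchField(docs, field, term)) {
                List<String> matched = result.get(id);
                if (matched != null) {
                    matched.add(field);
                }
            }
        }
        return result;
    }

    // Hypothetical sample data: three documents with "title" and "body" fields.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        String[][] rows = {
            {"Lucene in Action", "a book about search"},
            {"Cooking basics", "Lucene is not a recipe"},
            {"Gardening", "plants and soil"},
        };
        for (String[] row : rows) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("title", row[0]);
            doc.put("body", row[1]);
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> fields = new ArrayList<>();
        fields.add("title");
        fields.add("body");
        System.out.println(matchedFields(sampleDocs(), fields, "lucene"));
    }
}
```

The loop over fields in matchedFields is where the cost shows up: every
field added to the original query adds one more full search.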

It would be nice to have some sort of search option and an overloaded
collector that would collect each hit along with the list of fields the
hit matched on, but that functionality is not present at this time.  I'm
curious whether anyone has given this area any thought or started work
on it, as we are about to begin determining what such a change would
involve and are comfortable making the necessary changes to add this
feature (and, by extension, submitting them to JIRA).
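
To make the idea concrete, the following is a minimal sketch of what
such a collector might look like, written in plain Java. Nothing in it
is Lucene API: the FieldMatchCollector interface, the
FieldMatchCollectorSketch class, and the sample data are hypothetical,
and matching is a toy case-insensitive substring test rather than real
query evaluation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical collector shape; none of these names exist in Lucene.
interface FieldMatchCollector {
    // Called once per hit, with the fields that hit matched on.
    void collect(int docId, List<String> matchedFields);
}

class FieldMatchCollectorSketch {

    // Single pass: each document is visited once and every queried field is
    // checked during that visit, so no follow-up per-field searches are needed.
    static void search(List<Map<String, String>> docs, List<String> fields,
                       String term, FieldMatchCollector collector) {
        String needle = term.toLowerCase();
        for (int id = 0; id < docs.size(); id++) {
            List<String> matched = new ArrayList<>();
            for (String field : fields) {
                String text = docs.get(id).get(field);
                if (text != null && text.toLowerCase().contains(needle)) {
                    matched.add(field);
                }
            }
            if (!matched.isEmpty()) {
                collector.collect(id, matched);
            }
        }
    }

    // Hypothetical sample data: three documents with "title" and "body" fields.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        String[][] rows = {
            {"Lucene in Action", "covers Lucene indexing"},
            {"Cooking basics", "Lucene is not a recipe"},
            {"Gardening", "plants and soil"},
        };
        for (String[] row : rows) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("title", row[0]);
            doc.put("body", row[1]);
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> fields = new ArrayList<>();
        fields.add("title");
        fields.add("body");
        // The collector here simply prints each hit with its matched fields.
        search(sampleDocs(), fields, "lucene",
               (docId, matched) -> System.out.println(docId + " matched on " + matched));
    }
}
```

The point of this shape is that the caller receives the hit and its
matched fields together, instead of reconstructing the field list from
separate searches afterwards.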

Thanks,

Michael

-----Original Message-----
From: Khawaja Shams [mailto:ksshams@gmail.com] 
Sent: Sunday, October 12, 2008 2:50 PM
To: java-user@lucene.apache.org
Subject: Detecting why a collection of documents matched a query

Hello, I noticed that the indexSearcher.explain() method is not supposed
to be run for a large collection of documents, so I am looking for an
alternative that just explains why a document matched, without all the
scoring information. Basically, I would like to know which field of the
document was responsible for getting it included in the results, so I
can give users some indication of what matched. We present the results
100 documents at a time. I would appreciate any ideas or directions
towards implementation.


Thanks!



