lucene-java-user mailing list archives

From Tricia Williams <>
Subject Re: document diversity
Date Thu, 01 Oct 2009 17:17:33 GMT
Hi Mike,

The first thing that comes to mind is to run one query per document 
type (assuming you have a field that stores the type), restricting each 
query to that type, for example type:pdf.  You would then have to write 
something to combine the results, drawing an equal number of hits from 
each query.  I think the scores would still give you a usable relevance 
order across the hits from all document types, if you wanted to 
maintain one, but my logic could be faulty there.
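
A rough sketch of the combining step, in plain Java with no Lucene calls. 
It assumes each per-type query has already been run and its hits copied, 
best first, into a list. The Hit class and the takeEvenly method are 
hypothetical names made up for illustration, and sorting the merged hits 
by score rests on the same untested assumption that scores from the 
separate queries are comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DiverseMerge {

    /** Hypothetical stand-in for a search hit: document id, type, and score. */
    static final class Hit {
        final String id;
        final String type;
        final float score;

        Hit(String id, String type, float score) {
            this.id = id;
            this.type = type;
            this.score = score;
        }
    }

    /**
     * Takes at most perType hits from each per-type result list (each list
     * is assumed to be sorted best first), then orders the combined hits by
     * descending score. A type with fewer hits simply contributes fewer.
     */
    static List<Hit> takeEvenly(Map<String, List<Hit>> hitsByType, int perType) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> hits : hitsByType.values()) {
            merged.addAll(hits.subList(0, Math.min(perType, hits.size())));
        }
        // Assumption: scores from the separate queries can be compared directly.
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return merged;
    }

    public static void main(String[] args) {
        // Made-up hits standing in for the results of type:pdf and type:txt queries.
        Map<String, List<Hit>> hitsByType = new LinkedHashMap<>();
        hitsByType.put("pdf", Arrays.asList(
                new Hit("a.pdf", "pdf", 0.9f),
                new Hit("b.pdf", "pdf", 0.5f),
                new Hit("c.pdf", "pdf", 0.4f)));
        hitsByType.put("txt", Arrays.asList(
                new Hit("d.txt", "txt", 0.7f)));

        for (Hit hit : takeEvenly(hitsByType, 2)) {
            System.out.println(hit.id + " " + hit.type + " " + hit.score);
        }
    }
}
```

If relevance order across types does not matter, the sort can be dropped 
and the hits interleaved type by type instead.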


Michael Masters wrote:
> I was wondering if there is any way to control what kind of documents
> are returned from a search. For example, let's say we have an index
> built from different types of documents (pdf, txt, html, etc.). Is
> there a way to have the first x results have a specified distribution
> of document types? It would be nice to have an even number of results
> that are from pdfs, txt files, and html files.
> Any help would be greatly appreciated.
> -Mike