lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: document diversity
Date Sat, 03 Oct 2009 22:25:23 GMT
I'm curious, can you elaborate more on the deeper use case for this?

Perhaps just implementing faceting on doc type would be sufficient?   
That way users can drill in on doc type.  Alternatively, I suppose you  
could implement a hit collector that accesses a field cache on the doc  
type field and promotes lesser seen doc types until they are evenly  
represented.  Could also likely write a Function query that does a  
similar thing.  I'd imagine you need to be careful to control your  
memory.

-Grant

On Oct 1, 2009, at 12:56 PM, Michael Masters wrote:

> I was wondering if there is any way to control what kind of documents
> are returned from a search. For example, lets say we have an index
> built from different types of documents (pdf, txt, html, etc.). Is
> there a way to have the first x results have a specified distribution
> of document types? It would be nice to have an even number of results
> that are from pdfs, txt files, and html files.
>
>
> Any help would greatly be appreciated.
>
>
> -Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message