lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Libbrecht <>
Subject Re: document diversity
Date Tue, 06 Oct 2009 20:59:56 GMT
Just as you can add a query that will boost better things with a  
higher quality, you can add a query for a higher revenue.

Basically, the default operator "should" in boolean-clauses can be  
used exactly for that: do not force this query to be matched but raise  
boost if there's something that matches.

The translation of the user-query, itself, is marked as "must" (and  
inside this one is orred in all the various flavours, e.g. per  
language, phonetic...).


Le 06-oct.-09 à 22:33, Michael Masters a écrit :

> My initial description may have been a little abstract. Maybe I should
> explain exactly what I'm trying to do. My company has various revenue
> channels, one of which is per click. If a user does a search, we would
> like to show results with the greatest revenue, although we don't want
> people to be able to buy all the top results. Hence, we would like to
> have some way of mixing results. The mixing of results could be based
> of potential revenue, relevancy, which revenue stream the result is
> associated with, etc.
> The previously mentioned ideas are great btw.
> -Mike
> On Sat, Oct 3, 2009 at 4:25 PM, Grant Ingersoll  
> <> wrote:
>> I'm curious, can you elaborate more on the deeper use case for this?
>> Perhaps just implementing faceting on doc type would be  
>> sufficient?  That
>> way users can drill in on doc type.  Alternatively, I suppose you  
>> could
>> implement a hit collector that accesses a field cache on the doc  
>> type field
>> and promotes lesser seen doc types until they are evenly  
>> represented.  Could
>> also likely write a Function query that does a similar thing.  I'd  
>> imagine
>> you need to be careful to control your memory.
>> -Grant
>> On Oct 1, 2009, at 12:56 PM, Michael Masters wrote:
>>> I was wondering if there is any way to control what kind of  
>>> documents
>>> are returned from a search. For example, lets say we have an index
>>> built from different types of documents (pdf, txt, html, etc.). Is
>>> there a way to have the first x results have a specified  
>>> distribution
>>> of document types? It would be nice to have an even number of  
>>> results
>>> that are from pdfs, txt files, and html files.
>>> Any help would greatly be appreciated.
>>> -Mike
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail:
>>> For additional commands, e-mail:
>> --------------------------
>> Grant Ingersoll
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
>> using
>> Solr/Lucene:
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

View raw message