lucene-dev mailing list archives

From Karthick Duraisamy Soundararaj <>
Subject Re: Diversifying Search Results - Custom Collector
Date Tue, 21 Aug 2012 16:04:23 GMT
On Tue, Aug 21, 2012 at 11:32 AM, Tanguy Moal <> wrote:

> Sorry then, my approach really disables pagination jumps. You're left with
> the 'next' button only, or an "infinite-scroll" type of pagination, which
> may not be what you wanted to do...

You are right.

> Did you try disabling tf/idf and using a random field as a secondary sort?
> I'm pretty sure it will give you the best results with best efforts.
I was a little nervous about turning off idf, as I was concerned it might
affect relevancy. Considering that idf promotes documents containing words
that are rare across the index over documents containing common ones,
turning off idf might make sense in an ecommerce application. What do you
think?
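
For reference, here is a minimal sketch of the idf effect in question. It
assumes the classic Lucene TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1));
the index size and document frequencies are made-up numbers, not figures from
this thread.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """idf as in Lucene's classic TF-IDF similarity (assumed here)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


NUM_DOCS = 100_000  # hypothetical index size

# A term found in very few documents gets a much larger idf weight
# than a term found in half of the index.
rare_idf = classic_idf(doc_freq=9, num_docs=NUM_DOCS)
common_idf = classic_idf(doc_freq=49_999, num_docs=NUM_DOCS)

print(f"rare term idf:   {rare_idf:.3f}")
print(f"common term idf: {common_idf:.3f}")
```

With idf disabled, both terms would contribute equally per occurrence, which
is the trade-off being weighed above.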

But then, I don't get the idea of using a random field for the secondary
sort. A secondary sort only applies where the scores/primary sort values
are tied, right? So I am not quite sure how it would fit in here.
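
To make the tie-breaking point concrete, here is a small sketch with
hypothetical documents and scores. It only mimics the behaviour of a
"score desc, random asc" sort in plain Python; it is not Solr code, and the
function name is made up for illustration.

```python
import random

# Hypothetical (doc_id, score) pairs; "b" and "c" have tied scores.
DOCS = [("a", 3.2), ("b", 1.5), ("c", 1.5), ("d", 0.7)]


def sort_with_random_tiebreak(docs, seed):
    """Sort by score descending, then by a per-document random value."""
    rng = random.Random(seed)
    keyed = [(doc_id, score, rng.random()) for doc_id, score in docs]
    # The random value is only consulted when two scores are equal.
    keyed.sort(key=lambda d: (-d[1], d[2]))
    return [doc_id for doc_id, _, _ in keyed]


for seed in range(3):
    print(seed, sort_with_random_tiebreak(DOCS, seed))
```

Whatever the seed, "a" stays first and "d" stays last; only the tied pair
"b"/"c" can swap. The random field would therefore only matter to the extent
that scores actually tie.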


> Tanguy
> 2012/8/21 Karthick Duraisamy Soundararaj <>
>> Hello Tanguy,
>> I need pagination. The problem with your approach is that, to achieve
>> pagination, you need to sort at the application level rather than at the
>> Solr level, which I think would become messy.
>> Do you see a way around this?
>> Thanks,
>> Karthick
>> On Tue, Aug 21, 2012 at 10:33 AM, Tanguy Moal <> wrote:
>>> Hello Karthick,
>>> 2012/8/21 Karthick Duraisamy Soundararaj <>
>>>>  *"Find the highest scoring document for each manufacturer in the
>>>> current result set and place them ahead of the rest. As you can see,
>>>> the idea is to display one product from each unique manufacturer first"
>>>> *. The number of unique manufacturers to show before the normal
>>>> ordering resumes can be determined relative to the total number of unique
>>>> manufacturers. For example, if there are 90 unique manufacturers,
>>>> display products from 45 (approx. 50%) first before displaying the rest of
>>>> the products.
>>> That's exactly what grouping will do. At least for the first sentence.
>>> You can ask for many items in each group, display only the first and store
>>> the others "somewhere", for later use. When you reach your "merchant
>>> representation threshold" (say 50% of total number of groups) then you can
>>> start picking the items you stored "somewhere" to display them at randomly
>>> chosen positions. That won't help pagination, though.
>>> Could that help you?
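
The reordering described in the quoted text can be sketched at the
application level as follows. This is only an illustration under stated
assumptions: the input is the full result list already sorted by score, the
tuples and the `diversify` name are hypothetical, and the leftover documents
keep their normal order instead of being placed at random positions.

```python
def diversify(ranked, fraction=0.5):
    """Move the top document of the leading manufacturers to the front.

    ranked: list of (doc_id, manufacturer, score), sorted by score descending.
    fraction: share of unique manufacturers that get a leading slot.
    """
    best_per_manufacturer = {}
    for doc in ranked:
        # First document seen for a manufacturer is its highest scoring one.
        best_per_manufacturer.setdefault(doc[1], doc)

    n_lead = int(len(best_per_manufacturer) * fraction)
    # Dicts keep insertion order, so manufacturers are in score order here.
    lead = list(best_per_manufacturer.values())[:n_lead]
    lead_ids = {doc[0] for doc in lead}

    # Everything else follows in its original (score) order.
    rest = [doc for doc in ranked if doc[0] not in lead_ids]
    return lead + rest


# Hypothetical result set with four unique manufacturers.
RANKED = [
    ("p1", "acme", 9.0),
    ("p2", "acme", 8.0),
    ("p3", "bolt", 7.0),
    ("p4", "acme", 6.0),
    ("p5", "core", 5.0),
    ("p6", "dyna", 4.0),
]

reordered = diversify(RANKED, fraction=0.5)
page_size = 3
first_page = reordered[:page_size]
print([doc[0] for doc in reordered])
```

Because the order is deterministic, a page is just a slice of the reordered
list, but the whole result set has to be fetched and reordered outside Solr,
which is the application-level sorting concern raised earlier in the thread.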
