lucene-dev mailing list archives

From Karthick Duraisamy Soundararaj <d.s.karth...@gmail.com>
Subject Re: Diversifying Search Results - Custom Collector
Date Wed, 22 Aug 2012 11:37:59 GMT
Hey Mikhail,
Yes, that's a very good idea and would certainly solve my problem :). But
making two Solr calls for every search might be a concern. Maybe I should
tweak https://issues.apache.org/jira/browse/SOLR-1093 a little so that it
takes the grouping results and boosts them.
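
As a rough sketch of the two-pass idea Mikhail describes in the quoted
message below: the first (grouped) response is reduced to one ID per brand,
and those IDs are appended to the original query with a large boost. The
helper names are hypothetical, the field names `brand` and `ID` are taken
from the thread, the response layout assumes Solr's default grouped JSON
format, and the IDs are joined with spaces on the assumption that the
default operator is OR. No Solr call is made here; this only builds the
query string.

```python
def top_ids_per_group(grouped_response, group_field="brand", id_field="ID"):
    """Collect the ID of the first (best) document of every group.

    Assumes the default grouped layout:
    response["grouped"][field]["groups"][i]["doclist"]["docs"].
    """
    groups = grouped_response["grouped"][group_field]["groups"]
    ids = []
    for group in groups:
        docs = group["doclist"]["docs"]
        if docs:  # skip empty groups
            ids.append(str(docs[0][id_field]))
    return ids


def boosted_query(original_query, ids, id_field="ID", boost=1000):
    """Append a boosted ID clause to the original query (hypothetical helper)."""
    if not ids:
        return original_query
    return "({}) {}:({})^{}".format(original_query, id_field, " ".join(ids), boost)
```

The second search would then be sent with `q=boosted_query(...)`; it is
still a second round trip, which is the cost mentioned above.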

The other way, I think, is to come up with a new field type with a custom
comparator and a new collector.
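
Whichever route is taken, the ordering I am after (from my original mail,
quoted at the bottom) can be sketched in plain Python as below. This is
only an illustration of the rule, not a Lucene collector: the function
name, the `manufacturer` key and the 50% default are assumptions for the
example, and it expects the full result list already sorted by descending
score.

```python
import math


def diversify(ranked_docs, key="manufacturer", fraction=0.5):
    """Move the best document of some manufacturers ahead of the rest.

    ranked_docs: list of dicts, already sorted by descending score.
    fraction:    share of unique manufacturers to promote (0.5 = 50%).
    """
    # Index of the first (highest-scoring) document per manufacturer.
    best_index = {}
    for i, doc in enumerate(ranked_docs):
        best_index.setdefault(doc[key], i)

    # E.g. 90 unique manufacturers and fraction 0.5 promote 45 documents.
    quota = math.ceil(len(best_index) * fraction)

    # Promote the manufacturers whose best document ranks highest.
    promoted = sorted(best_index.values())[:quota]
    promoted_set = set(promoted)

    head = [ranked_docs[i] for i in promoted]
    tail = [d for i, d in enumerate(ranked_docs) if i not in promoted_set]
    return head + tail
```

Pagination would have to slice the list returned by `diversify`, which is
why doing this outside Solr needs the whole result set.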

On Wed, Aug 22, 2012 at 2:13 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> One more idea:
> the first search is grouped by brand with limit 1, which gives you the
> most relevant product of each brand for this particular search. Then a
> second search boosts the top products from the first search result, e.g.
> q=original:query ID:(44,56,78,99,22)^1000
>
>
>
> On Tue, Aug 21, 2012 at 8:04 PM, Karthick Duraisamy Soundararaj <
> d.s.karthick@gmail.com> wrote:
>
>>
>> On Tue, Aug 21, 2012 at 11:32 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>
>>> Sorry then, my approach really disables pagination jumps. You're left
>>> with the 'next' button only, or an "infinite-scroll" type of pagination,
>>> which may not be what you wanted to do...
>>
>>  You are right.
>>
>>
>>> Did you try disabling tf/idf and using a random field as a secondary
>>> sort? I'm pretty sure it will give you the best results with best efforts.
>>>
>> I was a little nervous about turning off idf, as I was concerned it might
>> affect relevancy. Considering that idf promotes documents containing terms
>> that are rare across the index over documents containing common ones,
>> turning off idf might make sense in an ecommerce application. What do you
>> think?
>>
>> But then, I don't get the idea of a random field for the secondary sort.
>> A secondary sort applies only when the scores/primary sort values are
>> tied, right? So I am not quite sure how it would fit in here.
>>
>>       --
>>
>>> Tanguy
>>>
>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>
>>>> Hello Tanguy,
>>>> I need pagination. The problem with your approach is that, to achieve
>>>> pagination, you need to do the sorting at the application level rather
>>>> than at the Solr level, which I think would become messy. Do you see a
>>>> way around this?
>>>>
>>>> Thanks,
>>>> Karthick
>>>>
>>>>
>>>> On Tue, Aug 21, 2012 at 10:33 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>>>
>>>>> Hello Karthick,
>>>>>
>>>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>>>
>>>>>>  *"Find the highest-scoring document for each manufacturer in
>>>>>> the current result set and place them ahead of the rest. Here, as you
>>>>>> can see, the idea is to display one product from each unique
>>>>>> manufacturer first"*. The number of unique manufacturers to show
>>>>>> before the normal ordering resumes can be determined relative to the
>>>>>> total number of unique manufacturers. For example, if there are 90
>>>>>> unique manufacturers, display products from 45 (approx. 50%) first
>>>>>> before displaying the rest of the products.
>>>>>>
>>>>>
>>>>> That's exactly what grouping will do. At least for the first sentence.
>>>>> You can ask for many items in each group, display only the first and
>>>>> store the others "somewhere", for later use. When you reach your
>>>>> "merchant representation threshold" (say 50% of total number of groups)
>>>>> then you can start picking the items you stored "somewhere" to display
>>>>> them at randomly chosen positions. That won't help pagination, though.
>>>>>
>>>>> Could that help you?
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Tech Lead
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  <mkhludnev@griddynamics.com>
>
>
