lucene-dev mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: Diversifying Search Results - Custom Collector
Date Wed, 22 Aug 2012 14:27:00 GMT
SOLR-1093, which is itself a little vague, doesn't help with implementing
my approach, because the second query is built according to the results of
the first one.

On Wed, Aug 22, 2012 at 3:37 PM, Karthick Duraisamy Soundararaj <
d.s.karthick@gmail.com> wrote:

> Hey Mikhail,
>                   Yes. That's a very good idea and a sure solution for
> my problem :). But two Solr calls for each search might be a
> concern. Maybe I should tweak
> https://issues.apache.org/jira/browse/SOLR-1093 a little bit so it takes
> the grouping results and boosts them.
>
> The other way, I think, is to come up with a new field type with a custom
> comparator and a new collector.
>
> On Wed, Aug 22, 2012 at 2:13 AM, Mikhail Khludnev <
> mkhludnev@griddynamics.com> wrote:
>
>> One more idea:
>> the first search is grouped by brand with limit 1, which gives you the most
>> relevant product of each brand for this particular search. Then a second
>> search boosts the top products from the first search result, e.g.
>> q=original:query ID:(44,56,78,99,22)^1000
>>
>>
>>
>> On Tue, Aug 21, 2012 at 8:04 PM, Karthick Duraisamy Soundararaj <
>> d.s.karthick@gmail.com> wrote:
>>
>>>
>>> On Tue, Aug 21, 2012 at 11:32 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>>
>>>> Sorry then, my approach really disables pagination jumps. You're left
>>>> with the 'next' button only, or an "infinite-scroll" type of pagination,
>>>> which may not be what you wanted to do...
>>>
>>>  You are right.
>>>
>>>
>>>> Did you try disabling tf/idf and using a random field as a secondary sort?
>>>> I'm pretty sure it will give you the best results with best efforts.
>>>>
>>> I was a little nervous about turning off idf as I was concerned it might
>>> affect relevancy. Considering that idf promotes documents with words that
>>> are rare across the index more than the ones with common words, turning
>>> off idf might make sense in an ecommerce application. What do you think?
>>>
>>> But then, I don't get the idea of a random field for the secondary sort. A
>>> secondary sort is applicable only in cases where the scores/primary sort
>>> values are tied, right? So I am not quite sure how it would fit in here.
>>>
>>>       --
>>>
>>>> Tanguy
>>>>
>>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>>
>>>>> Hello Tanguy,
>>>>>                         I need pagination. The problem with your
>>>>> approach is that, to achieve pagination, you need to do the sorting at
>>>>> the application level rather than at the Solr level, which I think
>>>>> would become messy. Do you see a way around this?
>>>>>
>>>>> Thanks,
>>>>> Karthick
>>>>>
>>>>>
>>>>> On Tue, Aug 21, 2012 at 10:33 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>>>>
>>>>>> Hello Karthick,
>>>>>>
>>>>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>>>>
>>>>>>>  *"Find the highest scoring document for each manufacturer in
>>>>>>> the current result set and place them ahead of the rest. Here, as you
>>>>>>> can see, the idea is to display one product from each unique
>>>>>>> manufacturer first"*. Now, how many unique manufacturers to show before
>>>>>>> the normal ordering can be determined relative to the total number of
>>>>>>> unique manufacturers. For example, if there are 90 unique manufacturers,
>>>>>>> display products from 45 (approx 50%) first before displaying the rest
>>>>>>> of the products.
>>>>>>>
>>>>>>
>>>>>> That's exactly what grouping will do, at least for the first
>>>>>> sentence. You can ask for many items in each group, display only the
>>>>>> first and store the others "somewhere", for later use. When you reach
>>>>>> your "merchant representation threshold" (say 50% of the total number
>>>>>> of groups) then you can start picking the items you stored "somewhere"
>>>>>> to display them at randomly chosen positions. That won't help
>>>>>> pagination, though.
>>>>>>
>>>>>> Could that help you ?
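
The reordering described above can be sketched at the application level as follows. This is only an illustration under stated assumptions: the input is the full result list already sorted by descending score, each document is a dict with a `manufacturer` key, the manufacturers promoted are those whose best document scores highest, and "approx 50%" is rounded down. Unlike the random placement suggested above, this sketch keeps the remaining documents in their original score order. The function and key names are hypothetical.

```python
def diversify(ranked_docs, key="manufacturer", fraction=0.5):
    """Place the best document of the leading manufacturers ahead of the rest.

    ranked_docs must already be sorted by descending score.
    """
    best_by_key = {}
    for doc in ranked_docs:
        # The first document seen for a manufacturer is its highest scoring one.
        best_by_key.setdefault(doc[key], doc)
    # Dicts keep insertion order, so leaders are ordered by their score.
    leaders = list(best_by_key.values())
    n_lead = int(len(leaders) * fraction)  # e.g. 90 manufacturers -> 45
    head = leaders[:n_lead]
    head_ids = {id(d) for d in head}
    # Everything not promoted keeps its normal ordering.
    tail = [d for d in ranked_docs if id(d) not in head_ids]
    return head + tail
```

Since the function needs the whole result list in memory before it can reorder anything, it runs into the same pagination limitation noted above.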
>>>>>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Tech Lead
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  <mkhludnev@griddynamics.com>
>>
>>
>
>
>
>


-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <mkhludnev@griddynamics.com>
