lucene-dev mailing list archives

From Karthick Duraisamy Soundararaj <d.s.karth...@gmail.com>
Subject Re: Diversifying Search Results - Custom Collector
Date Wed, 22 Aug 2012 18:21:23 GMT
Yeah, SOLR-1093 itself is a little vague, but the core idea is to run
multiple queries in a single request. The patch is an implementation that runs the
sub-queries serially.

On Wed, Aug 22, 2012 at 10:27 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> SOLR-1093, which is a little bit vague itself, doesn't help with
> implementing my approach, because the second query is built according to the
> results of the first one.
>
> On Wed, Aug 22, 2012 at 3:37 PM, Karthick Duraisamy Soundararaj <
> d.s.karthick@gmail.com> wrote:
>
>> Hey Mikhail,
>> Yes, that's a very good idea and a certain solution for
>> my problem :). But two Solr calls for each search might be a
>> concern. Maybe I should tweak
>> https://issues.apache.org/jira/browse/SOLR-1093 a little bit so it takes
>> the grouping results and boosts them.
>>
>> The other way, I think, is to come up with a new field type with a custom
>> comparator and a new collector.
>>
>> On Wed, Aug 22, 2012 at 2:13 AM, Mikhail Khludnev <
>> mkhludnev@griddynamics.com> wrote:
>>
>>> One more idea:
>>> the first search is grouped by brand with limit 1; it gives you the most
>>> relevant product of each brand for this particular search. Then a second
>>> search boosts the top products from the first search result, e.g.
>>> q=original:query ID:(44,56,78,99,22)^1000
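
A minimal sketch of the two-pass idea above, not code from the thread. It assumes a grouping field named `brand`, a unique key field named `ID`, and the JSON layout Solr returns for grouped results; the function names are hypothetical, and the id clause is written with explicit `OR` rather than the comma list shown in the message.

```python
from typing import Dict, List


def grouped_query_params(user_query: str, group_field: str = "brand") -> Dict[str, str]:
    """First pass: ask Solr for the single best document of each group."""
    return {
        "q": user_query,
        "group": "true",
        "group.field": group_field,
        "group.limit": "1",
        "wt": "json",
    }


def top_ids_from_grouped(response: dict, group_field: str = "brand",
                         id_field: str = "ID") -> List[str]:
    """Collect the id of the top document in every group of a grouped response."""
    groups = response["grouped"][group_field]["groups"]
    return [str(g["doclist"]["docs"][0][id_field])
            for g in groups if g["doclist"]["docs"]]


def boosted_query(user_query: str, ids: List[str], id_field: str = "ID",
                  boost: int = 1000) -> str:
    """Second pass: the original query plus a heavily boosted id clause."""
    if not ids:
        return user_query
    return f"({user_query}) {id_field}:({' OR '.join(ids)})^{boost}"
```

The second query can only be built once the first response has arrived, so the two requests are inherently sequential.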
>>>
>>>
>>>
>>> On Tue, Aug 21, 2012 at 8:04 PM, Karthick Duraisamy Soundararaj <
>>> d.s.karthick@gmail.com> wrote:
>>>
>>>>
>>>> On Tue, Aug 21, 2012 at 11:32 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>>>
>>>>> Sorry then, my approach really disables pagination jumps. You're left
>>>>> with the 'next' button only, or an "infinite-scroll" type of pagination,
>>>>> which may not be what you wanted to do...
>>>>
>>>>  You are right.
>>>>
>>>>
>>>>> Did you try disabling tf/idf and using a random field as a secondary
>>>>> sort? I'm pretty sure it will give you the best results with best efforts.
>>>>>
>>>> I was a little nervous about turning off idf, as I was concerned it might
>>>> affect relevancy. Considering that idf promotes documents containing rare
>>>> words more than ones containing common words, turning off idf might make
>>>> sense in an ecommerce application. What do you think?
>>>>
>>>> But then, I don't get the idea of a random field for the secondary sort. A
>>>> secondary sort applies only where the scores/primary sort values
>>>> are tied, right? So I am not quite sure how it would fit in here.
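
The point about ties can be illustrated with a small sketch, assuming documents are plain `(id, score)` pairs and using a hypothetical helper name. The random value only decides the order of documents whose scores are equal; documents with distinct scores keep their relative order whatever the seed. One plausible reading of the suggestion is that coarser scores (for example with tf/idf disabled) would produce more ties, which is when the random key starts to matter.

```python
import random
from typing import List, Tuple


def sort_with_random_tiebreak(docs: List[Tuple[str, float]], seed: int) -> List[str]:
    """Sort by score descending; a per-document random value breaks ties only."""
    rng = random.Random(seed)
    # The key is (negated score, random value): the random component is
    # consulted only when two negated scores compare equal.
    keyed = [(-score, rng.random(), doc_id) for doc_id, score in docs]
    return [doc_id for _, _, doc_id in sorted(keyed)]
```

For instance, with scores `a=3.0, d=2.0, b=1.0, c=1.0`, `a` and `d` always come first in that order, while `b` and `c` may swap depending on the seed.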
>>>>
>>>>       --
>>>>
>>>>> Tanguy
>>>>>
>>>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>>>
>>>>>> Hello Tanguy,
>>>>>> I need pagination. The problem with your
>>>>>> approach is that, to achieve pagination, you need to sort at the
>>>>>> application level rather than at the Solr level, which I think
>>>>>> would become messy. Do you see a way around this?
>>>>>>
>>>>>> Thanks,
>>>>>> Karthick
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 21, 2012 at 10:33 AM, Tanguy Moal <tanguy.moal@gmail.com> wrote:
>>>>>>
>>>>>>> Hello Karthick,
>>>>>>>
>>>>>>> 2012/8/21 Karthick Duraisamy Soundararaj <d.s.karthick@gmail.com>
>>>>>>>
>>>>>>>> *"Find the highest scoring document for each manufacturer in
>>>>>>>> the current result set and place them ahead of the rest. Here, as you
>>>>>>>> can see, the idea is to display one product from each unique
>>>>>>>> manufacturer first."* The number of unique manufacturers to show
>>>>>>>> before the normal ordering resumes can be determined relative to the
>>>>>>>> total number of unique manufacturers. For example, if there are 90
>>>>>>>> unique manufacturers, display products from 45 (approx. 50%) first
>>>>>>>> before displaying the rest of the products.
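
The requirement above can be sketched as a re-ranking over an already scored result list. This is an illustration under stated assumptions, not the implementation discussed in the thread: documents are `(doc_id, manufacturer, score)` tuples, the function name is hypothetical, and the manufacturers promoted are those whose best document scores highest (the message does not say which 50% to pick).

```python
import math
from typing import Dict, List, Tuple

Doc = Tuple[str, str, float]  # (doc_id, manufacturer, score)


def diversify(docs: List[Doc], fraction: float = 0.5) -> List[Doc]:
    """Put the best document of the top `fraction` of manufacturers first."""
    ranked = sorted(docs, key=lambda d: d[2], reverse=True)

    # The first document seen per manufacturer is that manufacturer's best one.
    best: Dict[str, Doc] = {}
    for doc in ranked:
        best.setdefault(doc[1], doc)

    # Dicts keep insertion order, so `best` is already ordered by score.
    quota = math.ceil(len(best) * fraction)
    promoted = list(best.values())[:quota]

    promoted_ids = {d[0] for d in promoted}
    rest = [d for d in ranked if d[0] not in promoted_ids]
    return promoted + rest
```

With 90 distinct manufacturers and `fraction=0.5`, the first 45 positions each hold a different manufacturer's best product, followed by everything else in score order.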
>>>>>>>>
>>>>>>>
>>>>>>> That's exactly what grouping will do, at least for the first
>>>>>>> sentence. You can ask for many items in each group, display only the
>>>>>>> first, and store the others "somewhere" for later use. When you reach
>>>>>>> your "merchant representation threshold" (say 50% of the total number
>>>>>>> of groups), you can start picking the items you stored "somewhere" to
>>>>>>> display them at randomly chosen positions. That won't help pagination,
>>>>>>> though.
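
One possible reading of the client-side approach described above, as a hedged sketch. It assumes the grouped results are available as an ordered mapping from merchant to that merchant's items (best first), that items of merchants beyond the threshold are also treated as "stored", and that "randomly chosen positions" means a seeded shuffle of the stored items; the function name is hypothetical.

```python
import random
from typing import Dict, List


def interleave_groups(groups: Dict[str, List[str]], threshold: float = 0.5,
                      seed: int = 0) -> List[str]:
    """Show one item per merchant up to the threshold, then the stored items."""
    names = list(groups)
    quota = int(len(names) * threshold)

    # Head: the first item of each merchant, up to the representation threshold.
    head = [groups[n][0] for n in names[:quota] if groups[n]]

    # Stored: the remaining items of shown merchants, plus all other items.
    stored = [item for n in names[:quota] for item in groups[n][1:]]
    stored += [item for n in names[quota:] for item in groups[n]]

    random.Random(seed).shuffle(stored)
    return head + stored
```

Because the full ordering is computed in the application, serving an arbitrary page means rebuilding this list with the same seed and slicing it.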
>>>>>>>
>>>>>>> Could that help you?
>>>>>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>> Tech Lead
>>> Grid Dynamics
>>>
>>> <http://www.griddynamics.com>
>>>  <mkhludnev@griddynamics.com>
