lucene-solr-user mailing list archives

From Matthew Runo <mr...@zappos.com>
Subject Re: Optimizing & Improving results based on user feedback
Date Thu, 29 Jan 2009 23:43:30 GMT
Agreed, it seems that a lot of the algorithms in these papers would
almost need a whole new RequestHandler, à la Dismax. Luckily a lot of
them seem to be built on Lucene (at least the ones that I looked at
that had code samples).

Which papers did you see that actually talked about using clicks? I  
don't see those, beyond "Addressing Malicious Noise in Clickthrough  
Data" by Filip Radlinski and also his "Query Chains: Learning to Rank  
from Implicit Feedback" - but neither is really on topic.

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Jan 29, 2009, at 11:36 AM, Walter Underwood wrote:

> Thanks, I didn't know there was so much research in this area.
> Most of the papers at those workshops are about tuning the
> entire ranking algorithm with machine learning techniques.
>
> I am interested in adding one more feature, click data, to an
> existing ranking algorithm. In my case, I have enough data to
> use query-specific boosts instead of global document boosts.
> We get about 2M search clicks per day from logged in users
> (little or no click spam).
>
> I'm checking out some papers from Thorsten Joachims and from
> Microsoft Research that are specifically about clickthrough
> feedback.
>
> wunder
>
> On 1/27/09 11:15 PM, "Neal Richter" <nrichter@gmail.com> wrote:
>
>> OK I've implemented this before, written academic papers and patents
>> related to this task.
>>
>> Here are some hints:
>>   - you're on the right track with the editorial boosting elevators
>>   - http://wiki.apache.org/solr/UserTagDesign
>>   - be darn careful about assuming that one click is enough evidence
>>     to boost a long 'distance'
>>   - first page effects in search will skew the learning badly if you
>>     don't compensate. 95% of users never go past the first page of
>>     results, 1% go past the second page. So perfectly good results
>>     on the second page get permanently locked out
>>   - consider forgetting what you learn under some condition
>>
>> In fact this whole area is called 'learning to rank' and is a hot
>> research topic in IR.
>> http://web.mit.edu/shivani/www/Ranking-NIPS-05/
>> http://research.microsoft.com/en-us/um/people/lr4ir-2007/
>> https://research.microsoft.com/en-us/um/people/lr4ir-2008/
>>
>> - Neal Richter
>>
>>
>> On Tue, Jan 27, 2009 at 2:06 PM, Matthew Runo <mruno@zappos.com> wrote:
>>> Hello folks!
>>>
>>> We've been thinking about ways to improve organic search results for a
>>> while (really, who hasn't?) and I'd like to get some ideas on ways to
>>> implement a feedback system that uses user behavior as input. Basically,
>>> it'd work on the premise that what the user actually clicked on is
>>> probably a really good match for their search, and should be boosted up
>>> in the results for that search.
>>>
>>> For example, if I search for "rain boots", and really love the 10th
>>> result down (and show it by clicking on it), then we'd like to capture
>>> this and use the data to boost up that result //for that search//. We've
>>> thought about using index time boosts for the documents, but that'd
>>> boost it regardless of the search terms, which isn't what we want. We've
>>> thought about using the Elevator handler, but we don't really want to
>>> force a product to the top - we'd prefer it slowly rises over time as
>>> more and more people click it from the same search terms. Another way
>>> might be to stuff the keyword into the document, the more times it's in
>>> the document the higher it'd score - but there's gotta be a better way
>>> than that.
>>>
>>> Obviously this can't be done 100% in solr - but if anyone had some
>>> clever ideas about how this might be possible it'd be interesting to
>>> hear them.
>>>
>>> Thanks for your time!
>>>
>>> Matthew Runo
>>> Software Engineer, Zappos.com
>>> mruno@zappos.com - 702-943-7833
>>>
>>>
>
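
The ideas discussed in this thread (query-specific boosts from click
data, compensating for first-page effects, requiring more than one click
as evidence, and forgetting old data) could be combined outside Solr
roughly as in the sketch below. It is an illustration under stated
assumptions, not anything proposed by the posters: the log-based position
weight, the 30-day half-life, the evidence threshold, the boost cap, and
the function names are all hypothetical, and the unique key field is
assumed to be called `id`. The final helper renders the result as a
string that could be passed in the dismax `bq` parameter at query time.

```python
import math
from collections import defaultdict

# Hypothetical tuning values; none of these come from the thread.
HALF_LIFE_DAYS = 30.0      # how quickly old clicks are "forgotten"
MIN_WEIGHTED_CLICKS = 5.0  # evidence required before any boost is applied
MAX_BOOST = 3.0            # cap so click data cannot dominate text relevance


def position_weight(rank):
    """Weight a click more when it happened far down the result list.

    Assumed examination model: the chance that a user looks at rank r
    falls off as 1 / log2(r + 1), so a click is weighted by the inverse.
    """
    return math.log2(rank + 1)


def decay(age_days):
    """Exponential forgetting: a click loses half its weight per half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def query_boosts(clicks):
    """Turn (query, doc_id, rank, age_days) click events into per-query boosts."""
    evidence = defaultdict(float)
    for query, doc_id, rank, age_days in clicks:
        key = (query.strip().lower(), doc_id)
        evidence[key] += position_weight(rank) * decay(age_days)

    boosts = defaultdict(dict)
    for (query, doc_id), weight in evidence.items():
        if weight < MIN_WEIGHTED_CLICKS:
            continue  # one or two clicks are not enough evidence
        # Logarithmic growth, so a document rises slowly as clicks accumulate.
        boosts[query][doc_id] = min(MAX_BOOST, 1.0 + math.log10(weight))
    return dict(boosts)


def to_bq(doc_boosts):
    """Render boosts as a boost-query string, e.g. 'id:A^1.50 id:B^2.00'.

    Assumes doc ids contain no characters that need query-syntax escaping.
    """
    return " ".join(f"id:{doc}^{boost:.2f}" for doc, boost in sorted(doc_boosts.items()))


if __name__ == "__main__":
    log = [("rain boots", "sku10", 10, 0)] * 3 + [("rain boots", "sku1", 1, 0)]
    for query, doc_boosts in query_boosts(log).items():
        print(query, "->", to_bq(doc_boosts))
```

Because the boosts are keyed by the normalized query string, a document
is only lifted for the searches it was clicked from, which is the
behaviour the original post asks for; a global index-time boost would
not give that.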

