lucene-solr-user mailing list archives

From Matthew Runo <>
Subject Re: Optimizing & Improving results based on user feedback
Date Thu, 29 Jan 2009 23:43:30 GMT
Agreed, it seems that a lot of the algorithms in these papers would
almost be a whole new RequestHandler à la DisMax. Luckily a lot of them
seem to be built on Lucene (at least the ones that I looked at that
had code samples).

Which papers did you see that actually talked about using clicks? I  
don't see those, beyond "Addressing Malicious Noise in Clickthrough  
Data" by Filip Radlinski and also his "Query Chains: Learning to Rank  
from Implicit Feedback" - but neither is really on topic.

Thanks for your time!

Matthew Runo
Software Engineer, - 702-943-7833

On Jan 29, 2009, at 11:36 AM, Walter Underwood wrote:

> Thanks, I didn't know there was so much research in this area.
> Most of the papers at those workshops are about tuning the
> entire ranking algorithm with machine learning techniques.
> I am interested in adding one more feature, click data, to an
> existing ranking algorithm. In my case, I have enough data to
> use query-specific boosts instead of global document boosts.
> We get about 2M search clicks per day from logged in users
> (little or no click spam).
> I'm checking out some papers from Thorsten Joachims and from
> Microsoft Research that are specifically about clickthrough
> feedback.
> wunder
> On 1/27/09 11:15 PM, "Neal Richter" <> wrote:
>> OK I've implemented this before, written academic papers and patents
>> related to this task.
>> Here are some hints:
>>   - you're on the right track with the editorial boosting elevators
>>   -
>>   - be darn careful about assuming that one click is enough evidence
>>     to boost a long 'distance'
>>   - first page effects in search will skew the learning badly if you
>>     don't compensate. 95% of users never go past the first page of
>>     results, 1% go past the second page. So perfectly good results
>>     on the second page get permanently locked out
>>   - consider forgetting what you learn under some condition
>> In fact this whole area is called 'learning to rank' and is a hot
>> research topic in IR.
>> - Neal Richter
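
The hints above (require more than one click, compensate for first-page effects, forget over time) can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: the constants, the 1/rank attention model, and the function names are all assumptions chosen for the example.

```python
import math
from collections import defaultdict

# Hypothetical tuning constants, not taken from the thread.
HALF_LIFE_DAYS = 30.0   # old clicks lose half their weight every 30 days
MIN_CLICKS = 3.0        # decayed click count required before boosting at all
MAX_RANK_WEIGHT = 20.0  # cap on the position-bias correction
MAX_BOOST = 3.0         # cap on the final boost factor


def position_weight(rank):
    """Weight a click by its rank (1-based).

    Assumes attention falls off roughly as 1/rank, so a click far down
    the list counts for more than a click on the first result.
    """
    return min(float(rank), MAX_RANK_WEIGHT)


def decay(age_days):
    """Exponential forgetting: weight halves every HALF_LIFE_DAYS."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def click_boosts(clicks):
    """Turn click events into per-(query, doc) boost factors.

    clicks: iterable of (query, doc_id, rank, age_days) tuples.
    Returns a dict mapping (normalized_query, doc_id) to a boost >= 1.0.
    """
    count = defaultdict(float)     # decayed number of clicks
    evidence = defaultdict(float)  # decayed, position-weighted clicks
    for query, doc_id, rank, age_days in clicks:
        key = (query.strip().lower(), doc_id)
        d = decay(age_days)
        count[key] += d
        evidence[key] += d * position_weight(rank)

    boosts = {}
    for key, clicks_seen in count.items():
        if clicks_seen < MIN_CLICKS:
            continue  # a single click is not enough evidence
        # Logarithmic growth so the boost rises slowly with more evidence.
        boosts[key] = min(1.0 + 0.25 * math.log1p(evidence[key]), MAX_BOOST)
    return boosts
```

With these assumed constants, one click on the 10th result for "rain boots" produces no boost, three recent clicks produce a modest one, and clicks that are many months old fade out on their own.
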
>> On Tue, Jan 27, 2009 at 2:06 PM, Matthew Runo <> wrote:
>>> Hello folks!
>>> We've been thinking about ways to improve organic search results for
>>> a while (really, who hasn't?) and I'd like to get some ideas on ways
>>> to implement a feedback system that uses user behavior as input.
>>> Basically, it'd work on the premise that what the user actually
>>> clicked on is probably a really good match for their search, and
>>> should be boosted up in the results for that search.
>>> For example, if I search for "rain boots", and really love the 10th
>>> result down (and show it by clicking on it), then we'd like to
>>> capture this and use the data to boost up that result //for that
>>> search//. We've thought about using index time boosts for the
>>> documents, but that'd boost it regardless of the search terms, which
>>> isn't what we want. We've thought about using the Elevator handler,
>>> but we don't really want to force a product to the top - we'd prefer
>>> it slowly rises over time as more and more people click it from the
>>> same search terms. Another way might be to stuff the keyword into
>>> the document, the more times it's in the document the higher it'd
>>> score - but there's gotta be a better way than that.
>>> Obviously this can't be done 100% in solr - but if anyone had some
>>> clever ideas about how this might be possible it'd be interesting to
>>> hear them.
>>> Thanks for your time!
>>> Matthew Runo
>>> Software Engineer, - 702-943-7833
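
One way to get the "slowly rises, for that search only" behavior described in the original question, without index-time boosts or the Elevator handler, might be to attach per-query boosts at request time through the DisMax `bq` (boost query) parameter. The sketch below only builds the request parameters. It assumes the unique key field is called `id` and that a lookup of learned boosts for the current query already exists; the helper name and the boost values are hypothetical.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query, boosts, id_field="id"):
    """Build Solr request parameters with one bq clause per boosted doc.

    user_query: the raw search string, e.g. "rain boots".
    boosts: dict mapping doc_id -> boost factor learned for this query
            (hypothetical input, e.g. from a click log aggregation).
    id_field: name of the unique key field; "id" is an assumption.
    """
    params = [("defType", "dismax"), ("q", user_query)]
    for doc_id, boost in sorted(boosts.items()):
        # Quote the id and escape characters that would break the phrase.
        safe_id = str(doc_id).replace("\\", "\\\\").replace('"', '\\"')
        params.append(("bq", f'{id_field}:"{safe_id}"^{boost:.2f}'))
    return params


# Usage: boosts would be looked up by the normalized query string.
params = build_dismax_params("rain boots", {"SKU-1": 1.8, "SKU-7": 1.2})
query_string = urlencode(params)  # append to the select URL of your Solr core
```

Because `bq` adds optional clauses to the query rather than pinning documents, a boosted product should move up gradually as its boost grows, though how far a given value moves it depends on the rest of the scoring and would need tuning.
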
