lucene-general mailing list archives

From Grant Ingersoll
Subject Re: Open Relevance Project?
Date Tue, 19 May 2009 03:53:42 GMT
+1.  Let's not get ahead of ourselves w/ changing the world or  
anything like that.  First and foremost, we need this for Lucene; if  
others benefit, so be it.  You are right on in that we need a shared,  
free way of judging whether Lucene is improving on relevance (even if  
it is already very good out of the box).  Otherwise, we can't even  
have the conversation.  For instance, it would help in evaluating the  
Axiomatic patch in JIRA, the SweetSpot stuff, or a whole host of  
other things (e.g., our current length norm tends to favor shorter  
docs; is this the right default?).
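
A minimal sketch of what such a shared measurement could look like, assuming a set of human relevance judgments per query and two rankings to compare (say, before and after a scoring patch). The queries, document IDs, and function names here are hypothetical and are not part of Lucene; the metrics are the standard precision-at-k and mean average precision.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are judged relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Sum of precision at each rank holding a relevant doc, divided by
    the total number of relevant docs for the query."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Average of average_precision over every judged query."""
    if not judgments:
        return 0.0
    scores = [
        average_precision(runs.get(query, []), relevant)
        for query, relevant in judgments.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical judgments: query id -> set of relevant doc ids.
JUDGMENTS: Dict[str, Set[str]] = {
    "q1": {"d1", "d3", "d5"},
    "q2": {"a2"},
}

# Hypothetical rankings from two scoring variants over the same index.
BASELINE: Dict[str, List[str]] = {
    "q1": ["d2", "d1", "d4", "d3", "d5"],
    "q2": ["a1", "a2", "a3"],
}
PATCHED: Dict[str, List[str]] = {
    "q1": ["d1", "d3", "d2", "d5", "d4"],
    "q2": ["a2", "a1", "a3"],
}

if __name__ == "__main__":
    print("baseline MAP:", mean_average_precision(BASELINE, JUDGMENTS))
    print("patched MAP: ", mean_average_precision(PATCHED, JUDGMENTS))
```

With judgments like these checked in alongside the code, a change such as a different length norm could be compared against the baseline on the same queries instead of argued about in the abstract.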

On May 18, 2009, at 11:00 PM, Mark Miller wrote:

> Grant Ingersoll wrote:
>> Some interesting discussion at
> That was an interesting read. I think a lot of the argument misses  
> the point. It doesn't seem to me that the main benefit or intent  
> comes from 'bake offs' with other search engines ("Selling search  
> applications to enterprises isn't, in my experience, about winning  
> relevance bake-offs.") - the main benefit is allowing us to measure  
> changes and improvements to Lucene's relevancy calculations and to  
> make judgments about how Lucene currently performs. I see it easily  
> as important as the Lucene benchmark contrib. It's not going to be a  
> secret sauce, just like the benchmarker has been no secret sauce -  
> but it's going to make it easier to reliably improve Lucene in the  
> future.
> - Mark
>> On May 18, 2009, at 1:57 PM, Grant Ingersoll wrote:
>>> On May 18, 2009, at 11:41 AM, Ted Dunning wrote:
>>>> On the other hand, it is likely that we could find query and  
>>>> click logs for
>>>> the documentation.
>>> Only if they are redacted/aggregated first.  ASF Members have  
>>> access, but we'd need to get permission to distribute (after  
>>> redaction/aggregation) I suspect.   Given the AOL marketing  
>>> fiasco, we'd have to go over them in pretty good detail before  
>>> releasing to make sure there is no personal information.  AFAIK,  
>>> I'm the only ASF Member who has so far volunteered on this thread  
>>> and I highly doubt I have the time for what I imagine to be a  
>>> pretty decent sized endeavor.
>>> Stripping IP address is pretty straightforward, but the query  
>>> terms might be a bit more involved.
>>> Still, can't hurt to find out what's involved.
>>> -Grant

Grant Ingersoll

